*py2stdlib.txt* For Vim version 7.0 Last change: 2010 Sep 01
==============================================================================
*py2stdlib*
PYTHON 2.7 STANDARD LIBRARY MODULES~
[ __SPECIAL__ ]~
__builtin__ .......................................... |py2stdlib-__builtin__|
    Functions .............................. |py2stdlib-__builtin__:Functions|
    Constants .............................. |py2stdlib-__builtin__:Constants|
    Types ...................................... |py2stdlib-__builtin__:Types|
    Exceptions ............................ |py2stdlib-__builtin__:Exceptions|
__future__ ............................................ |py2stdlib-__future__|
__main__ ................................................ |py2stdlib-__main__|
_winreg .................................................. |py2stdlib-_winreg|
[ A ]~
abc .......................................................... |py2stdlib-abc|
aepack .................................................... |py2stdlib-aepack|
aetools .................................................. |py2stdlib-aetools|
aetypes .................................................. |py2stdlib-aetypes|
aifc ........................................................ |py2stdlib-aifc|
al ............................................................ |py2stdlib-al|
AL ........................................................... |py2stdlib-al^|
anydbm .................................................... |py2stdlib-anydbm|
argparse ................................................ |py2stdlib-argparse|
array ...................................................... |py2stdlib-array|
ast .......................................................... |py2stdlib-ast|
asynchat ................................................ |py2stdlib-asynchat|
asyncore ................................................ |py2stdlib-asyncore|
atexit .................................................... |py2stdlib-atexit|
audioop .................................................. |py2stdlib-audioop|
autoGIL .................................................. |py2stdlib-autogil|
applesingle .......................................... |py2stdlib-applesingle|
[ B ]~
base64 .................................................... |py2stdlib-base64|
BaseHTTPServer .................................... |py2stdlib-basehttpserver|
Bastion .................................................. |py2stdlib-bastion|
bdb .......................................................... |py2stdlib-bdb|
binascii ................................................ |py2stdlib-binascii|
binhex .................................................... |py2stdlib-binhex|
bisect .................................................... |py2stdlib-bisect|
bsddb ...................................................... |py2stdlib-bsddb|
bz2 .......................................................... |py2stdlib-bz2|
buildtools ............................................ |py2stdlib-buildtools|
[ C ]~
calendar ................................................ |py2stdlib-calendar|
Carbon.AE .............................................. |py2stdlib-carbon.ae|
Carbon.AH .............................................. |py2stdlib-carbon.ah|
Carbon.App ............................................ |py2stdlib-carbon.app|
Carbon.Appearance .............................. |py2stdlib-carbon.appearance|
Carbon.CF .............................................. |py2stdlib-carbon.cf|
Carbon.CG .............................................. |py2stdlib-carbon.cg|
Carbon.CarbonEvt ................................ |py2stdlib-carbon.carbonevt|
Carbon.CarbonEvents .......................... |py2stdlib-carbon.carbonevents|
Carbon.Cm .............................................. |py2stdlib-carbon.cm|
Carbon.Components .............................. |py2stdlib-carbon.components|
Carbon.ControlAccessor .................... |py2stdlib-carbon.controlaccessor|
Carbon.Controls .................................. |py2stdlib-carbon.controls|
Carbon.CoreFoundation ....................... |py2stdlib-carbon.corefounation|
Carbon.CoreGraphics .......................... |py2stdlib-carbon.coregraphics|
Carbon.Ctl ............................................ |py2stdlib-carbon.ctl|
Carbon.Dialogs .................................... |py2stdlib-carbon.dialogs|
Carbon.Dlg ............................................ |py2stdlib-carbon.dlg|
Carbon.Drag .......................................... |py2stdlib-carbon.drag|
Carbon.Dragconst ................................ |py2stdlib-carbon.dragconst|
Carbon.Events ...................................... |py2stdlib-carbon.events|
Carbon.Evt ............................................ |py2stdlib-carbon.evt|
Carbon.File .......................................... |py2stdlib-carbon.file|
Carbon.Files ........................................ |py2stdlib-carbon.files|
Carbon.Fm .............................................. |py2stdlib-carbon.fm|
Carbon.Folder ...................................... |py2stdlib-carbon.folder|
Carbon.Folders .................................... |py2stdlib-carbon.folders|
Carbon.Fonts ........................................ |py2stdlib-carbon.fonts|
Carbon.Help .......................................... |py2stdlib-carbon.help|
Carbon.IBCarbon .................................. |py2stdlib-carbon.ibcarbon|
Carbon.IBCarbonRuntime .................... |py2stdlib-carbon.ibcarbonruntime|
Carbon.Icns .......................................... |py2stdlib-carbon.icns|
Carbon.Icons ........................................ |py2stdlib-carbon.icons|
Carbon.Launch ...................................... |py2stdlib-carbon.launch|
Carbon.LaunchServices ...................... |py2stdlib-carbon.launchservices|
Carbon.List .......................................... |py2stdlib-carbon.list|
Carbon.Lists ........................................ |py2stdlib-carbon.lists|
Carbon.MacHelp .................................... |py2stdlib-carbon.machelp|
Carbon.MediaDescr .............................. |py2stdlib-carbon.mediadescr|
Carbon.Menu .......................................... |py2stdlib-carbon.menu|
Carbon.Menus ........................................ |py2stdlib-carbon.menus|
Carbon.Mlte .......................................... |py2stdlib-carbon.mlte|
Carbon.OSA ............................................ |py2stdlib-carbon.osa|
Carbon.OSAconst .................................. |py2stdlib-carbon.osaconst|
Carbon.QDOffscreen ............................ |py2stdlib-carbon.qdoffscreen|
Carbon.Qd .............................................. |py2stdlib-carbon.qd|
Carbon.Qdoffs ...................................... |py2stdlib-carbon.qdoffs|
Carbon.Qt .............................................. |py2stdlib-carbon.qt|
Carbon.QuickDraw ................................ |py2stdlib-carbon.quickdraw|
Carbon.QuickTime ................................ |py2stdlib-carbon.quicktime|
Carbon.Res ............................................ |py2stdlib-carbon.res|
Carbon.Resources ................................ |py2stdlib-carbon.resources|
Carbon.Scrap ........................................ |py2stdlib-carbon.scrap|
Carbon.Snd ............................................ |py2stdlib-carbon.snd|
Carbon.Sound ........................................ |py2stdlib-carbon.sound|
Carbon.TE .............................................. |py2stdlib-carbon.te|
Carbon.TextEdit .................................. |py2stdlib-carbon.textedit|
Carbon.Win ............................................ |py2stdlib-carbon.win|
Carbon.Windows .................................... |py2stdlib-carbon.windows|
cd ............................................................ |py2stdlib-cd|
cgi .......................................................... |py2stdlib-cgi|
CGIHTTPServer ...................................... |py2stdlib-cgihttpserver|
cgitb ...................................................... |py2stdlib-cgitb|
chunk ...................................................... |py2stdlib-chunk|
cmath ...................................................... |py2stdlib-cmath|
cmd .......................................................... |py2stdlib-cmd|
code ........................................................ |py2stdlib-code|
codecs .................................................... |py2stdlib-codecs|
codeop .................................................... |py2stdlib-codeop|
collections .......................................... |py2stdlib-collections|
ColorPicker .......................................... |py2stdlib-colorpicker|
colorsys ................................................ |py2stdlib-colorsys|
commands ................................................ |py2stdlib-commands|
compileall ............................................ |py2stdlib-compileall|
compiler ................................................ |py2stdlib-compiler|
compiler.ast ........................................ |py2stdlib-compiler.ast|
compiler.visitor ................................ |py2stdlib-compiler.visitor|
ConfigParser ........................................ |py2stdlib-configparser|
contextlib ............................................ |py2stdlib-contextlib|
Cookie .................................................... |py2stdlib-cookie|
cookielib .............................................. |py2stdlib-cookielib|
copy ........................................................ |py2stdlib-copy|
copy_reg ................................................ |py2stdlib-copy_reg|
crypt ...................................................... |py2stdlib-crypt|
csv .......................................................... |py2stdlib-csv|
ctypes .................................................... |py2stdlib-ctypes|
curses.ascii ........................................ |py2stdlib-curses.ascii|
curses.panel ........................................ |py2stdlib-curses.panel|
curses .................................................... |py2stdlib-curses|
curses.textpad .................................... |py2stdlib-curses.textpad|
curses.wrapper .................................... |py2stdlib-curses.wrapper|
cPickle .................................................. |py2stdlib-cpickle|
cProfile ................................................ |py2stdlib-cprofile|
cStringIO .............................................. |py2stdlib-cstringio|
cfmfile .................................................. |py2stdlib-cfmfile|
[ D ]~
datetime ................................................ |py2stdlib-datetime|
dbhash .................................................... |py2stdlib-dbhash|
dbm .......................................................... |py2stdlib-dbm|
decimal .................................................. |py2stdlib-decimal|
difflib .................................................. |py2stdlib-difflib|
dircache ................................................ |py2stdlib-dircache|
dis .......................................................... |py2stdlib-dis|
distutils .............................................. |py2stdlib-distutils|
dl ............................................................ |py2stdlib-dl|
doctest .................................................. |py2stdlib-doctest|
DocXMLRPCServer .................................. |py2stdlib-docxmlrpcserver|
dumbdbm .................................................. |py2stdlib-dumbdbm|
dummy_thread ........................................ |py2stdlib-dummy_thread|
dummy_threading .................................. |py2stdlib-dummy_threading|
DEVICE .................................................... |py2stdlib-device|
[ E ]~
encodings.idna .................................... |py2stdlib-encodings.idna|
encodings.utf_8_sig .......................... |py2stdlib-encodings.utf_8_sig|
EasyDialogs .......................................... |py2stdlib-easydialogs|
email.charset ...................................... |py2stdlib-email.charset|
email.encoders .................................... |py2stdlib-email.encoders|
email.errors ........................................ |py2stdlib-email.errors|
email.generator .................................. |py2stdlib-email.generator|
email.header ........................................ |py2stdlib-email.header|
email.iterators .................................. |py2stdlib-email.iterators|
email.message ...................................... |py2stdlib-email.message|
email.mime ............................................ |py2stdlib-email.mime|
email.parser ........................................ |py2stdlib-email.parser|
email ...................................................... |py2stdlib-email|
email.utils .......................................... |py2stdlib-email.utils|
errno ...................................................... |py2stdlib-errno|
exceptions ............................................ |py2stdlib-exceptions|
[ F ]~
fcntl ...................................................... |py2stdlib-fcntl|
filecmp .................................................. |py2stdlib-filecmp|
fileinput .............................................. |py2stdlib-fileinput|
fl ............................................................ |py2stdlib-fl|
FL ........................................................... |py2stdlib-fl^|
flp .......................................................... |py2stdlib-flp|
fm ............................................................ |py2stdlib-fm|
fnmatch .................................................. |py2stdlib-fnmatch|
formatter .............................................. |py2stdlib-formatter|
fpectl .................................................... |py2stdlib-fpectl|
fpformat ................................................ |py2stdlib-fpformat|
fractions .............................................. |py2stdlib-fractions|
FrameWork .............................................. |py2stdlib-framework|
ftplib .................................................... |py2stdlib-ftplib|
functools .............................................. |py2stdlib-functools|
future_builtins .................................. |py2stdlib-future_builtins|
findertools .......................................... |py2stdlib-findertools|
[ G ]~
gc ............................................................ |py2stdlib-gc|
gdbm ........................................................ |py2stdlib-gdbm|
gensuitemodule .................................... |py2stdlib-gensuitemodule|
getopt .................................................... |py2stdlib-getopt|
getpass .................................................. |py2stdlib-getpass|
gettext .................................................. |py2stdlib-gettext|
gl ............................................................ |py2stdlib-gl|
GL ........................................................... |py2stdlib-gl^|
glob ........................................................ |py2stdlib-glob|
grp .......................................................... |py2stdlib-grp|
gzip ........................................................ |py2stdlib-gzip|
[ H ]~
hashlib .................................................. |py2stdlib-hashlib|
heapq ...................................................... |py2stdlib-heapq|
hmac ........................................................ |py2stdlib-hmac|
hotshot .................................................. |py2stdlib-hotshot|
hotshot.stats ...................................... |py2stdlib-hotshot.stats|
htmllib .................................................. |py2stdlib-htmllib|
htmlentitydefs .................................... |py2stdlib-htmlentitydefs|
HTMLParser ............................................ |py2stdlib-htmlparser|
httplib .................................................. |py2stdlib-httplib|
[ I ]~
ic ............................................................ |py2stdlib-ic|
imageop .................................................. |py2stdlib-imageop|
imaplib .................................................. |py2stdlib-imaplib|
imgfile .................................................. |py2stdlib-imgfile|
imghdr .................................................... |py2stdlib-imghdr|
imp .......................................................... |py2stdlib-imp|
importlib .............................................. |py2stdlib-importlib|
imputil .................................................. |py2stdlib-imputil|
inspect .................................................. |py2stdlib-inspect|
io ............................................................ |py2stdlib-io|
itertools .............................................. |py2stdlib-itertools|
icopen .................................................... |py2stdlib-icopen|
[ J ]~
jpeg ........................................................ |py2stdlib-jpeg|
json ........................................................ |py2stdlib-json|
[ K ]~
keyword .................................................. |py2stdlib-keyword|
[ L ]~
lib2to3 .................................................. |py2stdlib-lib2to3|
linecache .............................................. |py2stdlib-linecache|
locale .................................................... |py2stdlib-locale|
logging .................................................. |py2stdlib-logging|
[ M ]~
MacOS ...................................................... |py2stdlib-macos|
macostools ............................................ |py2stdlib-macostools|
macpath .................................................. |py2stdlib-macpath|
mailbox .................................................. |py2stdlib-mailbox|
mailcap .................................................. |py2stdlib-mailcap|
marshal .................................................. |py2stdlib-marshal|
math ........................................................ |py2stdlib-math|
md5 .......................................................... |py2stdlib-md5|
mhlib ...................................................... |py2stdlib-mhlib|
mimetools .............................................. |py2stdlib-mimetools|
mimetypes .............................................. |py2stdlib-mimetypes|
MimeWriter ............................................ |py2stdlib-mimewriter|
mimify .................................................... |py2stdlib-mimify|
MiniAEFrame .......................................... |py2stdlib-miniaeframe|
mmap ........................................................ |py2stdlib-mmap|
modulefinder ........................................ |py2stdlib-modulefinder|
msilib .................................................... |py2stdlib-msilib|
msvcrt .................................................... |py2stdlib-msvcrt|
multifile .............................................. |py2stdlib-multifile|
multiprocessing .................................. |py2stdlib-multiprocessing|
multiprocessing.sharedctypes ........ |py2stdlib-multiprocessing.sharedctypes|
multiprocessing.managers ................ |py2stdlib-multiprocessing.managers|
multiprocessing.pool ........................ |py2stdlib-multiprocessing.pool|
multiprocessing.connection ............ |py2stdlib-multiprocessing.connection|
multiprocessing.dummy ...................... |py2stdlib-multiprocessing.dummy|
mutex ...................................................... |py2stdlib-mutex|
macerrors .............................................. |py2stdlib-macerrors|
macresource .......................................... |py2stdlib-macresource|
[ N ]~
netrc ...................................................... |py2stdlib-netrc|
new .......................................................... |py2stdlib-new|
nis .......................................................... |py2stdlib-nis|
nntplib .................................................. |py2stdlib-nntplib|
numbers .................................................. |py2stdlib-numbers|
Nav .......................................................... |py2stdlib-nav|
[ O ]~
operator ................................................ |py2stdlib-operator|
optparse ................................................ |py2stdlib-optparse|
os.path .................................................. |py2stdlib-os.path|
os ............................................................ |py2stdlib-os|
ossaudiodev .......................................... |py2stdlib-ossaudiodev|
[ P ]~
parser .................................................... |py2stdlib-parser|
pdb .......................................................... |py2stdlib-pdb|
pickle .................................................... |py2stdlib-pickle|
pickletools .......................................... |py2stdlib-pickletools|
pipes ...................................................... |py2stdlib-pipes|
pkgutil .................................................. |py2stdlib-pkgutil|
platform ................................................ |py2stdlib-platform|
plistlib ................................................ |py2stdlib-plistlib|
popen2 .................................................... |py2stdlib-popen2|
poplib .................................................... |py2stdlib-poplib|
posix ...................................................... |py2stdlib-posix|
posixfile .............................................. |py2stdlib-posixfile|
pprint .................................................... |py2stdlib-pprint|
profile .................................................. |py2stdlib-profile|
pstats .................................................... |py2stdlib-pstats|
pty .......................................................... |py2stdlib-pty|
pwd .......................................................... |py2stdlib-pwd|
py_compile ............................................ |py2stdlib-py_compile|
pyclbr .................................................... |py2stdlib-pyclbr|
pydoc ...................................................... |py2stdlib-pydoc|
PixMapWrapper ...................................... |py2stdlib-pixmapwrapper|
[ Q ]~
Queue ...................................................... |py2stdlib-queue|
quopri .................................................... |py2stdlib-quopri|
[ R ]~
random .................................................... |py2stdlib-random|
re ............................................................ |py2stdlib-re|
readline ................................................ |py2stdlib-readline|
repr ........................................................ |py2stdlib-repr|
resource ................................................ |py2stdlib-resource|
rexec ...................................................... |py2stdlib-rexec|
rfc822 .................................................... |py2stdlib-rfc822|
rlcompleter .......................................... |py2stdlib-rlcompleter|
robotparser .......................................... |py2stdlib-robotparser|
runpy ...................................................... |py2stdlib-runpy|
[ S ]~
sched ...................................................... |py2stdlib-sched|
ScrolledText ........................................ |py2stdlib-scrolledtext|
select .................................................... |py2stdlib-select|
sets ........................................................ |py2stdlib-sets|
sgmllib .................................................. |py2stdlib-sgmllib|
sha .......................................................... |py2stdlib-sha|
shelve .................................................... |py2stdlib-shelve|
shlex ...................................................... |py2stdlib-shlex|
shutil .................................................... |py2stdlib-shutil|
signal .................................................... |py2stdlib-signal|
SimpleHTTPServer ................................ |py2stdlib-simplehttpserver|
SimpleXMLRPCServer ............................ |py2stdlib-simplexmlrpcserver|
site ........................................................ |py2stdlib-site|
smtpd ...................................................... |py2stdlib-smtpd|
smtplib .................................................. |py2stdlib-smtplib|
sndhdr .................................................... |py2stdlib-sndhdr|
socket .................................................... |py2stdlib-socket|
SocketServer ........................................ |py2stdlib-socketserver|
spwd ........................................................ |py2stdlib-spwd|
sqlite3 .................................................. |py2stdlib-sqlite3|
ssl .......................................................... |py2stdlib-ssl|
stat ........................................................ |py2stdlib-stat|
statvfs .................................................. |py2stdlib-statvfs|
string .................................................... |py2stdlib-string|
StringIO ................................................ |py2stdlib-stringio|
stringprep ............................................ |py2stdlib-stringprep|
struct .................................................... |py2stdlib-struct|
subprocess ............................................ |py2stdlib-subprocess|
sunau ...................................................... |py2stdlib-sunau|
sunaudiodev .......................................... |py2stdlib-sunaudiodev|
SUNAUDIODEV ......................................... |py2stdlib-sunaudiodev^|
symbol .................................................... |py2stdlib-symbol|
symtable ................................................ |py2stdlib-symtable|
sys .......................................................... |py2stdlib-sys|
sysconfig .............................................. |py2stdlib-sysconfig|
syslog .................................................... |py2stdlib-syslog|
[ T ]~
tabnanny ................................................ |py2stdlib-tabnanny|
tarfile .................................................. |py2stdlib-tarfile|
telnetlib .............................................. |py2stdlib-telnetlib|
tempfile ................................................ |py2stdlib-tempfile|
termios .................................................. |py2stdlib-termios|
test ........................................................ |py2stdlib-test|
test.test_support .............................. |py2stdlib-test.test_support|
textwrap ................................................ |py2stdlib-textwrap|
thread .................................................... |py2stdlib-thread|
threading .............................................. |py2stdlib-threading|
time ........................................................ |py2stdlib-time|
timeit .................................................... |py2stdlib-timeit|
Tix .......................................................... |py2stdlib-tix|
Tkinter .................................................. |py2stdlib-tkinter|
token ...................................................... |py2stdlib-token|
tokenize ................................................ |py2stdlib-tokenize|
trace ...................................................... |py2stdlib-trace|
traceback .............................................. |py2stdlib-traceback|
ttk .......................................................... |py2stdlib-ttk|
tty .......................................................... |py2stdlib-tty|
turtle .................................................... |py2stdlib-turtle|
types ...................................................... |py2stdlib-types|
[ U ]~
unicodedata .......................................... |py2stdlib-unicodedata|
unittest ................................................ |py2stdlib-unittest|
urllib .................................................... |py2stdlib-urllib|
urllib2 .................................................. |py2stdlib-urllib2|
urlparse ................................................ |py2stdlib-urlparse|
user ........................................................ |py2stdlib-user|
UserDict ................................................ |py2stdlib-userdict|
UserList ................................................ |py2stdlib-userlist|
UserString ............................................ |py2stdlib-userstring|
uu ............................................................ |py2stdlib-uu|
uuid ........................................................ |py2stdlib-uuid|
[ V ]~
videoreader .......................................... |py2stdlib-videoreader|
[ W ]~
W .............................................................. |py2stdlib-w|
warnings ................................................ |py2stdlib-warnings|
wave ........................................................ |py2stdlib-wave|
weakref .................................................. |py2stdlib-weakref|
webbrowser ............................................ |py2stdlib-webbrowser|
whichdb .................................................. |py2stdlib-whichdb|
winsound ................................................ |py2stdlib-winsound|
wsgiref .................................................. |py2stdlib-wsgiref|
wsgiref.util ........................................ |py2stdlib-wsgiref.util|
wsgiref.headers .................................. |py2stdlib-wsgiref.headers|
wsgiref.simple_server ...................... |py2stdlib-wsgiref.simple_server|
wsgiref.validate ................................ |py2stdlib-wsgiref.validate|
wsgiref.handlers ................................ |py2stdlib-wsgiref.handlers|
[ X ]~
xml.parsers.expat .............................. |py2stdlib-xml.parsers.expat|
xdrlib .................................................... |py2stdlib-xdrlib|
xml.dom.minidom .................................. |py2stdlib-xml.dom.minidom|
xml.dom.pulldom .................................. |py2stdlib-xml.dom.pulldom|
xml.dom .................................................. |py2stdlib-xml.dom|
xml.etree.ElementTree ...................... |py2stdlib-xml.etree.elementtree|
xml.sax.handler .................................. |py2stdlib-xml.sax.handler|
xml.sax.xmlreader .............................. |py2stdlib-xml.sax.xmlreader|
xml.sax .................................................. |py2stdlib-xml.sax|
xml.sax.saxutils ................................ |py2stdlib-xml.sax.saxutils|
xmllib .................................................... |py2stdlib-xmllib|
xmlrpclib .............................................. |py2stdlib-xmlrpclib|
[ Z ]~
zipfile .................................................. |py2stdlib-zipfile|
zipimport .............................................. |py2stdlib-zipimport|
zlib ........................................................ |py2stdlib-zlib|
==============================================================================
*py2stdlib-builtin*
__builtin__~
:synopsis: The module that provides the built-in namespace.
This module provides direct access to all 'built-in' identifiers of Python; for
example, ``__builtin__.open`` is the full name for the built-in function
open.
Most applications never access this module explicitly, but it is useful in
modules that define an object with the same name as a built-in value and still
need the built-in of that name. For example, a module that implements its own
open function as a wrapper around the built-in open can use this module
directly:: >
    import __builtin__

    def open(path):
        f = __builtin__.open(path, 'r')
        return UpperCaser(f)

    class UpperCaser:
        '''Wrapper around a file that converts output to upper-case.'''

        def __init__(self, f):
            self._f = f

        def read(self, count=-1):
            return self._f.read(count).upper()

        # ...
<
.. impl-detail::
Most modules have the name ``__builtins__`` (note the ``'s'``) made available
as part of their globals. The value of ``__builtins__`` is normally either
this module or the value of this module's __dict__ attribute. Since
this is an implementation detail, it may not be used by alternate
implementations of Python.
*py2stdlib-builtin:Functions*
Functions~
Built-in Functions
==================
The Python interpreter has a number of functions built into it that are always
available. They are listed here in alphabetical order.
abs(x)~
Return the absolute value of a number. The argument may be a plain or long
integer or a floating point number. If the argument is a complex number, its
magnitude is returned.
all(iterable)~
Return True if all elements of the {iterable} are true (or if the iterable
is empty). Equivalent to:: >
    def all(iterable):
        for element in iterable:
            if not element:
                return False
        return True
<
.. versionadded:: 2.5
any(iterable)~
Return True if any element of the {iterable} is true. If the iterable
is empty, return False. Equivalent to:: >
    def any(iterable):
        for element in iterable:
            if element:
                return True
        return False
<
.. versionadded:: 2.5
basestring()~
This abstract type is the superclass for str and unicode. It
cannot be called or instantiated, but it can be used to test whether an object
is an instance of str or unicode. ``isinstance(obj,
basestring)`` is equivalent to ``isinstance(obj, (str, unicode))``.
.. versionadded:: 2.3
bin(x)~
Convert an integer number to a binary string. The result is a valid Python
expression. If {x} is not a Python int object, it has to define an
__index__ method that returns an integer.
.. versionadded:: 2.6
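A minimal sketch of the behaviour described above. The class name Flags is
hypothetical and exists only to show the __index__ hook:

```python
class Flags(object):
    """Hypothetical wrapper; bin() consults __index__ for non-int objects."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


plain = bin(5)             # '0b101'
negative = bin(-5)         # '-0b101'
wrapped = bin(Flags(6))    # '0b110', obtained through __index__
round_trip = eval(plain)   # the string is a valid Python expression
```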
bool([x])~
Convert a value to a Boolean, using the standard truth testing procedure. If
{x} is false or omitted, this returns False; otherwise it returns
True. bool is also a class, which is a subclass of
int. Class bool cannot be subclassed further. Its only
instances are False and True.
.. versionadded:: 2.2.1
.. versionchanged:: 2.3
If no argument is given, this function returns False.
callable(object)~
Return True if the {object} argument appears callable,
False if not. If this
returns true, it is still possible that a call fails, but if it is false,
calling {object} will never succeed. Note that classes are callable (calling a
class returns a new instance); class instances are callable if they have a
__call__ method.
chr(i)~
Return a string of one character whose ASCII code is the integer {i}. For
example, ``chr(97)`` returns the string ``'a'``. This is the inverse of
ord. The argument must be in the range [0..255], inclusive;
ValueError will be raised if {i} is outside that range. See
also unichr.
classmethod(function)~
Return a class method for {function}.
A class method receives the class as implicit first argument, just like an
instance method receives the instance. To declare a class method, use this
idiom:: >
    class C:
        @classmethod
        def f(cls, arg1, arg2, ...): ...
<
The ``@classmethod`` form is a function decorator -- see the description
of function definitions in the language reference for details.
It can be called either on the class (such as ``C.f()``) or on an instance (such
as ``C().f()``). The instance is ignored except for its class. If a class
method is called for a derived class, the derived class object is passed as the
implied first argument.
Class methods are different from C++ or Java static methods. If you want those,
see staticmethod in this section.
For more information on class methods, consult the documentation on the standard
type hierarchy in types (|py2stdlib-types|).
.. versionadded:: 2.2
.. versionchanged:: 2.4
Function decorator syntax added.
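The following sketch illustrates how the class is passed implicitly, including
the derived-class case. The names Shape, Circle and create are hypothetical:

```python
class Shape(object):
    @classmethod
    def create(cls):
        # cls is the class the method was invoked on, not necessarily Shape.
        return cls()


class Circle(Shape):
    pass


from_class = Circle.create()       # called on the class
from_instance = Circle().create()  # called on an instance; only its class is used
base = Shape.create()
```

Both calls on Circle build a Circle, because the derived class is what arrives
as the implied first argument.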
cmp(x, y)~
Compare the two objects {x} and {y} and return an integer according to the
outcome. The return value is negative if ``x < y``, zero if ``x == y`` and
strictly positive if ``x > y``.
compile(source, filename, mode[, flags[, dont_inherit]])~
Compile the {source} into a code or AST object. Code objects can be executed
by an exec statement or evaluated by a call to eval.
{source} can either be a string or an AST object. Refer to the ast (|py2stdlib-ast|)
module documentation for information on how to work with AST objects.
The {filename} argument should give the file from which the code was read;
pass some recognizable value if it wasn't read from a file (``'<string>'`` is
commonly used).
The {mode} argument specifies what kind of code must be compiled; it can be
``'exec'`` if {source} consists of a sequence of statements, ``'eval'`` if it
consists of a single expression, or ``'single'`` if it consists of a single
interactive statement (in the latter case, expression statements that
evaluate to something other than ``None`` will be printed).
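As a rough illustration of the ``'eval'`` and ``'exec'`` modes (the source
strings and variable names are made up for this sketch):

```python
# 'eval' mode: a single expression; eval() returns its value.
expr_code = compile("6 * 7", "<string>", "eval")
answer = eval(expr_code)

# 'exec' mode: a sequence of statements; evaluating the code object
# returns None and leaves its effects in the supplied namespace.
stmt_code = compile("total = 1 + 2\ndoubled = total * 2", "<string>", "exec")
namespace = {}
result = eval(stmt_code, namespace)
```

After this runs, ``answer`` is 42, ``result`` is ``None``, and ``namespace``
holds ``total`` and ``doubled``.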
The optional arguments {flags} and {dont_inherit} control which future
statements (see PEP 236) affect the compilation of {source}. If neither
is present (or both are zero) the code is compiled with those future
statements that are in effect in the code that is calling compile. If the
{flags} argument is given and {dont_inherit} is not (or is zero) then the
future statements specified by the {flags} argument are used in addition to
those that would be used anyway. If {dont_inherit} is a non-zero integer then
only the future statements specified by {flags} are used -- the future
statements in effect around the call to compile are ignored.
Future statements are specified by bits which can be bitwise ORed together to
specify multiple statements. The bitfield required to specify a given feature
can be found as the compiler_flag attribute on the _Feature
instance in the __future__ (|py2stdlib-__future__|) module.
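A small sketch of passing a future-statement flag, assuming the ``division``
feature is the one wanted. Under Python 2.7 the same source compiled without
the flag (and with no division future statement in the caller) evaluates to 0:

```python
import __future__

# Bit that selects the "division" future statement.
division_flag = __future__.division.compiler_flag

# dont_inherit=1: only the future statements named in flags apply.
code = compile("1 / 2", "<string>", "eval", division_flag, 1)
quotient = eval(code)  # true division is in effect, so this is 0.5
```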
This function raises SyntaxError if the compiled source is invalid,
and TypeError if the source contains null bytes.
.. note:: >
When compiling a string with multi-line code in ``'single'`` or
``'eval'`` mode, input must be terminated by at least one newline
character. This is to facilitate detection of incomplete and complete
statements in the code (|py2stdlib-code|) module.
<
.. versionchanged:: 2.3
The {flags} and {dont_inherit} arguments were added.
.. versionchanged:: 2.6
Support for compiling AST objects.
.. versionchanged:: 2.7
Allowed use of Windows and Mac newlines. Also input in ``'exec'`` mode
does not have to end in a newline anymore.
complex([real[, imag]])~
Create a complex number with the value {real} + {imag}*j or convert a string or
number to a complex number. If the first parameter is a string, it will be
interpreted as a complex number and the function must be called without a second
parameter. The second parameter can never be a string. Each argument may be any
numeric type (including complex). If {imag} is omitted, it defaults to zero and
the function serves as a numeric conversion function like int,
long and float. If both arguments are omitted, returns ``0j``.
The complex type is described in typesnumeric.
delattr(object, name)~
This is a relative of setattr. The arguments are an object and a
string. The string must be the name of one of the object's attributes. The
function deletes the named attribute, provided the object allows it. For
example, ``delattr(x, 'foobar')`` is equivalent to ``del x.foobar``.
dict([arg])~
Create a new data dictionary, optionally with items taken from {arg}.
The dictionary type is described in typesmapping.
For other containers see the built in list, set, and
tuple classes, and the collections (|py2stdlib-collections|) module.
dir([object])~
Without arguments, return the list of names in the current local scope. With an
argument, attempt to return a list of valid attributes for that object.
If the object has a method named __dir__, this method will be called and
must return the list of attributes. This allows objects that implement a custom
__getattr__ or __getattribute__ function to customize the way
dir reports their attributes.
If the object does not provide __dir__, the function tries its best to
gather information from the object's __dict__ attribute, if defined, and
from its type object. The resulting list is not necessarily complete, and may
be inaccurate when the object has a custom __getattr__.
The default dir mechanism behaves differently with different types of
objects, as it attempts to produce the most relevant, rather than complete,
information:
* If the object is a module object, the list contains the names of the module's
attributes.
* If the object is a type or class object, the list contains the names of its
attributes, and recursively of the attributes of its bases.
* Otherwise, the list contains the object's attributes' names, the names of its
class's attributes, and recursively of the attributes of its class's base
classes.
The resulting list is sorted alphabetically. For example:
>>> import struct
>>> dir() # doctest: +SKIP
['__builtins__', '__doc__', '__name__', 'struct']
>>> dir(struct) # doctest: +NORMALIZE_WHITESPACE
['Struct', '__builtins__', '__doc__', '__file__', '__name__',
'__package__', '_clearcache', 'calcsize', 'error', 'pack', 'pack_into',
'unpack', 'unpack_from']
>>> class Foo(object):
...     def __dir__(self):
...         return ["kan", "ga", "roo"]
...
>>> f = Foo()
>>> dir(f)
['ga', 'kan', 'roo']
.. note:: >
Because dir is supplied primarily as a convenience for use at an
interactive prompt, it tries to supply an interesting set of names more than it
tries to supply a rigorously or consistently defined set of names, and its
detailed behavior may change across releases. For example, metaclass attributes
are not in the result list when the argument is a class.
<
divmod(a, b)~
Take two (non complex) numbers as arguments and return a pair of numbers
consisting of their quotient and remainder when using long division. With mixed
operand types, the rules for binary arithmetic operators apply. For plain and
long integers, the result is the same as ``(a // b, a % b)``. For floating point
numbers the result is ``(q, a % b)``, where {q} is usually ``math.floor(a / b)``
but may be 1 less than that. In any case ``q * b + a % b`` is very close to
{a}, if ``a % b`` is non-zero it has the same sign as {b}, and ``0 <= abs(a % b)
< abs(b)``.
.. versionchanged:: 2.3
Using divmod with complex numbers is deprecated.
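The rules above can be checked with a few small cases. This is only a sketch;
the variable names are arbitrary:

```python
whole = divmod(7, 2)        # (3, 1), same as (7 // 2, 7 % 2)
negative = divmod(-7, 2)    # (-4, 1): a non-zero remainder takes the sign of b
floating = divmod(7.5, 2)   # (3.0, 1.5)

# q * b + a % b reproduces a.
a, b = -7, 2
q, r = divmod(a, b)
reconstructed = q * b + r
```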
enumerate(sequence[, start=0])~
Return an enumerate object. {sequence} must be a sequence, an
iterator, or some other object which supports iteration. The
next method of the iterator returned by enumerate returns a
tuple containing a count (from {start} which defaults to 0) and the
corresponding value obtained from iterating over {sequence}.
enumerate is useful for obtaining an indexed series: ``(0, seq[0])``,
``(1, seq[1])``, ``(2, seq[2])``, .... For example:
>>> for i, season in enumerate(['Spring', 'Summer', 'Fall', 'Winter']):
...     print i, season
0 Spring
1 Summer
2 Fall
3 Winter
.. versionadded:: 2.3
.. versionchanged:: 2.6
The {start} parameter was added.
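A short sketch of the {start} argument, here used to number items from 1
instead of 0:

```python
seasons = ['Spring', 'Summer', 'Fall', 'Winter']

# Passing 1 as start shifts the count; the values are unchanged.
numbered = list(enumerate(seasons, 1))
```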
eval(expression[, globals[, locals]])~
The arguments are a string and optional globals and locals. If provided,
{globals} must be a dictionary. If provided, {locals} can be any mapping
object.
.. versionchanged:: 2.4
formerly {locals} was required to be a dictionary.
The {expression} argument is parsed and evaluated as a Python expression
(technically speaking, a condition list) using the {globals} and {locals}
dictionaries as global and local namespace. If the {globals} dictionary is
present and lacks '__builtins__', the current globals are copied into {globals}
before {expression} is parsed. This means that {expression} normally has full
access to the standard builtin (|py2stdlib-builtin|) module and restricted environments are
propagated. If the {locals} dictionary is omitted it defaults to the {globals}
dictionary. If both dictionaries are omitted, the expression is executed in the
environment where eval is called. The return value is the result of
the evaluated expression. Syntax errors are reported as exceptions. Example:
>>> x = 1
>>> print eval('x+1')
2
This function can also be used to execute arbitrary code objects (such as
those created by compile). In this case pass a code object instead
of a string. If the code object has been compiled with ``'exec'`` as the
{mode} argument, eval's return value will be ``None``.
Hints: dynamic execution of statements is supported by the exec
statement. Execution of statements from a file is supported by the
execfile function. The globals and locals functions
return the current global and local dictionaries, respectively, which may be
useful to pass around for use by eval or execfile.
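As a sketch of evaluating with explicit namespaces (the dictionary names are
arbitrary and chosen for illustration):

```python
global_names = {"x": 10}
local_names = {"factor": 3}

# Names are looked up in local_names first, then in global_names.
value = eval("x * factor", global_names, local_names)

# eval() inserts '__builtins__' into a globals dictionary that lacks it.
has_builtins = "__builtins__" in global_names
```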
execfile(filename[, globals[, locals]])~
This function is similar to the exec statement, but parses a file
instead of a string. It is different from the import statement in
that it does not use the module administration --- it reads the file
unconditionally and does not create a new module. [#]_
The arguments are a file name and two optional dictionaries. The file is parsed
and evaluated as a sequence of Python statements (similarly to a module) using
the {globals} and {locals} dictionaries as global and local namespace. If
provided, {locals} can be any mapping object.
.. versionchanged:: 2.4
formerly {locals} was required to be a dictionary.
If the {locals} dictionary is omitted it defaults to the {globals} dictionary.
If both dictionaries are omitted, the expression is executed in the environment
where execfile is called. The return value is ``None``.
.. note:: >
The default {locals} act as described for function locals below:
modifications to the default {locals} dictionary should not be attempted. Pass
an explicit {locals} dictionary if you need to see effects of the code on
{locals} after function execfile returns. execfile cannot be
used reliably to modify a function's locals.
<
file(filename[, mode[, bufsize]])~
Constructor function for the file type, described further in section
bltin-file-objects. The constructor's arguments are the same as those
of the open built-in function described below.
When opening a file, it's preferable to use open instead of invoking
this constructor directly. file is more suited to type testing (for
example, writing ``isinstance(f, file)``).
.. versionadded:: 2.2
filter(function, iterable)~
Construct a list from those elements of {iterable} for which {function} returns
true. {iterable} may be either a sequence, a container which supports
iteration, or an iterator. If {iterable} is a string or a tuple, the result
also has that type; otherwise it is always a list. If {function} is ``None``,
the identity function is assumed, that is, all elements of {iterable} that are
false are removed.
Note that ``filter(function, iterable)`` is equivalent to ``[item for item in
iterable if function(item)]`` if function is not ``None`` and ``[item for item
in iterable if item]`` if function is ``None``.
See itertools.ifilterfalse for the complementary function that returns
elements of {iterable} for which {function} returns false.
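The equivalence with a list comprehension can be sketched as follows. The
helper is_even is hypothetical. The ``list()`` wrapper is redundant in Python
2.7, where filter already returns a list for list input, and is included only
as an assumption to keep the sketch portable:

```python
def is_even(n):
    return n % 2 == 0


numbers = [0, 1, 2, 3, 4, 5, 6]

evens = list(filter(is_even, numbers))
evens_comprehension = [n for n in numbers if is_even(n)]

# With None as the function, false elements are dropped.
truthy = list(filter(None, [0, 1, '', 'a', None, [], [2]]))
```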
float([x])~
Convert a string or a number to floating point. If the argument is a string, it
must contain a possibly signed decimal or floating point number, possibly
embedded in whitespace. The argument may also be [+|-]nan or [+|-]inf.
Otherwise, the argument may be a plain or long integer
or a floating point number, and a floating point number with the same value
(within Python's floating point precision) is returned. If no argument is
given, returns ``0.0``.
.. note:: >
When passing in a string, values for NaN and Infinity may be returned, depending
on the underlying C library. Float accepts the strings nan, inf and -inf for
NaN and positive or negative infinity. Case is ignored, as is a leading +; a
leading - is also ignored for NaN. Float always represents NaN and infinity
as nan, inf or -inf.
<
The float type is described in typesnumeric.
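A few conversions that follow the rules above, as an informal sketch:

```python
padded = float("  -3.5 ")      # surrounding whitespace is allowed
pos_inf = float("inf")
neg_inf = float("-INF")        # case is ignored
not_a_number = float("nan")    # NaN compares unequal to itself
empty = float()                # no argument gives 0.0
```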
format(value[, format_spec])~
Convert a {value} to a "formatted" representation, as controlled by
{format_spec}. The interpretation of {format_spec} will depend on the type
of the {value} argument, however there is a standard formatting syntax that
is used by most built-in types: formatspec.
.. note:: >
``format(value, format_spec)`` merely calls
``value.__format__(format_spec)``.
<
.. versionadded:: 2.6
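A brief sketch of both sides of this protocol. The Celsius class is
hypothetical and only shows how a type can supply its own __format__:

```python
class Celsius(object):
    """Hypothetical type used only to illustrate __format__."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, format_spec):
        # Delegate to the float formatting rules, then add a unit suffix.
        return format(self.degrees, format_spec) + " C"


hex_text = format(255, "x")               # 'ff'
fixed = format(3.14159, ".2f")            # '3.14'
custom = format(Celsius(21.456), ".1f")   # '21.5 C'
```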
frozenset([iterable])~
Return a frozenset object, optionally with elements taken from {iterable}.
The frozenset type is described in types-set.
For other containers see the built in dict, list, and
tuple classes, and the collections (|py2stdlib-collections|) module.
.. versionadded:: 2.4
getattr(object, name[, default])~
Return the value of the named attribute of {object}. {name} must be a string.
If the string is the name of one of the object's attributes, the result is the
value of that attribute. For example, ``getattr(x, 'foobar')`` is equivalent to
``x.foobar``. If the named attribute does not exist, {default} is returned if
provided, otherwise AttributeError is raised.
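The three outcomes can be sketched like this (Point is a hypothetical class
with a single attribute):

```python
class Point(object):
    def __init__(self):
        self.x = 3


p = Point()

present = getattr(p, "x")        # same as p.x
fallback = getattr(p, "y", 0)    # missing attribute, default returned

try:
    getattr(p, "y")              # missing attribute, no default
    raised = False
except AttributeError:
    raised = True
```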
globals()~
Return a dictionary representing the current global symbol table. This is always
the dictionary of the current module (inside a function or method, this is the
module where it is defined, not the module from which it is called).
hasattr(object, name)~
The arguments are an object and a string. The result is ``True`` if the string
is the name of one of the object's attributes, ``False`` if not. (This is
implemented by calling ``getattr(object, name)`` and seeing whether it raises an
exception or not.)
hash(object)~
Return the hash value of the object (if it has one). Hash values are integers.
They are used to quickly compare dictionary keys during a dictionary lookup.
Numeric values that compare equal have the same hash value (even if they are of
different types, as is the case for 1 and 1.0).
help([object])~
Invoke the built-in help system. (This function is intended for interactive
use.) If no argument is given, the interactive help system starts on the
interpreter console. If the argument is a string, then the string is looked up
as the name of a module, function, class, method, keyword, or documentation
topic, and a help page is printed on the console. If the argument is any other
kind of object, a help page on the object is generated.
This function is added to the built-in namespace by the site (|py2stdlib-site|) module.
.. versionadded:: 2.2
hex(x)~
Convert an integer number (of any size) to a hexadecimal string. The result is a
valid Python expression.
.. note:: >
To obtain a hexadecimal string representation for a float, use the
float.hex method.
<
.. versionchanged:: 2.4
Formerly only returned an unsigned literal.
id(object)~
Return the "identity" of an object. This is an integer (or long integer) which
is guaranteed to be unique and constant for this object during its lifetime.
Two objects with non-overlapping lifetimes may have the same id
value.
.. impl-detail:: This is the address of the object.
input([prompt])~
Equivalent to ``eval(raw_input(prompt))``.
.. warning:: >
This function is not safe from user errors! It expects a valid Python
expression as input; if the input is not syntactically valid, a
SyntaxError will be raised. Other exceptions may be raised if there is an
error during evaluation. (On the other hand, sometimes this is exactly what you
need when writing a quick script for expert use.)
<
If the readline (|py2stdlib-readline|) module was loaded, then input will use it to
provide elaborate line editing and history features.
Consider using the raw_input function for general input from users.
int([x[, base]])~
Convert a string or number to a plain integer. If the argument is a string,
it must contain a possibly signed decimal number representable as a Python
integer, possibly embedded in whitespace. The {base} parameter gives the
base for the conversion (which is 10 by default) and may be any integer in
the range [2, 36], or zero. If {base} is zero, the proper radix is
determined based on the contents of string; the interpretation is the same as
for integer literals (see the numeric literals section of the language
reference). If {base} is specified and {x}
is not a string, TypeError is raised. Otherwise, the argument may be a
plain or long integer or a floating point number. Conversion of floating
point numbers to integers truncates (towards zero). If the argument is
outside the integer range a long object will be returned instead. If no
arguments are given, returns ``0``.
The integer type is described in typesnumeric.
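Some conversions that exercise the rules above, offered as a sketch:

```python
decimal = int("  -42 ")         # whitespace and sign are accepted
hexadecimal = int("ff", 16)     # explicit base
inferred = int("0x1f", 0)       # base 0: radix taken from the literal prefix
toward_zero = (int(3.9), int(-3.9))

try:
    int(12, 10)                 # a base with a non-string argument
    rejected = False
except TypeError:
    rejected = True
```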
isinstance(object, classinfo)~
Return true if the {object} argument is an instance of the {classinfo} argument,
or of a (direct or indirect) subclass thereof. Also return true if {classinfo}
is a type object (new-style class) and {object} is an object of that type or of
a (direct or indirect) subclass thereof. If {object} is not a class instance or
an object of the given type, the function always returns false. If {classinfo}
is neither a class object nor a type object, it may be a tuple of class or type
objects, or may recursively contain other such tuples (other sequence types are
not accepted). If {classinfo} is not a class, type, or tuple of classes, types,
and such tuples, a TypeError exception is raised.
.. versionchanged:: 2.2
Support for a tuple of type information was added.
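A short sketch of the tuple forms of {classinfo}, including a nested tuple and
a rejected list:

```python
is_number = isinstance(3, (int, float))
nested = isinstance("a", (int, (float, str)))   # tuples may nest
no_match = isinstance(3, (str, list))

try:
    isinstance(3, [int, float])   # other sequence types are not accepted
    rejected = False
except TypeError:
    rejected = True
```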
issubclass(class, classinfo)~
Return true if {class} is a subclass (direct or indirect) of {classinfo}. A
class is considered a subclass of itself. {classinfo} may be a tuple of class
objects, in which case every entry in {classinfo} will be checked. In any other
case, a TypeError exception is raised.
.. versionchanged:: 2.3
Support for a tuple of type information was added.
iter(o[, sentinel])~
Return an iterator object. The first argument is interpreted very differently
depending on the presence of the second argument. Without a second argument, {o}
must be a collection object which supports the iteration protocol (the
__iter__ method), or it must support the sequence protocol (the
__getitem__ method with integer arguments starting at ``0``). If it
does not support either of those protocols, TypeError is raised. If the
second argument, {sentinel}, is given, then {o} must be a callable object. The
iterator created in this case will call {o} with no arguments for each call to
its iterator.next method; if the value returned is equal to {sentinel},
StopIteration will be raised, otherwise the value will be returned.
One useful application of the second form of iter is to read lines of
a file until a certain line is reached. The following example reads a file
until the readline method returns an empty string:: >
    with open("mydata.txt") as fp:
        for line in iter(fp.readline, ""):
            process_line(line)
<
.. versionadded:: 2.2
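The sentinel form works with any zero-argument callable, not only with file
methods. In this sketch read_chunk is a hypothetical callable that hands out
items from a list:

```python
chunks = ["alpha", "beta", "STOP", "gamma"]
source = iter(chunks)


def read_chunk():
    """Hypothetical callable: returns the next pending chunk."""
    return next(source)


# Calls read_chunk() repeatedly and stops once it returns the sentinel.
collected = list(iter(read_chunk, "STOP"))
```

The sentinel itself is not included in the result, and items after it are left
unread.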
len(s)~
Return the length (the number of items) of an object. The argument may be a
sequence (string, tuple or list) or a mapping (dictionary).
list([iterable])~
Return a list whose items are the same and in the same order as {iterable}'s
items. {iterable} may be either a sequence, a container that supports
iteration, or an iterator object. If {iterable} is already a list, a copy is
made and returned, similar to ``iterable[:]``. For instance, ``list('abc')``
returns ``['a', 'b', 'c']`` and ``list( (1, 2, 3) )`` returns ``[1, 2, 3]``. If
no argument is given, returns a new empty list, ``[]``.
list is a mutable sequence type, as documented in
typesseq. For other containers see the built in dict,
set, and tuple classes, and the collections (|py2stdlib-collections|) module.
locals()~
Update and return a dictionary representing the current local symbol table.
Free variables are returned by locals when it is called in function
blocks, but not in class blocks.
.. note:: >
The contents of this dictionary should not be modified; changes may not
affect the values of local and free variables used by the interpreter.
<
long([x[, base]])~
Convert a string or number to a long integer. If the argument is a string, it
must contain a possibly signed number of arbitrary size, possibly embedded in
whitespace. The {base} argument is interpreted in the same way as for
int, and may only be given when {x} is a string. Otherwise, the argument
may be a plain or long integer or a floating point number, and a long integer
with the same value is returned. Conversion of floating point numbers to
integers truncates (towards zero). If no arguments are given, returns ``0L``.
The long type is described in typesnumeric.
map(function, iterable, ...)~
Apply {function} to every item of {iterable} and return a list of the results.
If additional {iterable} arguments are passed, {function} must take that many
arguments and is applied to the items from all iterables in parallel. If one
iterable is shorter than another it is assumed to be extended with ``None``
items. If {function} is ``None``, the identity function is assumed; if there
are multiple arguments, map returns a list consisting of tuples
containing the corresponding items from all iterables (a kind of transpose
operation). The {iterable} arguments may be a sequence or any iterable object;
the result is always a list.
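A sketch with one and with two iterables. The iterables have equal length, so
the ``None`` padding rule does not come into play. The ``list()`` wrapper is
redundant in Python 2.7 and is included only as an assumption for portability:

```python
squares = list(map(lambda n: n * n, [1, 2, 3]))

# With two iterables, the function receives one item from each in parallel.
powers = list(map(pow, [2, 3, 4], [3, 2, 1]))
```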
max(iterable[, args...][key])~
With a single argument {iterable}, return the largest item of a non-empty
iterable (such as a string, tuple or list). With more than one argument, return
the largest of the arguments.
The optional {key} argument specifies a one-argument ordering function like that
used for list.sort. The {key} argument, if supplied, must be in keyword
form (for example, ``max(a,b,c,key=func)``).
.. versionchanged:: 2.5
Added support for the optional {key} argument.
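A brief sketch contrasting the default ordering with a {key} function:

```python
words = ["fig", "banana", "apple"]

largest = max(words)             # default string ordering
longest = max(words, key=len)    # ordering by length
of_args = max(3, 9, 4)           # several positional arguments
```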
memoryview(obj)~
Return a "memory view" object created from the given argument. See
typememoryview for more information.
min(iterable[, args...][key])~
With a single argument {iterable}, return the smallest item of a non-empty
iterable (such as a string, tuple or list). With more than one argument, return
the smallest of the arguments.
The optional {key} argument specifies a one-argument ordering function like that
used for list.sort. The {key} argument, if supplied, must be in keyword
form (for example, ``min(a,b,c,key=func)``).
.. versionchanged:: 2.5
Added support for the optional {key} argument.
next(iterator[, default])~
Retrieve the next item from the {iterator} by calling its
iterator.next method. If {default} is given, it is returned if the
iterator is exhausted, otherwise StopIteration is raised.
.. versionadded:: 2.6
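A small sketch of both behaviours once the iterator is exhausted:

```python
it = iter([1, 2])

first = next(it)
second = next(it)
after_end = next(it, "done")   # exhausted: the default is returned

try:
    next(it)                   # exhausted and no default
    stopped = False
except StopIteration:
    stopped = True
```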
object()~
Return a new featureless object. object is a base for all new style
classes. It has the methods that are common to all instances of new style
classes.
.. versionadded:: 2.2
.. versionchanged:: 2.3
This function does not accept any arguments. Formerly, it accepted arguments but
ignored them.
oct(x)~
Convert an integer number (of any size) to an octal string. The result is a
valid Python expression.
.. versionchanged:: 2.4
Formerly only returned an unsigned literal.
open(filename[, mode[, bufsize]])~
Open a file, returning an object of the file type described in
section bltin-file-objects. If the file cannot be opened,
IOError is raised. When opening a file, it's preferable to use
open instead of invoking the file constructor directly.
The first two arguments are the same as for ``stdio``'s fopen:
{filename} is the file name to be opened, and {mode} is a string indicating how
the file is to be opened.
The most commonly-used values of {mode} are ``'r'`` for reading, ``'w'`` for
writing (truncating the file if it already exists), and ``'a'`` for appending
(which on {some} Unix systems means that {all} writes append to the end of the
file regardless of the current seek position). If {mode} is omitted, it
defaults to ``'r'``. The default is to use text mode, which may convert
``'\n'`` characters to a platform-specific representation on writing and back
on reading. Thus, when opening a binary file, you should append ``'b'`` to
the {mode} value to open the file in binary mode, which will improve
portability. (Appending ``'b'`` is useful even on systems that don't treat
binary and text files differently, where it serves as documentation.) See below
for more possible values of {mode}.
The optional {bufsize} argument specifies the file's desired buffer size: 0
means unbuffered, 1 means line buffered, any other positive value means use a
buffer of (approximately) that size. A negative {bufsize} means to use the
system default, which is usually line buffered for tty devices and fully
buffered for other files. If omitted, the system default is used. [#]_
Modes ``'r+'``, ``'w+'`` and ``'a+'`` open the file for updating (note that
``'w+'`` truncates the file). Append ``'b'`` to the mode to open the file in
binary mode, on systems that differentiate between binary and text files; on
systems that don't have this distinction, adding the ``'b'`` has no effect.
In addition to the standard fopen values {mode} may be ``'U'`` or
``'rU'``. Python is usually built with universal newline support; supplying
``'U'`` opens the file as a text file, but lines may be terminated by any of the
following: the Unix end-of-line convention ``'\n'``, the Macintosh convention
``'\r'``, or the Windows convention ``'\r\n'``. All of these external
representations are seen as ``'\n'`` by the Python program. If Python is built
without universal newline support a {mode} with ``'U'`` is the same as normal
text mode. Note that file objects so opened also have an attribute called
newlines which has a value of ``None`` (if no newlines have yet been
seen), ``'\n'``, ``'\r'``, ``'\r\n'``, or a tuple containing all the newline
types seen.
Python enforces that the mode, after stripping ``'U'``, begins with ``'r'``,
``'w'`` or ``'a'``.
Python provides many file handling modules including
fileinput (|py2stdlib-fileinput|), os (|py2stdlib-os|), os.path (|py2stdlib-os.path|), tempfile (|py2stdlib-tempfile|), and
shutil (|py2stdlib-shutil|).
.. versionchanged:: 2.5
Restriction on first letter of mode string introduced.
ord(c)~
Given a string of length one, return an integer representing the Unicode code
point of the character when the argument is a unicode object, or the value of
the byte when the argument is an 8-bit string. For example, ``ord('a')`` returns
the integer ``97``, ``ord(u'\u2020')`` returns ``8224``. This is the inverse of
chr for 8-bit strings and of unichr for unicode objects. If a
unicode argument is given and Python was built with UCS2 Unicode, then the
character's code point must be in the range [0..65535] inclusive; otherwise the
string length is two, and a TypeError will be raised.
pow(x, y[, z])~
Return {x} to the power {y}; if {z} is present, return {x} to the power {y},
modulo {z} (computed more efficiently than ``pow(x, y) % z``). The two-argument
form ``pow(x, y)`` is equivalent to using the power operator: ``x**y``.
The arguments must have numeric types. With mixed operand types, the coercion
rules for binary arithmetic operators apply. For int and long int operands, the
result has the same type as the operands (after coercion) unless the second
argument is negative; in that case, all arguments are converted to float and a
float result is delivered. For example, ``10**2`` returns ``100``, but
``10**-2`` returns ``0.01``. (This last feature was added in Python 2.2. In
Python 2.1 and before, if both arguments were of integer types and the second
argument was negative, an exception was raised.) If the second argument is
negative, the third argument must be omitted. If {z} is present, {x} and {y}
must be of integer types, and {y} must be non-negative. (This restriction was
added in Python 2.2. In Python 2.1 and before, floating 3-argument ``pow()``
returned platform-dependent results depending on floating-point rounding
accidents.)
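The two- and three-argument forms can be compared in a short sketch:

```python
squared = pow(2, 10)             # same as 2 ** 10
modular = pow(2, 10, 1000)       # modular exponentiation
same_result = pow(2, 10) % 1000  # equal value, computed less efficiently
fractional = pow(10, -2)         # negative exponent gives a float
```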
print([object, ...][, sep=' '][, end='\\n'][, file=sys.stdout])~
Print {object}(s) to the stream {file}, separated by {sep} and followed by
{end}. {sep}, {end} and {file}, if present, must be given as keyword
arguments.
All non-keyword arguments are converted to strings as str does and
written to the stream, separated by {sep} and followed by {end}. Both {sep}
and {end} must be strings; they can also be ``None``, which means to use the
default values. If no {object} is given, print will just write
{end}.
The {file} argument must be an object with a ``write(string)`` method; if it
is not present or ``None``, sys.stdout will be used.
.. note:: >
This function is not normally available as a built-in since the name
``print`` is recognized as the print statement. To disable the
statement and use the print function, use this future statement at
the top of your module::
from __future__ import print_function
<
.. versionadded:: 2.6
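The keyword arguments can be illustrated with a small sketch. ``Collector`` is a hypothetical class, not part of the library; it only exists to show that any object with a ``write(string)`` method is accepted as {file}.

```python
# Under Python 2.6/2.7 the module must begin with the future statement
#     from __future__ import print_function
# before this code; an interpreter where print is already a function
# needs no such line.

class Collector(object):
    """Hypothetical file-like object: it only provides write(string)."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)


out = Collector()
print('a', 'b', 'c', sep='-', end='!\n', file=out)
print(file=out)               # no objects given: only end is written
result = ''.join(out.chunks)
```

Here ``result`` holds everything that was written, so the effect of {sep} and {end} can be inspected directly.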
property([fget[, fset[, fdel[, doc]]]])~
Return a property attribute for new-style classes (classes that
derive from object).
{fget} is a function for getting an attribute value, likewise {fset} is a
function for setting, and {fdel} a function for del'ing, an attribute. Typical
use is to define a managed attribute x:: >
    class C(object):
        def __init__(self):
            self._x = None
        def getx(self):
            return self._x
        def setx(self, value):
            self._x = value
        def delx(self):
            del self._x
        x = property(getx, setx, delx, "I'm the 'x' property.")
<
If given, {doc} will be the docstring of the property attribute. Otherwise, the
property will copy {fget}'s docstring (if it exists). This makes it possible to
create read-only properties easily using property as a decorator:: >
    class Parrot(object):
        def __init__(self):
            self._voltage = 100000
        @property
        def voltage(self):
            """Get the current voltage."""
            return self._voltage
<
turns the voltage method into a "getter" for a read-only attribute
with the same name.
A property object has getter, setter, and deleter
methods usable as decorators that create a copy of the property with the
corresponding accessor function set to the decorated function. This is
best explained with an example:: >
    class C(object):
        def __init__(self):
            self._x = None
        @property
        def x(self):
            """I'm the 'x' property."""
            return self._x
        @x.setter
        def x(self, value):
            self._x = value
        @x.deleter
        def x(self):
            del self._x
<
This code is exactly equivalent to the first example. Be sure to give the
additional functions the same name as the original property (``x`` in this
case.)
The returned property also has the attributes ``fget``, ``fset``, and
``fdel`` corresponding to the constructor arguments.
.. versionadded:: 2.2
.. versionchanged:: 2.5
Use {fget}'s docstring if no {doc} given.
.. versionchanged:: 2.6
The ``getter``, ``setter``, and ``deleter`` attributes were added.
range([start,] stop[, step])~
This is a versatile function to create lists containing arithmetic progressions.
It is most often used in for loops. The arguments must be plain
integers. If the {step} argument is omitted, it defaults to ``1``. If the
{start} argument is omitted, it defaults to ``0``. The full form returns a list
of plain integers ``[start, start + step, start + 2 * step, ...]``. If {step}
is positive, the last element is the largest ``start + i * step`` less than
{stop}; if {step} is negative, the last element is the smallest ``start + i *
step`` greater than {stop}. {step} must not be zero (or else ValueError
is raised). Example:
>>> range(10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> range(1, 11)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> range(0, 30, 5)
[0, 5, 10, 15, 20, 25]
>>> range(0, 10, 3)
[0, 3, 6, 9]
>>> range(0, -10, -1)
[0, -1, -2, -3, -4, -5, -6, -7, -8, -9]
>>> range(0)
[]
>>> range(1, 0)
[]
raw_input([prompt])~
If the {prompt} argument is present, it is written to standard output without a
trailing newline. The function then reads a line from input, converts it to a
string (stripping a trailing newline), and returns that. When EOF is read,
EOFError is raised. Example:: >
>>> s = raw_input('--> ')
--> Monty Python's Flying Circus
>>> s
"Monty Python's Flying Circus"
<
If the readline (|py2stdlib-readline|) module was loaded, then raw_input will use it to
provide elaborate line editing and history features.
reduce(function, iterable[, initializer])~
Apply {function} of two arguments cumulatively to the items of {iterable}, from
left to right, so as to reduce the iterable to a single value. For example,
``reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])`` calculates ``((((1+2)+3)+4)+5)``.
The left argument, {x}, is the accumulated value and the right argument, {y}, is
the update value from the {iterable}. If the optional {initializer} is present,
it is placed before the items of the iterable in the calculation, and serves as
a default when the iterable is empty. If {initializer} is not given and
{iterable} contains only one item, the first item is returned.
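The role of {initializer} can be sketched as follows. The import from functools (available from Python 2.6) is used so the sketch does not depend on the built-in name. The final case, an empty iterable without an initializer raising TypeError, is not stated above but is the interpreter's behaviour.

```python
from functools import reduce
import operator

total = reduce(operator.add, [1, 2, 3, 4, 5])      # ((((1+2)+3)+4)+5)
with_init = reduce(operator.add, [1, 2, 3], 100)   # (((100+1)+2)+3)
empty = reduce(operator.add, [], 0)                # initializer is the default
single = reduce(operator.add, [42])                # function is never called

# Without an initializer, an empty iterable has nothing to return.
try:
    reduce(operator.add, [])
except TypeError:
    no_default = True
else:
    no_default = False
```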
reload(module)~
Reload a previously imported {module}. The argument must be a module object, so
it must have been successfully imported before. This is useful if you have
edited the module source file using an external editor and want to try out the
new version without leaving the Python interpreter. The return value is the
module object (the same as the {module} argument).
When ``reload(module)`` is executed:
* Python modules' code is recompiled and the module-level code reexecuted,
defining a new set of objects which are bound to names in the module's
dictionary. The ``init`` function of extension modules is not called a second
time.
* As with all other objects in Python the old objects are only reclaimed after
their reference counts drop to zero.
* The names in the module namespace are updated to point to any new or changed
objects.
* Other references to the old objects (such as names external to the module) are
not rebound to refer to the new objects and must be updated in each namespace
where they occur if that is desired.
There are a number of other caveats:
If a module is syntactically correct but its initialization fails, the first
import statement for it does not bind its name locally, but does
store a (partially initialized) module object in ``sys.modules``. To reload the
module you must first import it again (this will bind the name to the
partially initialized module object) before you can reload it.
When a module is reloaded, its dictionary (containing the module's global
variables) is retained. Redefinitions of names will override the old
definitions, so this is generally not a problem. If the new version of a module
does not define a name that was defined by the old version, the old definition
remains. This feature can be used to the module's advantage if it maintains a
global table or cache of objects --- with a try statement it can test
for the table's presence and skip its initialization if desired:: >
    try:
        cache
    except NameError:
        cache = {}
<
It is legal though generally not very useful to reload built-in or dynamically
loaded modules, except for sys (|py2stdlib-sys|), __main__ (|py2stdlib-__main__|) and __builtin__ (|py2stdlib-builtin|).
In many cases, however, extension modules are not designed to be initialized
more than once, and may fail in arbitrary ways when reloaded.
If a module imports objects from another module using ``from ... import ...``,
calling reload for the other module does not
redefine the objects imported from it --- one way around this is to re-execute
the ``from`` statement, another is to use ``import`` and qualified
names ({module}.{name}) instead.
If a module instantiates instances of a class, reloading the module that defines
the class does not affect the method definitions of the instances --- they
continue to use the old class definition. The same is true for derived classes.
repr(object)~
Return a string containing a printable representation of an object. This is
the same value yielded by conversions (reverse quotes). It is sometimes
useful to be able to access this operation as an ordinary function. For many
types, this function makes an attempt to return a string that would yield an
object with the same value when passed to eval, otherwise the
representation is a string enclosed in angle brackets that contains the name
of the type of the object together with additional information often
including the name and address of the object. A class can control what this
function returns for its instances by defining a __repr__ method.
reversed(seq)~
Return a reverse iterator. {seq} must be an object which has
a __reversed__ method or supports the sequence protocol (the
__len__ method and the __getitem__ method with integer
arguments starting at ``0``).
.. versionadded:: 2.4
.. versionchanged:: 2.6
Added the possibility to write a custom __reversed__ method.
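Both ways of supporting reversed can be sketched with two small classes. ``Countdown`` and ``Evens`` are hypothetical names invented for this illustration.

```python
class Countdown(object):
    """Hypothetical sequence: __len__ plus __getitem__ with indices from 0."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        if not 0 <= index < self.n:
            raise IndexError(index)
        return self.n - index


class Evens(object):
    """Hypothetical container that supplies its own __reversed__ method."""

    def __init__(self, count):
        self.count = count

    def __iter__(self):
        return iter(range(0, 2 * self.count, 2))

    def __reversed__(self):
        return iter(range(2 * self.count - 2, -2, -2))


via_protocol = list(reversed(Countdown(3)))   # uses __len__ and __getitem__
via_method = list(reversed(Evens(3)))         # uses __reversed__
```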
round(x[, n])~
Return the floating point value {x} rounded to {n} digits after the decimal
point. If {n} is omitted, it defaults to zero. The result is a floating point
number. Values are rounded to the closest multiple of 10 to the power minus
{n}; if two multiples are equally close, rounding is done away from 0 (so, for
example, ``round(0.5)`` is ``1.0`` and ``round(-0.5)`` is ``-1.0``).
set([iterable])~
Return a new set, optionally with elements taken from {iterable}.
The set type is described in types-set.
For other containers see the built in dict, list, and
tuple classes, and the collections (|py2stdlib-collections|) module.
.. versionadded:: 2.4
setattr(object, name, value)~
This is the counterpart of getattr. The arguments are an object, a
string and an arbitrary value. The string may name an existing attribute or a
new attribute. The function assigns the value to the attribute, provided the
object allows it. For example, ``setattr(x, 'foobar', 123)`` is equivalent to
``x.foobar = 123``.
slice([start,] stop[, step])~
.. index:: single: Numerical Python
Return a slice object representing the set of indices specified by
``range(start, stop, step)``. The {start} and {step} arguments default to
``None``. Slice objects have read-only data attributes start,
stop and step which merely return the argument values (or their
default). They have no other explicit functionality; however they are used by
Numerical Python and other third party extensions. Slice objects are also
generated when extended indexing syntax is used. For example:
``a[start:stop:step]`` or ``a[start:stop, i]``. See itertools.islice
for an alternate version that returns an iterator.
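To see which slice objects the indexing syntax generates, a sketch can use a class whose __getitem__ simply returns its argument. ``ShowIndex`` is a hypothetical helper for illustration only.

```python
class ShowIndex(object):
    """Hypothetical helper whose __getitem__ returns the index it was given."""

    def __getitem__(self, index):
        return index


a = ShowIndex()
extended = a[1:10:2]        # a single slice object
mixed = a[1:5, 3]           # a tuple holding a slice and an integer

# A slice object can also be built explicitly and used as an index.
letters = 'abcdefghij'[slice(1, 10, 2)]

# With one argument, start and step keep their default of None.
only_stop = slice(5)
```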
sorted(iterable[, cmp[, key[, reverse]]])~
Return a new sorted list from the items in {iterable}.
The optional arguments {cmp}, {key}, and {reverse} have the same meaning as
those for the list.sort method (described in section
typesseq-mutable).
{cmp} specifies a custom comparison function of two arguments (iterable
elements) which should return a negative, zero or positive number depending on
whether the first argument is considered smaller than, equal to, or larger than
the second argument: ``cmp=lambda x,y: cmp(x.lower(), y.lower())``. The default
value is ``None``.
{key} specifies a function of one argument that is used to extract a comparison
key from each list element: ``key=str.lower``. The default value is ``None``
(compare the elements directly).
{reverse} is a boolean value. If set to ``True``, then the list elements are
sorted as if each comparison were reversed.
In general, the {key} and {reverse} conversion processes are much faster
than specifying an equivalent {cmp} function. This is because {cmp} is
called multiple times for each list element while {key} and {reverse} touch
each element only once. Use functools.cmp_to_key to convert an
old-style {cmp} function to a {key} function.
For sorting examples and a brief sorting tutorial, see the Sorting HowTo
<http://wiki.python.org/moin/HowTo/Sorting/>.
.. versionadded:: 2.4
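The relationship between {key}, {reverse} and an old-style comparison function can be sketched as below. ``compare_lower`` is a hypothetical comparison function written out by hand, and functools.cmp_to_key is assumed to be available (Python 2.7 or later).

```python
from functools import cmp_to_key

words = ['banana', 'Cherry', 'apple']

plain = sorted(words)                          # uppercase sorts before lowercase
by_key = sorted(words, key=str.lower)          # case-insensitive
descending = sorted(words, key=str.lower, reverse=True)


def compare_lower(x, y):
    """Old-style comparison: negative, zero or positive result."""
    a, b = x.lower(), y.lower()
    return (a > b) - (a < b)


# Wrap the comparison function so it can be passed as a key function.
by_cmp = sorted(words, key=cmp_to_key(compare_lower))
```

The original list is left untouched; each call returns a new list.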
staticmethod(function)~
Return a static method for {function}.
A static method does not receive an implicit first argument. To declare a static
method, use this idiom:: >
    class C:
        @staticmethod
        def f(arg1, arg2, ...): ...
<
The ``@staticmethod`` form is a function decorator -- see the
description of function definitions in function for details.
It can be called either on the class (such as ``C.f()``) or on an instance (such
as ``C().f()``). The instance is ignored except for its class.
Static methods in Python are similar to those found in Java or C++. For a more
advanced concept, see classmethod in this section.
For more information on static methods, consult the documentation on the
standard type hierarchy in types (|py2stdlib-types|).
.. versionadded:: 2.2
.. versionchanged:: 2.4
Function decorator syntax added.
str([object])~
Return a string containing a nicely printable representation of an object. For
strings, this returns the string itself. The difference with ``repr(object)``
is that ``str(object)`` does not always attempt to return a string that is
acceptable to eval; its goal is to return a printable string. If no
argument is given, returns the empty string, ``''``.
For more information on strings see typesseq which describes sequence
functionality (strings are sequences), and also the string-specific methods
described in the string-methods section. To output formatted strings
use template strings or the ``%`` operator described in the
string-formatting section. In addition see the stringservices
section. See also unicode.
sum(iterable[, start])~
Sums {start} and the items of an {iterable} from left to right and returns the
total. {start} defaults to ``0``. The {iterable}'s items are normally numbers,
and are not allowed to be strings. The fast, correct way to concatenate a
sequence of strings is by calling ``''.join(sequence)``. Note that
``sum(range(n), m)`` is equivalent to ``reduce(operator.add, range(n), m)``.
To add floating point values with extended precision, see math.fsum.
.. versionadded:: 2.3
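A short sketch of the alternatives mentioned above: joining strings, using a {start} value, and comparing sum with math.fsum on floats. The sample data is arbitrary.

```python
import math

# Strings are rejected by sum(); join them instead.
parts = ['spam', 'eggs', 'ham']
joined = ''.join(parts)
try:
    sum(parts, '')
except TypeError:
    strings_rejected = True
else:
    strings_rejected = False

# The start value is added to the items.
shifted = sum(range(5), 100)          # 100 + 0 + 1 + 2 + 3 + 4

# Plain summation accumulates floating point rounding error.
values = [0.1] * 10
naive_total = sum(values)
precise_total = math.fsum(values)
```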
super(type[, object-or-type])~
Return a proxy object that delegates method calls to a parent or sibling
class of {type}. This is useful for accessing inherited methods that have
been overridden in a class. The search order is the same as that used by
getattr except that the {type} itself is skipped.
The __mro__ attribute of the {type} lists the method resolution
search order used by both getattr and super. The attribute
is dynamic and can change whenever the inheritance hierarchy is updated.
If the second argument is omitted, the super object returned is unbound. If
the second argument is an object, ``isinstance(obj, type)`` must be true. If
the second argument is a type, ``issubclass(type2, type)`` must be true (this
is useful for classmethods).
.. note::
super only works for new-style classes.
There are two typical use cases for {super}. In a class hierarchy with
single inheritance, {super} can be used to refer to parent classes without
naming them explicitly, thus making the code more maintainable. This use
closely parallels the use of {super} in other programming languages.
The second use case is to support cooperative multiple inheritance in a
dynamic execution environment. This use case is unique to Python and is
not found in statically compiled languages or languages that only support
single inheritance. This makes it possible to implement "diamond diagrams"
where multiple base classes implement the same method. Good design dictates
that this method have the same calling signature in every case (because the
order of calls is determined at runtime, because that order adapts
to changes in the class hierarchy, and because that order can include
sibling classes that are unknown prior to runtime).
For both use cases, a typical superclass call looks like this:: >
    class C(B):
        def method(self, arg):
            super(C, self).method(arg)
<
Note that super is implemented as part of the binding process for
explicit dotted attribute lookups such as ``super().__getitem__(name)``.
It does so by implementing its own __getattribute__ method for searching
classes in a predictable order that supports cooperative multiple inheritance.
Accordingly, super is undefined for implicit lookups using statements or
operators such as ``super()[name]``.
Also note that super is not limited to use inside methods. The two
argument form specifies the arguments exactly and makes the appropriate
references.
.. versionadded:: 2.2
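The cooperative multiple inheritance case can be sketched with a small diamond. The class names and the ``setup`` method are hypothetical; the ``calls`` list only records the order in which the methods run.

```python
calls = []


class Base(object):
    def setup(self):
        calls.append('Base')


class Left(Base):
    def setup(self):
        calls.append('Left')
        super(Left, self).setup()     # next class in the MRO, not always Base


class Right(Base):
    def setup(self):
        calls.append('Right')
        super(Right, self).setup()


class Child(Left, Right):
    def setup(self):
        calls.append('Child')
        super(Child, self).setup()


Child().setup()
mro_names = [cls.__name__ for cls in Child.__mro__]
```

For a ``Child`` instance, the super call inside ``Left`` reaches ``Right`` (a sibling class), which is why every method in the diamond should share one calling signature.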
tuple([iterable])~
Return a tuple whose items are the same and in the same order as {iterable}'s
items. {iterable} may be a sequence, a container that supports iteration, or an
iterator object. If {iterable} is already a tuple, it is returned unchanged.
For instance, ``tuple('abc')`` returns ``('a', 'b', 'c')`` and ``tuple([1, 2,
3])`` returns ``(1, 2, 3)``. If no argument is given, returns a new empty
tuple, ``()``.
tuple is an immutable sequence type, as documented in
typesseq. For other containers see the built in dict,
list, and set classes, and the collections (|py2stdlib-collections|) module.
type(object)~
.. index:: object: type
Return the type of an {object}. The return value is a type object. The
isinstance built-in function is recommended for testing the type of an
object.
With three arguments, type functions as a constructor as detailed below.
type(name, bases, dict)~
Return a new type object. This is essentially a dynamic form of the
class statement. The {name} string is the class name and becomes the
__name__ attribute; the {bases} tuple itemizes the base classes and
becomes the __bases__ attribute; and the {dict} dictionary is the
namespace containing definitions for class body and becomes the __dict__
attribute. For example, the following two statements create identical
type objects:
>>> class X(object):
...     a = 1
...
>>> X = type('X', (object,), dict(a=1))
.. versionadded:: 2.2
unichr(i)~
Return the Unicode string of one character whose Unicode code is the integer
{i}. For example, ``unichr(97)`` returns the string ``u'a'``. This is the
inverse of ord for Unicode strings. The valid range for the argument
depends on how Python was configured -- it may be either UCS2 [0..0xFFFF] or UCS4
[0..0x10FFFF]. ValueError is raised otherwise. For ASCII and 8-bit
strings see chr.
.. versionadded:: 2.0
unicode([object[, encoding [, errors]]])~
Return the Unicode string version of {object} using one of the following modes:
If {encoding} and/or {errors} are given, ``unicode()`` will decode the object
which can either be an 8-bit string or a character buffer using the codec for
{encoding}. The {encoding} parameter is a string giving the name of an encoding;
if the encoding is not known, LookupError is raised. Error handling is
done according to {errors}; this specifies the treatment of characters which are
invalid in the input encoding. If {errors} is ``'strict'`` (the default), a
ValueError is raised on errors, while a value of ``'ignore'`` causes
errors to be silently ignored, and a value of ``'replace'`` causes the official
Unicode replacement character, ``U+FFFD``, to be used to replace input
characters which cannot be decoded. See also the codecs (|py2stdlib-codecs|) module.
If no optional parameters are given, ``unicode()`` will mimic the behaviour of
``str()`` except that it returns Unicode strings instead of 8-bit strings. More
precisely, if {object} is a Unicode string or subclass it will return that
Unicode string without any additional decoding applied.
For objects which provide a __unicode__ method, it will call this method
without arguments to create a Unicode string. For all other objects, the 8-bit
string version or representation is requested and then converted to a Unicode
string using the codec for the default encoding in ``'strict'`` mode.
For more information on Unicode strings see typesseq which describes
sequence functionality (Unicode strings are sequences), and also the
string-specific methods described in the string-methods section. To
output formatted strings use template strings or the ``%`` operator described
in the string-formatting section. In addition see the
stringservices section. See also str.
.. versionadded:: 2.0
.. versionchanged:: 2.2
Support for __unicode__ added.
vars([object])~
Without an argument, act like locals.
With a module, class or class instance object as argument (or anything else that
has a __dict__ attribute), return that attribute.
.. note:: >
The returned dictionary should not be modified:
the effects on the corresponding symbol table are undefined. [#]_
<
xrange([start,] stop[, step])~
This function is very similar to range, but returns an "xrange object"
instead of a list. This is an opaque sequence type which yields the same values
as the corresponding list, without actually storing them all simultaneously.
The advantage of xrange over range is minimal (since
xrange still has to create the values when asked for them) except when a
very large range is used on a memory-starved machine or when all of the range's
elements are never used (such as when the loop is usually terminated with
break).
.. impl-detail:: >
xrange is intended to be simple and fast. Implementations may
impose restrictions to achieve this. The C implementation of Python
restricts all arguments to native C longs ("short" Python integers), and
also requires that the number of elements fit in a native C long. If a
larger range is needed, an alternate version can be crafted using the
itertools (|py2stdlib-itertools|) module: ``islice(count(start, step),
(stop-start+step-1)//step)``.
<
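The itertools expression from the implementation detail above can be wrapped in a helper. This is only a sketch: ``lazy_range`` is a hypothetical name, it handles a positive {step} only, and the clamp to zero is an addition so that an empty range does not pass a negative count to islice.

```python
from itertools import count, islice


def lazy_range(start, stop, step=1):
    """Hypothetical xrange-like iterator for a positive step."""
    if step <= 0:
        raise ValueError('this sketch supports a positive step only')
    # Number of elements, as in the expression quoted above.
    length = max(0, (stop - start + step - 1) // step)
    return islice(count(start, step), length)


small = list(lazy_range(0, 10, 3))
big = 10 ** 20                        # larger than a native C long
large = list(lazy_range(big, big + 3))
nothing = list(lazy_range(5, 2))
```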
zip([iterable, ...])~
This function returns a list of tuples, where the {i}-th tuple contains the
{i}-th element from each of the argument sequences or iterables. The returned
list is truncated in length to the length of the shortest argument sequence.
When there are multiple arguments which are all of the same length, zip
is similar to map with an initial argument of ``None``. With a single
sequence argument, it returns a list of 1-tuples. With no arguments, it returns
an empty list.
The left-to-right evaluation order of the iterables is guaranteed. This
makes possible an idiom for clustering a data series into n-length groups
using ``zip(*[iter(s)]*n)``.
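A sketch of the clustering idiom: the same iterator object is passed {n} times, so each tuple consumes {n} consecutive items. The ``list()`` call is only there so the result is a list whether or not zip already returns one; the data is arbitrary.

```python
s = [1, 2, 3, 4, 5, 6, 7]
n = 3

# [iter(s)] * n is a list holding the SAME iterator n times.
groups = list(zip(*[iter(s)] * n))
```

Items that do not fill a complete group (the ``7`` here) are dropped, because the result is truncated to the shortest argument.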
zip in conjunction with the ``*`` operator can be used to unzip a
list:: >
>>> x = [1, 2, 3]
>>> y = [4, 5, 6]
>>> zipped = zip(x, y)
>>> zipped
[(1, 4), (2, 5), (3, 6)]
>>> x2, y2 = zip(*zipped)
>>> x == list(x2) and y == list(y2)
True
<
.. versionadded:: 2.0
.. versionchanged:: 2.4
Formerly, zip required at least one argument and ``zip()`` raised a
TypeError instead of returning an empty list.
__import__(name[, globals[, locals[, fromlist[, level]]]])~
.. index::
statement: import
module: imp
.. note:: >
This is an advanced function that is not needed in everyday Python
programming.
<
This function is invoked by the import statement. It can be
replaced (by importing the __builtin__ (|py2stdlib-builtin|) module and assigning to
``__builtin__.__import__``) in order to change semantics of the
import statement, but nowadays it is usually simpler to use import
hooks (see PEP 302). Direct use of __import__ is rare, except in
cases where you want to import a module whose name is only known at runtime.
The function imports the module {name}, potentially using the given {globals}
and {locals} to determine how to interpret the name in a package context.
The {fromlist} gives the names of objects or submodules that should be
imported from the module given by {name}. The standard implementation does
not use its {locals} argument at all, and uses its {globals} only to
determine the package context of the import statement.
{level} specifies whether to use absolute or relative imports. The default
is ``-1`` which indicates both absolute and relative imports will be
attempted. ``0`` means only perform absolute imports. Positive values for
{level} indicate the number of parent directories to search relative to the
directory of the module calling __import__.
When the {name} variable is of the form ``package.module``, normally, the
top-level package (the name up till the first dot) is returned, {not} the
module named by {name}. However, when a non-empty {fromlist} argument is
given, the module named by {name} is returned.
For example, the statement ``import spam`` results in bytecode resembling the
following code:: >
spam = __import__('spam', globals(), locals(), [], -1)
<
The statement ``import spam.ham`` results in this call:: >
    spam = __import__('spam.ham', globals(), locals(), [], -1)
<
Note how __import__ returns the toplevel module here because this is
the object that is bound to a name by the import statement.
On the other hand, the statement ``from spam.ham import eggs, sausage as
saus`` results in :: >
_temp = __import__('spam.ham', globals(), locals(), ['eggs', 'sausage'], -1)
eggs = _temp.eggs
saus = _temp.sausage
<
Here, the ``spam.ham`` module is returned from __import__. From this
object, the names to import are retrieved and assigned to their respective
names.
If you simply want to import a module (potentially within a package) by name,
you can call __import__ and then look it up in sys.modules:: >
>>> import sys
>>> name = 'foo.bar.baz'
>>> __import__(name)
<module 'foo' from ...>
>>> baz = sys.modules[name]
>>> baz
<module 'foo.bar.baz' from ...>
<
.. versionchanged:: 2.5
The level parameter was added.
.. versionchanged:: 2.5
Keyword support for parameters was added.
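The difference between the return values can be sketched with a package from the standard library. ``xml.dom.minidom`` is chosen here only because it is a nested module that ships with Python.

```python
import sys

name = 'xml.dom.minidom'

# Without a fromlist, the top-level package is returned.
top = __import__(name)

# The module named by name is then available in sys.modules.
leaf = sys.modules[name]

# With a non-empty fromlist, the named module itself is returned.
direct = __import__(name, fromlist=['parseString'])
```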
.. ---------------------------------------------------------------------------
Non-essential Built-in Functions
================================
There are several built-in functions that are no longer essential to learn, know
or use in modern Python programming. They have been kept here to maintain
backwards compatibility with programs written for older versions of Python.
Python programmers, trainers, students and book writers should feel free to
bypass these functions without concerns about missing something important.
apply(function, args[, keywords])~
The {function} argument must be a callable object (a user-defined or built-in
function or method, or a class object) and the {args} argument must be a
sequence. The {function} is called with {args} as the argument list; the number
of arguments is the length of the tuple. If the optional {keywords} argument is
present, it must be a dictionary whose keys are strings. It specifies keyword
arguments to be added to the end of the argument list. Calling apply is
different from just calling ``function(args)``, since in that case there is
always exactly one argument. The use of apply is equivalent to
``function(*args, **keywords)``.
.. deprecated:: 2.3
Use the extended call syntax with ``*args`` and ``**keywords`` instead.
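A sketch of the extended call syntax that replaces apply. ``describe`` is a hypothetical function used only for illustration.

```python
def describe(name, count, sep=', ', suffix=''):
    """Hypothetical function taking positional and keyword arguments."""
    return name + sep + str(count) + suffix


args = ('spam', 3)
keywords = {'sep': ': ', 'suffix': '!'}

# Unpack the sequence as positional and the dictionary as keyword arguments.
result = describe(*args, **keywords)

# Passing the tuple itself would supply exactly one argument instead.
try:
    describe(args)
except TypeError:
    single_argument_failed = True
else:
    single_argument_failed = False
```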
buffer(object[, offset[, size]])~
The {object} argument must be an object that supports the buffer call interface
(such as strings, arrays, and buffers). A new buffer object will be created
which references the {object} argument. The buffer object will be a slice from
the beginning of {object} (or from the specified {offset}). The slice will
extend to the end of {object} (or will have a length given by the {size}
argument).
coerce(x, y)~
Return a tuple consisting of the two numeric arguments converted to a common
type, using the same rules as used by arithmetic operations. If coercion is not
possible, raise TypeError.
intern(string)~
Enter {string} in the table of "interned" strings and return the interned string
-- which is {string} itself or a copy. Interning strings is useful to gain a
little performance on dictionary lookup -- if the keys in a dictionary are
interned, and the lookup key is interned, the key comparisons (after hashing)
can be done by a pointer compare instead of a string compare. Normally, the
names used in Python programs are automatically interned, and the dictionaries
used to hold module, class or instance attributes have interned keys.
.. versionchanged:: 2.3
Interned strings are not immortal (like they used to be in Python 2.2 and
before); you must keep a reference to the return value of intern around
to benefit from it.
.. rubric:: Footnotes
.. [#] It is used relatively rarely so does not warrant being made into a statement.
.. [#] Specifying a buffer size currently has no effect on systems that don't have
setvbuf. The interface to specify the buffer size is not done using a
method that calls setvbuf, because that may dump core when called after
any I/O has been performed, and there's no reliable way to determine whether
this is the case.
.. [#] In the current implementation, local variable bindings cannot normally be
affected this way, but variables retrieved from other scopes (such as modules)
can be. This may change.
*py2stdlib-builtin:Constants*
Constants~
Built-in Constants
==================
A small number of constants live in the built-in namespace. They are:
False~
The false value of the bool type.
.. versionadded:: 2.3
True~
The true value of the bool type.
.. versionadded:: 2.3
None~
The sole value of types.NoneType. ``None`` is frequently used to
represent the absence of a value, as when default arguments are not passed to a
function.
.. versionchanged:: 2.4
Assignments to ``None`` are illegal and raise a SyntaxError.
NotImplemented~
Special value which can be returned by the "rich comparison" special methods
(__eq__, __lt__, and friends), to indicate that the comparison
is not implemented with respect to the other type.
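A sketch of how a rich comparison method might return NotImplemented for operands it does not understand. ``Meters`` is a hypothetical class. __ne__ is defined explicitly because it is not derived from __eq__ automatically in Python 2.

```python
class Meters(object):
    """Hypothetical value type that only compares with its own kind."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Meters):
            return NotImplemented     # let the interpreter try other options
        return self.value == other.value

    def __ne__(self, other):
        result = self.__eq__(other)
        if result is NotImplemented:
            return result
        return not result


same = Meters(2) == Meters(2)
raw = Meters(2).__eq__(2)             # the special value itself
mixed = Meters(2) == 2                # the interpreter falls back to False
```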
Ellipsis~
Special value used in conjunction with extended slicing syntax.
.. XXX Someone who understands extended slicing should fill in here.
__debug__~
This constant is true if Python was not started with an -O option.
Assignments to __debug__ are illegal and raise a SyntaxError.
See also the assert statement.
Constants added by the site (|py2stdlib-site|) module
-----------------------------------------
The site (|py2stdlib-site|) module (which is imported automatically during startup, except
if the -S command-line option is given) adds several constants to the
built-in namespace. They are useful for the interactive interpreter shell and
should not be used in programs.
quit([code=None])~
exit([code=None])
Objects that when printed, print a message like "Use quit() or Ctrl-D
(i.e. EOF) to exit", and when called, raise SystemExit with the
specified exit code.
copyright~
license
credits
Objects that when printed, print a message like "Type license() to see the
full license text", and when called, display the corresponding text in a
pager-like fashion (one screen at a time).
*py2stdlib-builtin:Types*
Types~
.. XXX: reference/datamodel and this have quite a few overlaps!
Built-in Types
**************
The following sections describe the standard types that are built into the
interpreter.
.. note::
Historically (until release 2.2), Python's built-in types have differed from
user-defined types because it was not possible to use the built-in types as the
basis for object-oriented inheritance. This limitation no longer
exists.
.. index:: pair: built-in; types
The principal built-in types are numerics, sequences, mappings, files, classes,
instances and exceptions.
.. index:: statement: print
Some operations are supported by several object types; in particular,
practically all objects can be compared, tested for truth value, and converted
to a string (with the repr (|py2stdlib-repr|) function or the slightly different
str function). The latter function is implicitly used when an object is
written by the print function.
Truth Value Testing
===================
.. index::
statement: if
statement: while
pair: truth; value
pair: Boolean; operations
single: false
Any object can be tested for truth value, for use in an if or
while condition or as operand of the Boolean operations below. The
following values are considered false:
.. index:: single: None (Built-in object)
* ``None``
.. index:: single: False (Built-in object)
* ``False``
* zero of any numeric type, for example, ``0``, ``0L``, ``0.0``, ``0j``.
* any empty sequence, for example, ``''``, ``()``, ``[]``.
* any empty mapping, for example, ``{}``.
* instances of user-defined classes, if the class defines a __nonzero__
or __len__ method, when that method returns the integer zero or
bool value ``False``. [#]_
.. index:: single: true
All other values are considered true --- so objects of many types are always
true.
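The rules above can be checked directly. The following is a minimal sketch; the class name ``Basket`` is hypothetical and only illustrates a user-defined class whose truth value follows its __len__ method.

```python
class Basket(object):
    """Hypothetical container: it is false exactly when it is empty."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        # With no __nonzero__ defined, truth testing falls back on __len__.
        return len(self.items)


false_values = [None, False, 0, 0.0, 0j, '', (), [], {}, Basket([])]
true_values = [1, -1, 'a', (0,), [None], {'k': 0}, Basket([0])]

assert not any(false_values)   # every listed value tests false
assert all(true_values)        # everything else tests true
```

Note that ``(0,)`` and ``[None]`` are true: a non-empty container is true even when its items are false.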
Operations and built-in functions that have a Boolean result always return ``0``
or ``False`` for false and ``1`` or ``True`` for true, unless otherwise stated.
(Important exception: the Boolean operations ``or`` and ``and`` always return
one of their operands.)
Boolean Operations --- and, or, not
===================================
These are the Boolean operations, ordered by ascending priority:
+-------------+---------------------------------+-------+
| Operation | Result | Notes |
+=============+=================================+=======+
| ``x or y`` | if {x} is false, then {y}, else | \(1) |
| | {x} | |
+-------------+---------------------------------+-------+
| ``x and y`` | if {x} is false, then {x}, else | \(2) |
| | {y} | |
+-------------+---------------------------------+-------+
| ``not x`` | if {x} is false, then ``True``, | \(3) |
| | else ``False`` | |
+-------------+---------------------------------+-------+
Notes:
(1)
This is a short-circuit operator, so it only evaluates the second
argument if the first one is false.
(2)
This is a short-circuit operator, so it only evaluates the second
argument if the first one is true.
(3)
``not`` has a lower priority than non-Boolean operators, so ``not a == b`` is
interpreted as ``not (a == b)``, and ``a == not b`` is a syntax error.
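A short sketch of the behaviour in the table and notes above. The helper ``probe`` is hypothetical; it only records which operands were actually evaluated.

```python
calls = []


def probe(value):
    """Record that the operand was evaluated, then return it unchanged."""
    calls.append(value)
    return value


# ``or`` returns its first operand if that is true, otherwise its second.
assert (probe(0) or probe('fallback')) == 'fallback'
# ``and`` returns its first operand if that is false, without touching the second.
assert (probe([]) and probe('never evaluated')) == []
# Only three operands were evaluated; the right side of ``and`` was skipped.
assert calls == [0, 'fallback', []]
# ``not`` has lower priority than ``==``: this is ``not (1 == 2)``.
assert (not 1 == 2) is True
```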
Comparisons
===========
Comparison operations are supported by all objects. They all have the same
priority (which is higher than that of the Boolean operations). Comparisons can
be chained arbitrarily; for example, ``x < y <= z`` is equivalent to ``x < y and
y <= z``, except that {y} is evaluated only once (but in both cases {z} is not
evaluated at all when ``x < y`` is found to be false).
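The evaluation order of a chained comparison can be observed with a small sketch. The functions ``y`` and ``z`` are hypothetical stand-ins for the operands named in the text; each one logs its own call.

```python
evaluated = []


def y():
    evaluated.append('y')
    return 5


def z():
    evaluated.append('z')
    return 10


assert (1 < y() <= z()) is True
assert evaluated == ['y', 'z']    # y() ran only once although it is used twice

del evaluated[:]
assert (9 < y() <= z()) is False
assert evaluated == ['y']         # 9 < 5 is false, so z() never ran
```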
This table summarizes the comparison operations:
+------------+-------------------------+-------+
| Operation | Meaning | Notes |
+============+=========================+=======+
| ``<`` | strictly less than | |
+------------+-------------------------+-------+
| ``<=`` | less than or equal | |
+------------+-------------------------+-------+
| ``>`` | strictly greater than | |
+------------+-------------------------+-------+
| ``>=`` | greater than or equal | |
+------------+-------------------------+-------+
| ``==`` | equal | |
+------------+-------------------------+-------+
| ``!=`` | not equal | \(1) |
+------------+-------------------------+-------+
| ``is`` | object identity | |
+------------+-------------------------+-------+
| ``is not`` | negated object identity | |
+------------+-------------------------+-------+
Notes:
(1)
``!=`` can also be written ``<>``, but this is an obsolete usage
kept for backwards compatibility only. New code should always use
``!=``.
Objects of different types, except different numeric types and different string
types, never compare equal; such objects are ordered consistently but
arbitrarily (so that sorting a heterogeneous array yields a consistent result).
Furthermore, some types (for example, file objects) support only a degenerate
notion of comparison where any two objects of that type are unequal. Again,
such objects are ordered arbitrarily but consistently. The ``<``, ``<=``, ``>``
and ``>=`` operators will raise a TypeError exception when any operand is
a complex number.
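A brief sketch of the equality rules above. Ordering between unrelated types (such as ``1 < 'a'``) is left out on purpose, because its result is arbitrary.

```python
assert 1 == 1.0 == (1 + 0j)    # different numeric types can compare equal
assert 'a' == u'a'             # so can an 8-bit string and a Unicode string
assert 1 != '1'                # a number never equals a string
assert (1, 2) != [1, 2]        # a tuple never equals a list

# Ordering comparisons are not defined for complex numbers.
try:
    1j < 2j
except TypeError:
    complex_is_ordered = False
else:
    complex_is_ordered = True

assert not complex_is_ordered
```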
Instances of a class normally compare as non-equal unless the class defines the
__cmp__ method. Refer to customization for information on the
use of this method to effect object comparisons.
.. impl-detail::
Objects of different types except numbers are ordered by their type names;
objects of the same types that don't support proper comparison are ordered by
their address.
Two more operations with the same syntactic priority, ``in`` and ``not in``, are
supported only by sequence types (below).
Numeric Types --- int, float, long, complex
===========================================
There are four distinct numeric types: plain integers, long
integers, floating point numbers, and complex numbers. In
addition, Booleans are a subtype of plain integers. Plain integers (also just
called integers) are implemented using long in C, which gives
them at least 32 bits of precision (``sys.maxint`` is always set to the maximum
plain integer value for the current platform, the minimum value is
``-sys.maxint - 1``). Long integers have unlimited precision. Floating point
numbers are implemented using double in C. All bets on their precision
are off unless you happen to know the machine you are working with.
Complex numbers have a real and imaginary part, which are each implemented using
double in C. To extract these parts from a complex number {z}, use
``z.real`` and ``z.imag``.
Numbers are created by numeric literals or as the result of built-in functions
and operators. Unadorned integer literals (including binary, hex, and octal
numbers) yield plain integers unless the value they denote is too large to be
represented as a plain integer, in which case they yield a long integer.
Integer literals with an ``'L'`` or ``'l'`` suffix yield long integers (``'L'``
is preferred because ``1l`` looks too much like eleven!). Numeric literals
containing a decimal point or an exponent sign yield floating point numbers.
Appending ``'j'`` or ``'J'`` to a numeric literal yields a complex number with a
zero real part. A complex numeric literal is the sum of a real and an imaginary
part.
Python fully supports mixed arithmetic: when a binary arithmetic operator has
operands of different numeric types, the operand with the "narrower" type is
widened to that of the other, where plain integer is narrower than long integer
is narrower than floating point is narrower than complex. Comparisons between
numbers of mixed type use the same rule. [#]_ The constructors int,
long, float, and complex can be used to produce numbers
of a specific type.
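The widening rule can be seen by inspecting result types. This is a minimal sketch using only built-in types.

```python
assert type(1 + 2.0) is float       # integer widened to floating point
assert type(1.5 + 2j) is complex    # floating point widened to complex
assert type(True + 1) is int        # Booleans behave as plain integers
assert 2 < 2.5 < 3                  # mixed comparisons use the same rule

# The constructors produce a number of a specific type.
assert (int(3.9), float(2), complex(1, 2)) == (3, 2.0, 1 + 2j)
```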
All built-in numeric types support the following operations. See
power and later sections for the operators' priorities.
+--------------------+---------------------------------+--------+
| Operation | Result | Notes |
+====================+=================================+========+
| ``x + y`` | sum of {x} and {y} | |
+--------------------+---------------------------------+--------+
| ``x - y`` | difference of {x} and {y} | |
+--------------------+---------------------------------+--------+
| ``x * y``          | product of {x} and {y}          |        |
+--------------------+---------------------------------+--------+
| ``x / y`` | quotient of {x} and {y} | \(1) |
+--------------------+---------------------------------+--------+
| ``x // y`` | (floored) quotient of {x} and | (4)(5) |
| | {y} | |
+--------------------+---------------------------------+--------+
| ``x % y`` | remainder of ``x / y`` | \(4) |
+--------------------+---------------------------------+--------+
| ``-x`` | {x} negated | |
+--------------------+---------------------------------+--------+
| ``+x`` | {x} unchanged | |
+--------------------+---------------------------------+--------+
| ``abs(x)`` | absolute value or magnitude of | \(3) |
| | {x} | |
+--------------------+---------------------------------+--------+
| ``int(x)`` | {x} converted to integer | \(2) |
+--------------------+---------------------------------+--------+
| ``long(x)`` | {x} converted to long integer | \(2) |
+--------------------+---------------------------------+--------+
| ``float(x)`` | {x} converted to floating point | \(6) |
+--------------------+---------------------------------+--------+
| ``complex(re,im)`` | a complex number with real part | |
| | {re}, imaginary part {im}. | |
| | {im} defaults to zero. | |
+--------------------+---------------------------------+--------+
| ``c.conjugate()`` | conjugate of the complex number | |
| | {c}. (Identity on real numbers) | |
+--------------------+---------------------------------+--------+
| ``divmod(x, y)`` | the pair ``(x // y, x % y)`` | (3)(4) |
+--------------------+---------------------------------+--------+
| ``pow(x, y)`` | {x} to the power {y} | (3)(7) |
+--------------------+---------------------------------+--------+
| ``x ** y``         | {x} to the power {y}            | \(7)   |
+--------------------+---------------------------------+--------+
Notes:
(1)
For (plain or long) integer division, the result is an integer. The result is
always rounded towards minus infinity: 1/2 is 0, (-1)/2 is -1, 1/(-2) is -1, and
(-1)/(-2) is 0. Note that the result is a long integer if either operand is a
long integer, regardless of the numeric value.
(2)
Conversion from floats using int or long truncates toward
zero like the related function, math.trunc. Use the function
math.floor to round downward and math.ceil to round
upward.
(3)
See built-in-funcs for a full description.
(4)
.. deprecated:: 2.3
The floor division operator, the modulo operator, and the divmod function
are deprecated for complex numbers. Instead, convert to float using abs if
appropriate.
(5)
Also referred to as integer division. The resultant value is a whole integer,
though the result's type is not necessarily int.
(6)
float also accepts the strings "nan" and "inf" with an optional prefix "+"
or "-" for Not a Number (NaN) and positive or negative infinity.
.. versionadded:: 2.6
(7)
Python defines ``pow(0, 0)`` and ``0 ** 0`` to be ``1``, as is common for
programming languages.
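Several of the notes above can be verified with a few assertions. In this sketch ``//`` is used instead of ``/`` so that the result does not depend on whether ``from __future__ import division`` is in effect; for integers it gives the floored quotient described in note (1).

```python
import math

# Floored quotients round toward minus infinity.
assert (1 // 2, (-1) // 2, 1 // (-2), (-1) // (-2)) == (0, -1, -1, 0)

# divmod(x, y) is the pair (x // y, x % y).
assert divmod(-7, 2) == (-7 // 2, -7 % 2) == (-4, 1)

# int() truncates toward zero; math.floor and math.ceil round down and up.
assert int(-2.7) == -2 and int(2.7) == 2
assert math.floor(-2.7) == -3 and math.ceil(-2.7) == -2

# float() accepts "nan" and "inf" with an optional sign.
assert math.isnan(float('nan')) and float('-inf') < 0

assert pow(0, 0) == 1 and 0 ** 0 == 1
```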
All numbers.Real types (int, long, and
float) also include the following operations:
+--------------------+------------------------------------+--------+
| Operation | Result | Notes |
+====================+====================================+========+
| ``math.trunc(x)`` | {x} truncated to Integral | |
+--------------------+------------------------------------+--------+
| ``round(x[, n])`` | {x} rounded to n digits, | |
|                    | rounding half away from zero. If   |        |
|                    | n is omitted, it defaults to 0.    |        |
+--------------------+------------------------------------+--------+
| ``math.floor(x)`` | the greatest integral float <= {x} | |
+--------------------+------------------------------------+--------+
| ``math.ceil(x)`` | the least integral float >= {x} | |
+--------------------+------------------------------------+--------+
Bit-string Operations on Integer Types
--------------------------------------
Plain and long integer types support additional operations that make sense only
for bit-strings. Negative numbers are treated as their 2's complement value
(for long integers, this assumes a sufficiently large number of bits that no
overflow occurs during the operation).
The priorities of the binary bitwise operations are all lower than the numeric
operations and higher than the comparisons; the unary operation ``~`` has the
same priority as the other unary numeric operations (``+`` and ``-``).
This table lists the bit-string operations sorted in ascending priority:
+------------+--------------------------------+----------+
| Operation | Result | Notes |
+============+================================+==========+
| ``x | y`` | bitwise or of {x} and | |
| | {y} | |
+------------+--------------------------------+----------+
| ``x ^ y`` | bitwise exclusive or of | |
| | {x} and {y} | |
+------------+--------------------------------+----------+
| ``x & y`` | bitwise and of {x} and | |
| | {y} | |
+------------+--------------------------------+----------+
| ``x << n`` | {x} shifted left by {n} bits | (1)(2) |
+------------+--------------------------------+----------+
| ``x >> n`` | {x} shifted right by {n} bits | (1)(3) |
+------------+--------------------------------+----------+
| ``~x`` | the bits of {x} inverted | |
+------------+--------------------------------+----------+
Notes:
(1)
Negative shift counts are illegal and cause a ValueError to be raised.
(2)
A left shift by {n} bits is equivalent to multiplication by ``pow(2, n)``. A
long integer is returned if the result exceeds the range of plain integers.
(3)
A right shift by {n} bits is equivalent to division by ``pow(2, n)``.
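The table and notes can be illustrated with small operands. This is a sketch; the values of ``x`` and ``y`` are arbitrary.

```python
x, y = 0b1100, 0b1010

# Bitwise operators bind more tightly than comparisons, so no parentheses
# are needed around the left-hand sides.
assert x | y == 0b1110
assert x ^ y == 0b0110
assert x & y == 0b1000

assert x << 2 == x * pow(2, 2) == 48     # note (2)
assert x >> 2 == x // pow(2, 2) == 3     # note (3)
assert -5 >> 1 == -3                     # right shift floors, like -5 // 2
assert ~x == -x - 1 == -13               # 2's complement inversion

# Note (1): a negative shift count raises ValueError.
try:
    1 << -1
except ValueError:
    negative_shift_rejected = True
else:
    negative_shift_rejected = False
assert negative_shift_rejected
```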
Additional Methods on Integer Types
-----------------------------------
int.bit_length()~
long.bit_length()~
Return the number of bits necessary to represent an integer in binary,
excluding the sign and leading zeros:: >
>>> n = -37
>>> bin(n)
'-0b100101'
>>> n.bit_length()
6
<
More precisely, if ``x`` is nonzero, then ``x.bit_length()`` is the
unique positive integer ``k`` such that ``2**(k-1) <= abs(x) < 2**k``.
Equivalently, when ``abs(x)`` is small enough to have a correctly
rounded logarithm, then ``k = 1 + int(log(abs(x), 2))``.
If ``x`` is zero, then ``x.bit_length()`` returns ``0``.
Equivalent to:: >
def bit_length(self):
s = bin(self) # binary representation: bin(-37) --> '-0b100101'
s = s.lstrip('-0b') # remove leading zeros and minus sign
return len(s) # len('100101') --> 6
<
.. versionadded:: 2.7
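The defining inequality for ``bit_length`` can be checked over a handful of values. This is a sketch; the sample values are arbitrary.

```python
samples = (1, 2, 3, 37, -37, 255, 256, 2 ** 40)

for x in samples:
    k = x.bit_length()
    # k is the unique positive integer with 2**(k-1) <= abs(x) < 2**k.
    assert 2 ** (k - 1) <= abs(x) < 2 ** k
    # It also equals the number of digits bin() prints after the '0b' prefix.
    assert k == len(bin(abs(x))) - 2

assert (0).bit_length() == 0
```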
Additional Methods on Float
---------------------------
The float type has some additional methods.
float.as_integer_ratio()~
Return a pair of integers whose ratio is exactly equal to the
original float and with a positive denominator. Raises
OverflowError on infinities and a ValueError on
NaNs.
.. versionadded:: 2.6
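A minimal sketch of ``as_integer_ratio``. The fractions module is used here only to compare the exact ratio with the decimal value ``1/10``.

```python
from fractions import Fraction

numerator, denominator = (0.75).as_integer_ratio()
assert (numerator, denominator) == (3, 4)

# 0.1 has no exact binary representation, so its exact ratio is not 1/10.
n, d = (0.1).as_integer_ratio()
assert d > 0 and float(n) / d == 0.1
assert Fraction(n, d) != Fraction(1, 10)

# Infinities raise OverflowError and NaNs raise ValueError.
for special, expected_error in ((float('inf'), OverflowError),
                                (float('nan'), ValueError)):
    try:
        special.as_integer_ratio()
    except expected_error:
        pass
    else:
        raise AssertionError('expected %s' % expected_error.__name__)
```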
Two methods support conversion to
and from hexadecimal strings. Since Python's floats are stored
internally as binary numbers, converting a float to or from a
{decimal} string usually involves a small rounding error. In
contrast, hexadecimal strings allow exact representation and
specification of floating-point numbers. This can be useful when
debugging, and in numerical work.
float.hex()~
Return a representation of a floating-point number as a hexadecimal
string. For finite floating-point numbers, this representation
will always include a leading ``0x`` and a trailing ``p`` and
exponent.
.. versionadded:: 2.6
float.fromhex(s)~
Class method to return the float represented by a hexadecimal
string {s}. The string {s} may have leading and trailing
whitespace.
.. versionadded:: 2.6
Note that float.hex is an instance method, while
float.fromhex is a class method.
A hexadecimal string takes the form:: >
[sign] ['0x'] integer ['.' fraction] ['p' exponent]
<
where the optional ``sign`` may be either ``+`` or ``-``, ``integer``
and ``fraction`` are strings of hexadecimal digits, and ``exponent``
is a decimal integer with an optional leading sign. Case is not
significant, and there must be at least one hexadecimal digit in
either the integer or the fraction. This syntax is similar to the
syntax specified in section 6.4.4.2 of the C99 standard, and also to
the syntax used in Java 1.5 onwards. In particular, the output of
float.hex is usable as a hexadecimal floating-point literal in
C or Java code, and hexadecimal strings produced by C's ``%a`` format
character or Java's ``Double.toHexString`` are accepted by
float.fromhex.
Note that the exponent is written in decimal rather than hexadecimal,
and that it gives the power of 2 by which to multiply the coefficient.
For example, the hexadecimal string ``0x3.a7p10`` represents the
floating-point number ``(3 + 10./16 + 7./16**2) * 2.0**10``, or
``3740.0``:: >
>>> float.fromhex('0x3.a7p10')
3740.0
<
Applying the reverse conversion to ``3740.0`` gives a different
hexadecimal string representing the same number:: >
>>> float.hex(3740.0)
'0x1.d380000000000p+11'
<
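The two conversions are inverses of each other, which a short sketch can confirm using the values from the text above.

```python
x = 3740.0
s = x.hex()                       # instance method
assert s == '0x1.d380000000000p+11'
assert float.fromhex(s) == x      # class method; the round trip is exact

# Two different hexadecimal strings can denote the same number.
assert float.fromhex('0x3.a7p10') == (3 + 10. / 16 + 7. / 16 ** 2) * 2.0 ** 10
assert float.fromhex('0x3.a7p10') == x

# Unlike a decimal string, the hexadecimal form loses nothing for 0.1.
assert float.fromhex((0.1).hex()) == 0.1
```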
Iterator Types
==============
.. versionadded:: 2.2
Python supports a concept of iteration over containers. This is implemented
using two distinct methods; these are used to allow user-defined classes to
support iteration. Sequences, described below in more detail, always support
the iteration methods.
One method needs to be defined for container objects to provide iteration
support:
container.__iter__()~
Return an iterator object. The object is required to support the iterator
protocol described below. If a container supports different types of
iteration, additional methods can be provided to specifically request
iterators for those iteration types. (An example of an object supporting
multiple forms of iteration would be a tree structure which supports both
breadth-first and depth-first traversal.) This method corresponds to the
tp_iter slot of the type structure for Python objects in the Python/C
API.
The iterator objects themselves are required to support the following two
methods, which together form the iterator protocol:
iterator.__iter__()~
Return the iterator object itself. This is required to allow both containers
and iterators to be used with the for and in statements.
This method corresponds to the tp_iter slot of the type structure for
Python objects in the Python/C API.
iterator.next()~
Return the next item from the container. If there are no further items, raise
the StopIteration exception. This method corresponds to the
tp_iternext slot of the type structure for Python objects in the
Python/C API.
Python defines several iterator objects to support iteration over general and
specific sequence types, dictionaries, and other more specialized forms. The
specific types are not important beyond their implementation of the iterator
protocol.
The intention of the protocol is that once an iterator's next method
raises StopIteration, it will continue to do so on subsequent calls.
Implementations that do not obey this property are deemed broken. (This
constraint was added in Python 2.3; in Python 2.2, various iterators are broken
according to this rule.)
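The protocol can be sketched with a small hand-written iterator. The class ``Countdown`` is hypothetical. The ``__next__`` alias is an assumption added so that the sketch also runs on Python 3, where the method has that name; it is not part of the protocol described here.

```python
class Countdown(object):
    """Hypothetical iterator yielding start, start - 1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself, so it works in a ``for`` statement.
        return self

    def next(self):
        if self.current <= 0:
            # Once exhausted, every later call raises StopIteration again.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    __next__ = next   # assumption: Python 3 spelling of the same method


it = Countdown(3)
assert iter(it) is it
assert list(it) == [3, 2, 1]
assert list(it) == []        # still exhausted on a second pass
```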
Generator Types
---------------
Python's generators provide a convenient way to implement the iterator
protocol. If a container object's __iter__ method is implemented as a
generator, it will automatically return an iterator object (technically, a
generator object) supplying the __iter__ and next methods. More
information about generators can be found in the documentation for the
yield expression.
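As a sketch of this approach, the hypothetical container ``Deck`` below implements __iter__ as a generator, so it needs no separate iterator class.

```python
class Deck(object):
    """Hypothetical container whose __iter__ method is a generator."""

    def __init__(self, cards):
        self.cards = list(cards)

    def __iter__(self):
        for card in self.cards:
            yield card


deck = Deck('AKQ')
it = iter(deck)
assert iter(it) is it                    # the generator object is an iterator
assert list(deck) == ['A', 'K', 'Q']
assert list(deck) == ['A', 'K', 'Q']     # each __iter__ call starts afresh
```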
Sequence Types --- str, unicode, list, tuple, buffer, xrange
============================================================
There are six sequence types: strings, Unicode strings, lists, tuples, buffers,
and xrange objects.
For other containers see the built-in dict and set classes,
and the collections (|py2stdlib-collections|) module.
String literals are written in single or double quotes: ``'xyzzy'``,
``"frobozz"``. See strings for more about string literals.
Unicode strings are much like strings, but are specified in the syntax
using a preceding ``'u'`` character: ``u'abc'``, ``u"def"``. In addition
to the functionality described here, there are also string-specific
methods described in the string-methods section. Lists are
constructed with square brackets, separating items with commas: ``[a, b, c]``.
Tuples are constructed by the comma operator (not within square
brackets), with or without enclosing parentheses, but an empty tuple
must have the enclosing parentheses, such as ``a, b, c`` or ``()``. A
single item tuple must have a trailing comma, such as ``(d,)``.
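The role of the comma in tuple construction is easy to get wrong, so here is a short sketch with arbitrary sample values.

```python
empty = ()
single = 'd',              # the trailing comma makes the tuple
triple = 'a', 'b', 'c'     # enclosing parentheses are optional

assert (len(empty), len(single), len(triple)) == (0, 1, 3)
assert single == ('d',)
assert ('d') == 'd'        # parentheses without a comma do not make a tuple
```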
Buffer objects are not directly supported by Python syntax, but can be created
by calling the built-in function buffer. They don't support
concatenation or repetition.
Objects of type xrange are similar to buffers in that there is no specific syntax to
create them, but they are created using the xrange function. They don't
support slicing, concatenation or repetition, and using ``in``, ``not in``,
min or max on them is inefficient.
Most sequence types support the following operations. The ``in`` and ``not in``
operations have the same priorities as the comparison operations. The ``+`` and
``*`` operations have the same priority as the corresponding numeric operations.
[#]_ Additional methods are provided for mutable sequence types.
This table lists the sequence operations sorted in ascending priority
(operations in the same box have the same priority). In the table, {s} and {t}
are sequences of the same type; {n}, {i} and {j} are integers:
+------------------+--------------------------------+----------+
| Operation | Result | Notes |
+==================+================================+==========+
| ``x in s`` | ``True`` if an item of {s} is | \(1) |
| | equal to {x}, else ``False`` | |
+------------------+--------------------------------+----------+
| ``x not in s`` | ``False`` if an item of {s} is | \(1) |
| | equal to {x}, else ``True`` | |
+------------------+--------------------------------+----------+
| ``s + t`` | the concatenation of {s} and | \(6) |
| | {t} | |
+------------------+--------------------------------+----------+
| ``s * n, n * s`` | {n} shallow copies of {s}      | \(2)     |
| | concatenated | |
+------------------+--------------------------------+----------+
| ``s[i]`` | {i}'th item of {s}, origin 0 | \(3) |
+------------------+--------------------------------+----------+
| ``s[i:j]`` | slice of {s} from {i} to {j} | (3)(4) |
+------------------+--------------------------------+----------+
| ``s[i:j:k]`` | slice of {s} from {i} to {j} | (3)(5) |
| | with step {k} | |
+------------------+--------------------------------+----------+
| ``len(s)`` | length of {s} | |
+------------------+--------------------------------+----------+
| ``min(s)`` | smallest item of {s} | |
+------------------+--------------------------------+----------+
| ``max(s)`` | largest item of {s} | |
+------------------+--------------------------------+----------+
Sequence types also support comparisons. In particular, tuples and lists
are compared lexicographically by comparing corresponding
elements. This means that to compare equal, every element must compare
equal and the two sequences must be of the same type and have the same
length. (For full details see comparisons in the language
reference.)
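The operations in the table, and the comparison rule, can be sketched with two short lists. The names ``s`` and ``t`` follow the table; the values are arbitrary.

```python
s, t = [1, 2], [3]

assert 2 in s and 5 not in s
assert s + t == [1, 2, 3]
assert s * 2 == 2 * s == [1, 2, 1, 2]
assert s * -1 == []                       # negative counts are treated as 0
assert (len(s), min(s), max(s)) == (2, 1, 2)

# Comparison is lexicographic, element by element.
assert (1, 2, 3) < (1, 2, 4)
assert [1, 2] < [1, 2, 0]                 # a proper prefix compares smaller
assert [1, 2] == [1, 2] and [1, 2] != [1, 2, 0]
```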
Notes:
(1)
When {s} is a string or Unicode string object the ``in`` and ``not in``
operations act like a substring test. In Python versions before 2.3, {x} had to
be a string of length 1. In Python 2.3 and beyond, {x} may be a string of any
length.
(2)
Values of {n} less than ``0`` are treated as ``0`` (which yields an empty
sequence of the same type as {s}). Note also that the copies are shallow;
nested structures are not copied. This often haunts new Python programmers;
consider: >
>>> lists = [[]] * 3
>>> lists
[[], [], []]
>>> lists[0].append(3)
>>> lists
[[3], [3], [3]]
<
What has happened is that ``[[]]`` is a one-element list containing an empty
list, so all three elements of ``[[]] * 3`` are (pointers to) this single empty
list. Modifying any of the elements of ``lists`` modifies this single list.
You can create a list of different lists this way: >
>>> lists = [[] for i in range(3)]
>>> lists[0].append(3)
>>> lists[1].append(5)
>>> lists[2].append(7)
>>> lists
[[3], [5], [7]]
<
(3)
If {i} or {j} is negative, the index is relative to the end of the sequence:
``len(s) + i`` or ``len(s) + j`` is substituted. But note that ``-0`` is still
``0``.
(4)
The slice of {s} from {i} to {j} is defined as the sequence of items with index
{k} such that ``i <= k < j``. If {i} or {j} is greater than ``len(s)``, use
``len(s)``. If {i} is omitted or ``None``, use ``0``. If {j} is omitted or
``None``, use ``len(s)``. If {i} is greater than or equal to {j}, the slice is
empty.
(5)
The slice of {s} from {i} to {j} with step {k} is defined as the sequence of
items with index ``x = i + n*k`` such that ``0 <= n < (j-i)/k``. In other words,
the indices are ``i``, ``i+k``, ``i+2*k``, ``i+3*k`` and so on, stopping when
{j} is reached (but never including {j}). If {i} or {j} is greater than
``len(s)``, use ``len(s)``. If {i} or {j} are omitted or ``None``, they become
"end" values (which end depends on the sign of {k}). Note, {k} cannot be zero.
If {k} is ``None``, it is treated like ``1``.
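Notes (3) to (5) can be illustrated on a single string. This is a sketch; the sample string is arbitrary.

```python
s = 'abcdefgh'

assert s[-3] == s[len(s) - 3] == 'f'   # note (3): negative index
assert s[2:100] == 'cdefgh'            # note (4): j is clipped to len(s)
assert s[5:2] == ''                    # note (4): i >= j gives an empty slice
assert s[1:7:2] == 'bdf'               # note (5): indices 1, 3 and 5
assert s[::-1] == 'hgfedcba'           # omitted ends depend on the sign of k
assert s[::None] == s                  # a step of None is treated like 1

# A step of zero is rejected.
try:
    s[::0]
except ValueError:
    zero_step_rejected = True
else:
    zero_step_rejected = False
assert zero_step_rejected
```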
(6)
.. impl-detail::
If {s} and {t} are both strings, some Python implementations such as
CPython can usually perform an in-place optimization for assignments of
the form ``s = s + t`` or ``s += t``. When applicable, this optimization
makes quadratic run-time much less likely. This optimization is both
version and implementation dependent. For performance sensitive code, it
is preferable to use the str.join method which assures consistent
linear concatenation performance across versions and implementations.
.. versionchanged:: 2.4
Formerly, string concatenation never occurred in-place.
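The two ways of building a string mentioned in note (6) give the same result; only their cost can differ. A minimal sketch, with an arbitrary word list:

```python
words = ['spam', 'eggs', 'ham']

# Repeated concatenation: each step may copy the string built so far.
slow = ''
for word in words:
    slow += word

# str.join builds the result from all the pieces at once.
fast = ''.join(words)

assert slow == fast == 'spameggsham'
```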
String Methods
--------------
Below are listed the string methods which both 8-bit strings and
Unicode objects support.
In addition, Python's strings support the sequence type methods
described in the typesseq section. To output formatted strings
use template strings or the ``%`` operator described in the
string-formatting section. Also, see the re (|py2stdlib-re|) module for
string functions based on regular expressions.
str.capitalize()~
Return a copy of the string with only its first character capitalized.
For 8-bit strings, this method is locale-dependent.
str.center(width[, fillchar])~
Return the string centered in a string of length {width}. Padding is done
using the specified {fillchar} (default is a space).
.. versionchanged:: 2.4
Support for the {fillchar} argument.
str.count(sub[, start[, end]])~
Return the number of non-overlapping occurrences of substring {sub} in the
range [{start}, {end}]. Optional arguments {start} and {end} are
interpreted as in slice notation.
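A short sketch of ``count``, showing that matches do not overlap and that {start} and {end} behave like slice bounds:

```python
assert 'aaaa'.count('aa') == 2           # matches do not overlap
assert 'banana'.count('an') == 2
assert 'banana'.count('a', 2) == 2       # same as 'nana'.count('a')
assert 'banana'.count('a', 2, 4) == 1    # same as 'na'.count('a')
```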
str.decode([encoding[, errors]])~
Decodes the string using the codec registered for {encoding}. {encoding}
defaults to the default string encoding. {errors} may be given to set a
different error handling scheme. The default is ``'strict'``, meaning that
encoding errors raise UnicodeError. Other possible values are
``'ignore'``, ``'replace'`` and any other name registered via
codecs.register_error, see section codec-base-classes.
.. versionadded:: 2.2
.. versionchanged:: 2.3
Support for other error handling schemes added.
.. versionchanged:: 2.7
Support for keyword arguments added.
str.encode([encoding[, errors]])~
Return an encoded version of the string. Default encoding is the current
default string encoding. {errors} may be given to set a different error
handling scheme. The default for {errors} is ``'strict'``, meaning that
encoding errors raise a UnicodeError. Other possible values are
``'ignore'``, ``'replace'``, ``'xmlcharrefreplace'``, ``'backslashreplace'`` and
any other name registered via codecs.register_error, see section
codec-base-classes. For a list of possible encodings, see section
standard-encodings.
.. versionadded:: 2.0
.. versionchanged:: 2.3
Support for ``'xmlcharrefreplace'`` and ``'backslashreplace'`` and other error
handling schemes added.
.. versionchanged:: 2.7
Support for keyword arguments added.
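The error handling schemes can be compared side by side. This sketch starts from a Unicode string (the sample ``text`` is hypothetical) so that the result does not depend on the default string encoding.

```python
text = u'caf\xe9'    # hypothetical sample with one non-ASCII character

assert text.encode('utf-8') == b'caf\xc3\xa9'
assert text.encode('ascii', 'ignore') == b'caf'
assert text.encode('ascii', 'replace') == b'caf?'
assert text.encode('ascii', 'xmlcharrefreplace') == b'caf&#233;'
assert text.encode('ascii', 'backslashreplace') == b'caf\\xe9'

# With the default 'strict' scheme, the same call raises a UnicodeError.
try:
    text.encode('ascii')
except UnicodeError:
    strict_failed = True
else:
    strict_failed = False
assert strict_failed

# decode reverses encode when the same codec is used.
assert b'caf\xc3\xa9'.decode('utf-8') == text
```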
str.endswith(suffix[, start[, end]])~
Return ``True`` if the string ends with the specified {suffix}, otherwise return
``False``. {suffix} can also be a tuple of suffixes to look for. With optional
{start}, test beginning at that position. With optional {end}, stop comparing
at that position.
.. versionchanged:: 2.5
Accept tuples as {suffix}.
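A brief sketch of ``endswith`` with a tuple of suffixes and with {start} and {end}. The file name is a hypothetical sample.

```python
name = 'archive.tar.gz'     # hypothetical file name

assert name.endswith('.gz')
assert name.endswith(('.zip', '.gz'))       # any suffix in the tuple may match
assert not name.endswith(('.zip', '.bz2'))
assert name.endswith('.tar', 0, 11)         # compares against name[0:11]
assert not name.endswith('.gz', 0, 11)
```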
str.expandtabs([tabsize])~
Return a copy of the string where all tab characters are replaced by one or
more spaces, depending on the current column and the given tab size. The
column number is reset to zero after each newline occurring in the string.
If {tabsize} is not given, a tab size of ``8`` characters is assumed. This
doesn't understand other non-printing characters or escape sequences.
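How the current column affects the number of spaces can be seen in a small sketch:

```python
# With a tab size of 4, each tab advances to the next multiple of 4.
assert 'a\tbc\td'.expandtabs(4) == 'a   bc  d'

# The default tab size is 8.
assert 'a\tb'.expandtabs() == 'a' + ' ' * 7 + 'b'

# The column count restarts after a newline.
assert 'ab\n\tc'.expandtabs(4) == 'ab\n    c'
```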
str.find(sub[, start[, end]])~
Return the lowest index in the string where substring {sub} is found, such
that {sub} is contained in the slice ``s[start:end]``. Optional arguments
{start} and {end} are interpreted as in slice notation. Return ``-1`` if
{sub} is not found.
str.format(*args, **kwargs)~
Perform a string formatting operation. The string on which this method is
called can contain literal text or replacement fields delimited by braces
``{}``. Each replacement field contains either the numeric index of a
positional argument, or the name of a keyword argument. Returns a copy of
the string where each replacement field is replaced with the string value of
the corresponding argument. >
>>> "The sum of 1 + 2 is {0}".format(1+2)
'The sum of 1 + 2 is 3'
<
See formatstrings for a description of the various formatting options
that can be specified in format strings.
This method of string formatting is the new standard in Python 3.0, and
should be preferred to the ``%`` formatting described in
string-formatting in new code.
.. versionadded:: 2.6
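Positional and keyword replacement fields can be mixed, as this sketch shows. All argument values are arbitrary samples.

```python
# Numeric fields refer to positional arguments and may repeat or reorder them.
assert '{0} + {1} = {2}'.format(1, 2, 1 + 2) == '1 + 2 = 3'
assert '{1}, {0}'.format('a', 'b') == 'b, a'

# Named fields refer to keyword arguments.
assert '{user}@{host}'.format(user='alice', host='example.org') == \
    'alice@example.org'

# Both kinds can appear in the same format string.
assert '{0}-{key}'.format('x', key='y') == 'x-y'
```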
str.index(sub[, start[, end]])~
Like find, but raise ValueError when the substring is not found.
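The difference between ``find`` and ``index`` only shows when the substring is missing. A minimal sketch with an arbitrary sample string:

```python
s = 'hello world'

assert s.find('o') == 4
assert s.find('o', 5) == 7        # the search starts at index 5
assert s.find('o', 5, 7) == -1    # the 'o' at index 7 lies outside s[5:7]
assert s.find('xyz') == -1        # find reports failure with -1

assert s.index('world') == 6
try:
    s.index('xyz')                # index reports failure with an exception
except ValueError:
    index_raised = True
else:
    index_raised = False
assert index_raised
```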
str.isalnum()~
Return true if all characters in the string are alphanumeric and there is at
least one character, false otherwise.
For 8-bit strings, this method is locale-dependent.
str.isalpha()~
Return true if all characters in the string are alphabetic and there is at least
one character, false otherwise.
For 8-bit strings, this method is locale-dependent.
str.isdigit()~
Return true if all characters in the string are digits and there is at least one
character, false otherwise.
For 8-bit strings, this method is locale-dependent.
str.islower()~
Return true if all cased characters in the string are lowercase and there is at
least one cased character, false otherwise.
For 8-bit strings, this method is locale-dependent.
str.isspace()~
Return true if there are only whitespace characters in the string and there is
at least one character, false otherwise.
For 8-bit strings, this method is locale-dependent.
str.istitle()~
Return true if the string is a titlecased string and there is at least one
character, for example uppercase characters may only follow uncased characters
and lowercase characters only cased ones. Return false otherwise.
For 8-bit strings, this method is locale-dependent.
str.isupper()~
Return true if all cased characters in the string are uppercase and there is at
least one cased character, false otherwise.
For 8-bit strings, this method is locale-dependent.
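The character tests above can be sketched together. Only ASCII samples are used, so the locale does not affect the results.

```python
assert 'abc123'.isalnum() and not 'abc 123'.isalnum()
assert 'abc'.isalpha() and not 'abc1'.isalpha()
assert '2010'.isdigit() and not '20.10'.isdigit()
assert 'hello, world'.islower()      # uncased characters are ignored
assert 'HELLO 42'.isupper()
assert ' \t\n'.isspace()
assert 'Hello World'.istitle() and not 'Hello world'.istitle()

# Every test is false for the empty string.
assert not any([''.isalnum(), ''.isalpha(), ''.isdigit(), ''.islower(),
                ''.isspace(), ''.istitle(), ''.isupper()])
```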
str.join(iterable)~
Return a string which is the concatenation of the strings in the
iterable {iterable}. The separator between elements is the string
providing this method.
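As a short illustration (the list contents are arbitrary), the string the
method is called on becomes the separator. Items that are not strings have to
be converted first:

```python
words = ["spam", "eggs", "ham"]
joined = ", ".join(words)                    # 'spam, eggs, ham'

# join only accepts strings, so convert other objects explicitly.
numbers = [1, 2, 3]
dashed = "-".join(str(n) for n in numbers)   # '1-2-3'

# An empty separator simply concatenates the items.
glued = "".join(words)                       # 'spameggsham'
```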
str.ljust(width[, fillchar])~
Return the string left justified in a string of length {width}. Padding is done
using the specified {fillchar} (default is a space). The original string is
returned if {width} is less than or equal to ``len(s)``.
.. versionchanged:: 2.4
Support for the {fillchar} argument.
str.lower()~
Return a copy of the string converted to lowercase.
For 8-bit strings, this method is locale-dependent.
str.lstrip([chars])~
Return a copy of the string with leading characters removed. The {chars}
argument is a string specifying the set of characters to be removed. If omitted
or ``None``, the {chars} argument defaults to removing whitespace. The {chars}
argument is not a prefix; rather, all combinations of its values are stripped:
>>> ' spacious '.lstrip()
'spacious '
>>> 'www.example.com'.lstrip('cmowz.')
'example.com'
.. versionchanged:: 2.2.2
Support for the {chars} argument.
str.partition(sep)~
Split the string at the first occurrence of {sep}, and return a 3-tuple
containing the part before the separator, the separator itself, and the part
after the separator. If the separator is not found, return a 3-tuple containing
the string itself, followed by two empty strings.
.. versionadded:: 2.5
str.replace(old, new[, count])~
Return a copy of the string with all occurrences of substring {old} replaced by
{new}. If the optional argument {count} is given, only the first {count}
occurrences are replaced.
str.rfind(sub [,start [,end]])~
Return the highest index in the string where substring {sub} is found, such
that {sub} is contained within ``s[start:end]``. Optional arguments {start}
and {end} are interpreted as in slice notation. Return ``-1`` on failure.
str.rindex(sub[, start[, end]])~
Like rfind but raises ValueError when the substring {sub} is not
found.
str.rjust(width[, fillchar])~
Return the string right justified in a string of length {width}. Padding is done
using the specified {fillchar} (default is a space). The original string is
returned if {width} is less than or equal to ``len(s)``.
.. versionchanged:: 2.4
Support for the {fillchar} argument.
str.rpartition(sep)~
Split the string at the last occurrence of {sep}, and return a 3-tuple
containing the part before the separator, the separator itself, and the part
after the separator. If the separator is not found, return a 3-tuple containing
two empty strings, followed by the string itself.
.. versionadded:: 2.5
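The difference between partition and rpartition shows up when the separator
occurs more than once, or not at all. A small sketch with an arbitrary sample
string:

```python
path = "usr/local/bin"

first = path.partition("/")           # split at the first '/'
last = path.rpartition("/")           # split at the last '/'

# When the separator is missing, the whole string lands at opposite ends.
missing_first = path.partition(":")
missing_last = path.rpartition(":")

print(first)          # ('usr', '/', 'local/bin')
print(last)           # ('usr/local', '/', 'bin')
print(missing_first)  # ('usr/local/bin', '', '')
print(missing_last)   # ('', '', 'usr/local/bin')
```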
str.rsplit([sep [,maxsplit]])~
Return a list of the words in the string, using {sep} as the delimiter string.
If {maxsplit} is given, at most {maxsplit} splits are done, the {rightmost}
ones. If {sep} is not specified or ``None``, any whitespace string is a
separator. Except for splitting from the right, rsplit behaves like
split which is described in detail below.
.. versionadded:: 2.4
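With a {maxsplit} limit, rsplit and split differ in which end the splits are
taken from. An illustrative sketch:

```python
line = "a,b,c,d"

from_right = line.rsplit(",", 1)   # ['a,b,c', 'd']
from_left = line.split(",", 1)     # ['a', 'b,c,d']

# Without a limit both methods return the same list.
same = line.rsplit(",") == line.split(",")   # True
```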
str.rstrip([chars])~
Return a copy of the string with trailing characters removed. The {chars}
argument is a string specifying the set of characters to be removed. If omitted
or ``None``, the {chars} argument defaults to removing whitespace. The {chars}
argument is not a suffix; rather, all combinations of its values are stripped:
>>> ' spacious '.rstrip()
' spacious'
>>> 'mississippi'.rstrip('ipz')
'mississ'
.. versionchanged:: 2.2.2
Support for the {chars} argument.
str.split([sep[, maxsplit]])~
Return a list of the words in the string, using {sep} as the delimiter
string. If {maxsplit} is given, at most {maxsplit} splits are done (thus,
the list will have at most ``maxsplit+1`` elements). If {maxsplit} is not
specified, then there is no limit on the number of splits (all possible
splits are made).
If {sep} is given, consecutive delimiters are not grouped together and are
deemed to delimit empty strings (for example, ``'1,,2'.split(',')`` returns
``['1', '', '2']``). The {sep} argument may consist of multiple characters
(for example, ``'1<>2<>3'.split('<>')`` returns ``['1', '2', '3']``).
Splitting an empty string with a specified separator returns ``['']``.
If {sep} is not specified or is ``None``, a different splitting algorithm is
applied: runs of consecutive whitespace are regarded as a single separator,
and the result will contain no empty strings at the start or end if the
string has leading or trailing whitespace. Consequently, splitting an empty
string or a string consisting of just whitespace with a ``None`` separator
returns ``[]``.
For example, ``' 1 2 3 '.split()`` returns ``['1', '2', '3']``, and
``' 1 2 3 '.split(None, 1)`` returns ``['1', '2 3 ']``.
str.splitlines([keepends])~
Return a list of the lines in the string, breaking at line boundaries. Line
breaks are not included in the resulting list unless {keepends} is given and
true.
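A brief sketch of how splitlines treats mixed line endings, compared with
splitting on a literal newline (the sample text is arbitrary):

```python
text = "one\ntwo\r\nthree\n"

lines = text.splitlines()         # ['one', 'two', 'three']
kept = text.splitlines(True)      # ['one\n', 'two\r\n', 'three\n']

# split('\n') leaves the carriage return and adds a trailing empty string.
naive = text.split("\n")          # ['one', 'two\r', 'three', '']
```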
str.startswith(prefix[, start[, end]])~
Return ``True`` if string starts with the {prefix}, otherwise return ``False``.
{prefix} can also be a tuple of prefixes to look for. With optional {start},
test string beginning at that position. With optional {end}, stop comparing
string at that position.
.. versionchanged:: 2.5
Accept tuples as {prefix}.
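For illustration, a sketch of the tuple form and of the optional {start} and
{end} positions. The file name is a made-up example:

```python
name = "test_module.py"

# Any prefix in the tuple may match.
is_test = name.startswith(("test_", "check_"))      # True

# With start=5 the comparison begins at name[5:], i.e. 'module.py'.
from_offset = name.startswith("module", 5)          # True

# With end=10 only name[5:10] == 'modul' is examined, so this fails.
cut_short = name.startswith("module", 5, 10)        # False
```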
str.strip([chars])~
Return a copy of the string with the leading and trailing characters removed.
The {chars} argument is a string specifying the set of characters to be removed.
If omitted or ``None``, the {chars} argument defaults to removing whitespace.
The {chars} argument is not a prefix or suffix; rather, all combinations of its
values are stripped:
>>> ' spacious '.strip()
'spacious'
>>> 'www.example.com'.strip('cmowz.')
'example'
.. versionchanged:: 2.2.2
Support for the {chars} argument.
str.swapcase()~
Return a copy of the string with uppercase characters converted to lowercase and
vice versa.
For 8-bit strings, this method is locale-dependent.
str.title()~
Return a titlecased version of the string where words start with an uppercase
character and the remaining characters are lowercase.
The algorithm uses a simple language-independent definition of a word as
groups of consecutive letters. The definition works in many contexts but
it means that apostrophes in contractions and possessives form word
boundaries, which may not be the desired result:: >
>>> "they're bill's friends from the UK".title()
"They'Re Bill'S Friends From The Uk"
<
A workaround for apostrophes can be constructed using regular expressions::
>>> import re
>>> def titlecase(s):
...     return re.sub(r"[A-Za-z]+('[A-Za-z]+)?",
...                   lambda mo: mo.group(0)[0].upper() +
...                              mo.group(0)[1:].lower(),
...                   s)
...
>>> titlecase("they're bill's friends.")
"They're Bill's Friends."
For 8-bit strings, this method is locale-dependent.
str.translate(table[, deletechars])~
Return a copy of the string where all characters occurring in the optional
argument {deletechars} are removed, and the remaining characters have been
mapped through the given translation table, which must be a string of length
256.
You can use the string.maketrans helper function in the string (|py2stdlib-string|)
module to create a translation table. For string objects, set the {table}
argument to ``None`` for translations that only delete characters:
>>> 'read this short text'.translate(None, 'aeiou')
'rd ths shrt txt'
.. versionadded:: 2.6
Support for a ``None`` {table} argument.
For Unicode objects, the translate method does not accept the optional
{deletechars} argument. Instead, it returns a copy of the string where all
characters have been mapped through the given translation table which must be a
mapping of Unicode ordinals to Unicode ordinals, Unicode strings or ``None``.
Unmapped characters are left untouched. Characters mapped to ``None`` are
deleted. Note, a more flexible approach is to create a custom character mapping
codec using the codecs (|py2stdlib-codecs|) module (see encodings.cp1251 for an
example).
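A minimal sketch of the Unicode form described above, using a mapping from
ordinals to replacement strings or ``None``. The mapping and sample text are
invented for illustration:

```python
# Keys are Unicode ordinals; values are replacement strings or None.
table = {
    ord(u"a"): u"4",    # replace 'a' with '4'
    ord(u"e"): u"3",    # replace 'e' with '3'
    ord(u"s"): None,    # delete every 's'
}

result = u"passes tests".translate(table)
print(result)   # p43 t3t
```

Characters without an entry in the mapping (here ``p``, ``t`` and the space)
pass through unchanged.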
str.upper()~
Return a copy of the string converted to uppercase.
For 8-bit strings, this method is locale-dependent.
str.zfill(width)~
Return the numeric string left filled with zeros in a string of length
{width}. A sign prefix is handled correctly. The original string is
returned if {width} is less than or equal to ``len(s)``.
.. versionadded:: 2.2.2
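A short sketch of the sign handling and of the case where no padding is
needed:

```python
padded = "42".zfill(5)        # '00042'
negative = "-42".zfill(5)     # '-0042': zeros go after the sign
unchanged = "12345".zfill(3)  # '12345': already longer than the width
```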
The following methods are present only on unicode objects:
unicode.isnumeric()~
Return ``True`` if there are only numeric characters in S, ``False``
otherwise. Numeric characters include digit characters, and all characters
that have the Unicode numeric value property, e.g. U+2155,
VULGAR FRACTION ONE FIFTH.
unicode.isdecimal()~
Return ``True`` if there are only decimal characters in S, ``False``
otherwise. Decimal characters include digit characters, and all characters
that can be used to form decimal-radix numbers, e.g. U+0660,
ARABIC-INDIC DIGIT ZERO.
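The two code points mentioned above show how the methods differ. This sketch
uses ``u''`` literals so that it operates on Unicode objects:

```python
fraction = u"\u2155"      # VULGAR FRACTION ONE FIFTH
arabic_zero = u"\u0660"   # ARABIC-INDIC DIGIT ZERO

# The fraction has a numeric value but cannot form a decimal-radix number.
fraction_numeric = fraction.isnumeric()     # True
fraction_decimal = fraction.isdecimal()     # False

# The Arabic-Indic digit is both numeric and decimal.
zero_numeric = arabic_zero.isnumeric()      # True
zero_decimal = arabic_zero.isdecimal()      # True
```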
String Formatting Operations
----------------------------
String and Unicode objects have one unique built-in operation: the ``%``
operator (modulo). This is also known as the string {formatting} or
{interpolation} operator. Given ``format % values`` (where {format} is a string
or Unicode object), ``%`` conversion specifications in {format} are replaced
with zero or more elements of {values}. The effect is similar to using
sprintf in the C language. If {format} is a Unicode object, or if any
of the objects being converted using the ``%s`` conversion are Unicode objects,
the result will also be a Unicode object.
If {format} requires a single argument, {values} may be a single non-tuple
object. [#]_ Otherwise, {values} must be a tuple with exactly the number of
items specified by the format string, or a single mapping object (for example, a
dictionary).
A conversion specifier contains two or more characters and has the following
components, which must occur in this order:
#. The ``'%'`` character, which marks the start of the specifier.
#. Mapping key (optional), consisting of a parenthesised sequence of characters
(for example, ``(somename)``).
#. Conversion flags (optional), which affect the result of some conversion
types.
#. Minimum field width (optional). If specified as an ``'*'`` (asterisk), the
actual width is read from the next element of the tuple in {values}, and the
object to convert comes after the minimum field width and optional precision.
#. Precision (optional), given as a ``'.'`` (dot) followed by the precision. If
specified as ``'*'`` (an asterisk), the actual precision is read from the next
element of the tuple in {values}, and the value to convert comes after the
precision.
#. Length modifier (optional).
#. Conversion type.
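The ``'*'`` forms of the width and precision components can be sketched as
follows. The numbers are arbitrary, and each ``'*'`` consumes one element of
the tuple before the value being converted:

```python
# Width taken from the tuple: 6 columns, right-aligned.
star_width = "%*d" % (6, 42)                # '    42'

# Precision taken from the tuple: 2 digits after the decimal point.
star_precision = "%.*f" % (2, 3.14159)      # '3.14'

# Both taken from the tuple, in the order width, precision, value.
star_both = "%*.*f" % (8, 3, 3.14159)       # '   3.142'
```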
When the right argument is a dictionary (or other mapping type), then the
formats in the string {must} include a parenthesised mapping key into that
dictionary inserted immediately after the ``'%'`` character. The mapping key
selects the value to be formatted from the mapping. For example:
>>> print '%(language)s has %(#)03d quote types.' % \
... {'language': "Python", "#": 2}
Python has 002 quote types.
In this case no ``*`` specifiers may occur in a format (since they require a
sequential parameter list).
The conversion flag characters are:
+---------+---------------------------------------------------------------------+
| Flag | Meaning |
+=========+=====================================================================+
| ``'#'`` | The value conversion will use the "alternate form" (where defined |
| | below). |
+---------+---------------------------------------------------------------------+
| ``'0'`` | The conversion will be zero padded for numeric values. |
+---------+---------------------------------------------------------------------+
| ``'-'`` | The converted value is left adjusted (overrides the ``'0'`` |
| | conversion if both are given). |
+---------+---------------------------------------------------------------------+
| ``' '`` | (a space) A blank should be left before a positive number (or empty |
| | string) produced by a signed conversion. |
+---------+---------------------------------------------------------------------+
| ``'+'`` | A sign character (``'+'`` or ``'-'``) will precede the conversion |
| | (overrides a "space" flag). |
+---------+---------------------------------------------------------------------+
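A sketch of each flag applied to a small integer. The trailing ``|`` in one
format only makes the padding visible:

```python
alternate = "%#x" % 255        # '0xff'   alternate form for hexadecimal
zero_pad = "%05d" % 42         # '00042'  zero padded to width 5
left_adj = "%-5d|" % 42        # '42   |' left adjusted in width 5
blank = "% d" % 42             # ' 42'    blank before a positive number
signed = "%+d" % 42            # '+42'    explicit sign

# '-' overrides '0', and '+' overrides the space flag.
minus_wins = "%-05d|" % 42     # '42   |'
plus_wins = "%+ d" % 42        # '+42'
```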
A length modifier (``h``, ``l``, or ``L``) may be present, but is ignored as it
is not necessary for Python -- so e.g. ``%ld`` is identical to ``%d``.
The conversion types are:
+------------+-----------------------------------------------------+-------+
| Conversion | Meaning | Notes |
+============+=====================================================+=======+
| ``'d'`` | Signed integer decimal. | |
+------------+-----------------------------------------------------+-------+
| ``'i'`` | Signed integer decimal. | |
+------------+-----------------------------------------------------+-------+
| ``'o'`` | Signed octal value. | \(1) |
+------------+-----------------------------------------------------+-------+
| ``'u'`` | Obsolete type -- it is identical to ``'d'``. | \(7) |
+------------+-----------------------------------------------------+-------+
| ``'x'`` | Signed hexadecimal (lowercase). | \(2) |
+------------+-----------------------------------------------------+-------+
| ``'X'`` | Signed hexadecimal (uppercase). | \(2) |
+------------+-----------------------------------------------------+-------+
| ``'e'`` | Floating point exponential format (lowercase). | \(3) |
+------------+-----------------------------------------------------+-------+
| ``'E'`` | Floating point exponential format (uppercase). | \(3) |
+------------+-----------------------------------------------------+-------+
| ``'f'`` | Floating point decimal format. | \(3) |
+------------+-----------------------------------------------------+-------+
| ``'F'`` | Floating point decimal format. | \(3) |
+------------+-----------------------------------------------------+-------+
| ``'g'`` | Floating point format. Uses lowercase exponential | \(4) |
| | format if exponent is less than -4 or not less than | |
| | precision, decimal format otherwise. | |
+------------+-----------------------------------------------------+-------+
| ``'G'`` | Floating point format. Uses uppercase exponential | \(4) |
| | format if exponent is less than -4 or not less than | |
| | precision, decimal format otherwise. | |
+------------+-----------------------------------------------------+-------+
| ``'c'`` | Single character (accepts integer or single | |
| | character string). | |
+------------+-----------------------------------------------------+-------+
| ``'r'`` | String (converts any Python object using | \(5) |
| | repr (|py2stdlib-repr|)). | |
+------------+-----------------------------------------------------+-------+
| ``'s'`` | String (converts any Python object using | \(6) |
| | str). | |
+------------+-----------------------------------------------------+-------+
| ``'%'`` | No argument is converted, results in a ``'%'`` | |
| | character in the result. | |
+------------+-----------------------------------------------------+-------+
Notes:
(1)
The alternate form causes a leading zero (``'0'``) to be inserted between
left-hand padding and the formatting of the number if the leading character
of the result is not already a zero.
(2)
The alternate form causes a leading ``'0x'`` or ``'0X'`` (depending on whether
the ``'x'`` or ``'X'`` format was used) to be inserted between left-hand padding
and the formatting of the number if the leading character of the result is not
already a zero.
(3)
The alternate form causes the result to always contain a decimal point, even if
no digits follow it.
The precision determines the number of digits after the decimal point and
defaults to 6.
(4)
The alternate form causes the result to always contain a decimal point, and
trailing zeroes are not removed as they would otherwise be.
The precision determines the number of significant digits before and after the
decimal point and defaults to 6.
(5)
The ``%r`` conversion was added in Python 2.0.
The precision determines the maximal number of characters used.
(6)
If the object or format provided is a unicode string, the resulting
string will also be unicode.
The precision determines the maximal number of characters used.
(7)
See PEP 237.
Since Python strings have an explicit length, ``%s`` conversions do not assume
that ``'\0'`` is the end of the string.
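A few illustrative conversions, one per line. The values are arbitrary and
the variable names are only labels for this sketch:

```python
decimal = "%d" % 42                    # '42'
hex_lower = "%x" % 255                 # 'ff'
hex_upper = "%X" % 255                 # 'FF'
exponent = "%e" % 1234.5               # '1.234500e+03'
fixed = "%.2f" % 2.5                   # '2.50'
general_small = "%g" % 0.00001234      # '1.234e-05' (exponent below -4)
general_plain = "%g" % 1234.5          # '1234.5'
char = "%c" % 65                       # 'A'
as_repr = "%r" % "hi"                  # "'hi'"
as_str = "%s" % [1, 2]                 # '[1, 2]'
truncated = "%.3s" % "abcdef"          # 'abc' (precision limits length)
percent = "%d%%" % 50                  # '50%'

# To format a tuple as a single value, wrap it in a one-element tuple.
whole_tuple = "%s" % ((1, 2),)         # '(1, 2)'
```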
.. versionchanged:: 2.7
``%f`` conversions for numbers whose absolute value is over 1e50 are no
longer replaced by ``%g`` conversions.
Additional string operations are defined in standard modules string (|py2stdlib-string|) and
re (|py2stdlib-re|).
XRange Type
-----------
The xrange type is an immutable sequence which is commonly used for
looping. The advantage of the xrange type is that an xrange
object will always take the same amount of memory, no matter the size of the
range it represents. There are no consistent performance advantages.
XRange objects have very little behavior: they only support indexing, iteration,
and the len function.
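A rough sketch of the constant-memory property. The fallback to ``range`` is
an assumption added so the snippet also runs on Python 3, where xrange no
longer exists. The size comparison relies on ``sys.getsizeof``, whose results
are specific to the interpreter implementation:

```python
import sys

try:
    xrange                # Python 2
except NameError:
    xrange = range        # assumption: Python 3's range plays the same role

small = xrange(10)
large = xrange(10 ** 6)

# The object stores only its bounds, not the individual items.
same_size = sys.getsizeof(small) == sys.getsizeof(large)

# The supported operations: len, indexing and iteration.
length = len(large)       # 1000000
tenth = large[9]          # 9
total = 0
for i in xrange(5):
    total += i            # 0 + 1 + 2 + 3 + 4
```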
Mutable Sequence Types
----------------------
List objects support additional operations that allow in-place modification of
the object. Other mutable sequence types (when added to the language) should
also support these operations. Strings and tuples are immutable sequence types:
such objects cannot be modified once created. The following operations are
defined on mutable sequence types (where {x} is an arbitrary object):
+------------------------------+--------------------------------+---------------------+
| Operation | Result | Notes |
+==============================+================================+=====================+
| ``s[i] = x`` | item {i} of {s} is replaced by | |
| | {x} | |
+------------------------------+--------------------------------+---------------------+
| ``s[i:j] = t`` | slice of {s} from {i} to {j} | |
| | is replaced by the contents of | |
| | the iterable {t} | |
+------------------------------+--------------------------------+---------------------+
| ``del s[i:j]`` | same as ``s[i:j] = []`` | |
+------------------------------+--------------------------------+---------------------+
| ``s[i:j:k] = t`` | the elements of ``s[i:j:k]`` | \(1) |
| | are replaced by those of {t} | |
+------------------------------+--------------------------------+---------------------+
| ``del s[i:j:k]`` | removes the elements of | |
| | ``s[i:j:k]`` from the list | |
+------------------------------+--------------------------------+---------------------+
| ``s.append(x)`` | same as ``s[len(s):len(s)] = | \(2) |
| | [x]`` | |
+------------------------------+--------------------------------+---------------------+
| ``s.extend(x)`` | same as ``s[len(s):len(s)] = | \(3) |
| | x`` | |
+------------------------------+--------------------------------+---------------------+
| ``s.count(x)`` | return number of {i}'s for | |
| | which ``s[i] == x`` | |
+------------------------------+--------------------------------+---------------------+
| ``s.index(x[, i[, j]])`` | return smallest {k} such that | \(4) |
| | ``s[k] == x`` and ``i <= k < | |
| | j`` | |
+------------------------------+--------------------------------+---------------------+
| ``s.insert(i, x)`` | same as ``s[i:i] = [x]`` | \(5) |
+------------------------------+--------------------------------+---------------------+
| ``s.pop([i])`` | same as ``x = s[i]; del s[i]; | \(6) |
| | return x`` | |
+------------------------------+--------------------------------+---------------------+
| ``s.remove(x)`` | same as ``del s[s.index(x)]`` | \(4) |
+------------------------------+--------------------------------+---------------------+
| ``s.reverse()`` | reverses the items of {s} in | \(7) |
| | place | |
+------------------------------+--------------------------------+---------------------+
| ``s.sort([cmp[, key[, | sort the items of {s} in place | (7)(8)(9)(10) |
| reverse]]])`` | | |
+------------------------------+--------------------------------+---------------------+
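Several rows of the table can be traced step by step on one list. This is an
illustrative sketch; the contents are arbitrary and the ``step`` names exist
only to record each intermediate state:

```python
s = [0, 1, 2, 3, 4, 5]

s[1:3] = ["a", "b", "c"]     # a slice may be replaced by a longer iterable
step1 = list(s)              # [0, 'a', 'b', 'c', 3, 4, 5]

del s[1:4]                   # same as s[1:4] = []
step2 = list(s)              # [0, 3, 4, 5]

s[::2] = ["x", "y"]          # extended slice: lengths must match exactly
step3 = list(s)              # ['x', 3, 'y', 5]

s.extend("ab")               # any iterable is accepted
step4 = list(s)              # ['x', 3, 'y', 5, 'a', 'b']

s.insert(-1, "z")            # negative index counts from the end
step5 = list(s)              # ['x', 3, 'y', 5, 'a', 'z', 'b']

last = s.pop()               # index defaults to -1
step6 = list(s)              # ['x', 3, 'y', 5, 'a', 'z']
```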
Notes:
(1)
{t} must have the same length as the slice it is replacing.
(2)
The C implementation of Python has historically accepted multiple parameters and
implicitly joined them into a tuple; this no longer works in Python 2.0. Use of
this misfeature has been deprecated since Python 1.4.
(3)
{x} can be any iterable object.
(4)
Raises ValueError when {x} is not found in {s}. When a negative index is
passed as the second or third parameter to the index method, the list
length is added, as for slice indices. If it is still negative, it is truncated
to zero, as for slice indices.
.. versionchanged:: 2.3
Previously, index didn't have arguments for specifying start and stop
positions.
(5)
When a negative index is passed as the first parameter to the insert
method, the list length is added, as for slice indices. If it is still
negative, it is truncated to zero, as for slice indices.
.. versionchanged:: 2.3
Previously, all negative indices were truncated to zero.
(6)
The pop method is only supported by the list and array types. The
optional argument {i} defaults to ``-1``, so that by default the last item is
removed and returned.
(7)
The sort and reverse methods modify the list in place for
economy of space when sorting or reversing a large list. To remind you that
they operate by side effect, they don't return the sorted or reversed list.
(8)
The sort method takes optional arguments for controlling the
comparisons.
{cmp} specifies a custom comparison function of two arguments (list items) which
should return a negative, zero or positive number depending on whether the first
argument is considered smaller than, equal to, or larger than the second
argument: ``cmp=lambda x,y: cmp(x.lower(), y.lower())``. The default value
is ``None``.
{key} specifies a function of one argument that is used to extract a comparison
key from each list element: ``key=str.lower``. The default value is ``None``.
{reverse} is a boolean value. If set to ``True``, then the list elements are
sorted as if each comparison were reversed.
In general, the {key} and {reverse} conversion processes are much faster than
specifying an equivalent {cmp} function. This is because {cmp} is called
multiple times for each list element while {key} and {reverse} touch each
element only once. Use functools.cmp_to_key to convert an
old-style {cmp} function to a {key} function.
.. versionchanged:: 2.3
Support for ``None`` as an equivalent to omitting {cmp} was added.
.. versionchanged:: 2.4
Support for {key} and {reverse} was added.
(9)
Starting with Python 2.3, the sort method is guaranteed to be stable. A
sort is stable if it guarantees not to change the relative order of elements
that compare equal --- this is helpful for sorting in multiple passes (for
example, sort by department, then by salary grade).
(10)
.. impl-detail:: >
While a list is being sorted, the effect of attempting to mutate, or even
inspect, the list is undefined. The C implementation of Python 2.3 and
newer makes the list appear empty for the duration, and raises
ValueError if it can detect that the list has been mutated during a
sort.
<
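The sorting notes above can be sketched as follows. The helper name
``compare_lower`` and the sample data are hypothetical. functools.cmp_to_key
is used so that the old-style comparison function also works where the
{cmp} argument is unavailable:

```python
from functools import cmp_to_key

names = ["bob", "Alice", "carol", "Dave"]

# sort works in place and returns None.
returned = names.sort(key=str.lower)
by_key = list(names)            # ['Alice', 'bob', 'carol', 'Dave']

def compare_lower(x, y):
    """Old-style comparison: negative, zero or positive."""
    a, b = x.lower(), y.lower()
    return (a > b) - (a < b)

names.sort(key=cmp_to_key(compare_lower), reverse=True)
by_cmp = list(names)            # ['Dave', 'carol', 'bob', 'Alice']

# Stability allows multi-pass sorting: sort by the secondary key first,
# then by the primary key. Equal departments keep their grade order.
records = [("sales", 2), ("dev", 1), ("sales", 1), ("dev", 2)]
records.sort(key=lambda r: r[1])    # pass 1: salary grade
records.sort(key=lambda r: r[0])    # pass 2: department
```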
Set Types --- set, frozenset
============================
A set object is an unordered collection of distinct hashable objects.
Common uses include membership testing, removing duplicates from a sequence, and
computing mathematical operations such as intersection, union, difference, and
symmetric difference.
(For other containers see the built-in dict, list,
and tuple classes, and the collections (|py2stdlib-collections|) module.)
.. versionadded:: 2.4
Like other collections, sets support ``x in set``, ``len(set)``, and ``for x in
set``. Being an unordered collection, sets do not record element position or
order of insertion. Accordingly, sets do not support indexing, slicing, or
other sequence-like behavior.
There are currently two built-in set types, set and frozenset.
The set type is mutable --- the contents can be changed using methods
like add and remove. Since it is mutable, it has no hash value
and cannot be used as either a dictionary key or as an element of another set.
The frozenset type is immutable and hashable --- its contents cannot be
altered after it is created; it can therefore be used as a dictionary key or as
an element of another set.
Non-empty sets (not frozensets) can be created by placing a comma-separated list
of elements within braces, for example: ``{'jack', 'sjoerd'}``, in addition to the
set constructor.
The constructors for both classes work the same:
set([iterable])~
frozenset([iterable])
Return a new set or frozenset object whose elements are taken from
{iterable}. The elements of a set must be hashable. To represent sets of
sets, the inner sets must be frozenset objects. If {iterable} is
not specified, a new empty set is returned.
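A small sketch of the constructor, including the sets-of-sets case. The
element values are arbitrary:

```python
# Duplicates in the iterable collapse into a single element.
unique = set([3, 1, 3, 2, 1])
count = len(unique)                   # 3

empty = set()                         # no iterable: a new empty set

# Inner sets must be frozensets, because elements have to be hashable.
nested = set([frozenset([1, 2]), frozenset([3])])

try:
    set([set([1, 2])])                # a mutable set is not hashable
    inner_set_rejected = False
except TypeError:
    inner_set_rejected = True
```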
Instances of set and frozenset provide the following
operations:
.. describe:: len(s)
Return the cardinality of set {s}.
.. describe:: x in s
Test {x} for membership in {s}.
.. describe:: x not in s
Test {x} for non-membership in {s}.
isdisjoint(other)~
Return True if the set has no elements in common with {other}. Sets are
disjoint if and only if their intersection is the empty set.
.. versionadded:: 2.6
issubset(other)~
set <= other
Test whether every element in the set is in {other}.
set < other~
Test whether the set is a true subset of {other}, that is,
``set <= other and set != other``.
issuperset(other)~
set >= other
Test whether every element in {other} is in the set.
set > other~
Test whether the set is a true superset of {other}, that is, ``set >=
other and set != other``.
union(other, ...)~
set | other | ...
Return a new set with elements from the set and all others.
.. versionchanged:: 2.6
Accepts multiple input iterables.
intersection(other, ...)~
set & other & ...
Return a new set with elements common to the set and all others.
.. versionchanged:: 2.6
Accepts multiple input iterables.
difference(other, ...)~
set - other - ...
Return a new set with elements in the set that are not in the others.
.. versionchanged:: 2.6
Accepts multiple input iterables.
symmetric_difference(other)~
set ^ other
Return a new set with elements in either the set or {other} but not both.
copy()~
Return a new set with a shallow copy of {s}.
Note, the non-operator versions of the union, intersection,
difference, symmetric_difference, issubset, and
issuperset methods will accept any iterable as an argument. In
contrast, their operator based counterparts require their arguments to be
sets. This precludes error-prone constructions like ``set('abc') & 'cbs'``
in favor of the more readable ``set('abc').intersection('cbs')``.
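The contrast between the method and operator forms can be sketched like this,
reusing the strings from the paragraph above:

```python
# The method accepts any iterable, here a plain string.
common = set("abc").intersection("cbs")      # the elements 'b' and 'c'

# The operator insists on a set on both sides.
try:
    set("abc") & "cbs"
    operator_rejected = False
except TypeError:
    operator_rejected = True

# Converting the right-hand side explicitly makes the operator form work.
also_common = set("abc") & set("cbs")
```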
Both set and frozenset support set to set comparisons. Two
sets are equal if and only if every element of each set is contained in the
other (each is a subset of the other). A set is less than another set if and
only if the first set is a proper subset of the second set (is a subset, but
is not equal). A set is greater than another set if and only if the first set
is a proper superset of the second set (is a superset, but is not equal).
Instances of set are compared to instances of frozenset
based on their members. For example, ``set('abc') == frozenset('abc')``
returns ``True`` and so does ``set('abc') in set([frozenset('abc')])``.
The subset and equality comparisons do not generalize to a complete ordering
function. For example, any two disjoint sets are not equal and are not
subsets of each other, so {all} of the following return ``False``: ``a<b``,
``a==b``, or ``a>b``. Accordingly, sets do not implement the __cmp__
method.
Since sets only define partial ordering (subset relationships), the output of
the list.sort method is undefined for lists of sets.
Set elements, like dictionary keys, must be hashable.
Binary operations that mix set instances with frozenset
return the type of the first operand. For example: ``frozenset('ab') |
set('bc')`` returns an instance of frozenset.
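The comparison and mixed-type rules above can be checked with a short sketch.
The sets ``a`` and ``b`` are arbitrary disjoint examples:

```python
a = set([1, 2])
b = set([3, 4])

# Disjoint sets are unordered with respect to each other.
disjoint_results = (a < b, a == b, a > b)       # (False, False, False)

# A proper subset compares as "less than".
proper_subset = set([1]) < a                    # True

# set and frozenset compare by their members.
equal_across_types = set("abc") == frozenset("abc")    # True

# The result type of a mixed operation follows the first operand.
left_frozen = type(frozenset("ab") | set("bc"))        # frozenset
left_mutable = type(set("bc") | frozenset("ab"))       # set
```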
The following operations are available for set but do not
apply to immutable instances of frozenset:
update(other, ...)~
set |= other | ...
Update the set, adding elements from all others.
.. versionchanged:: 2.6
Accepts multiple input iterables.
intersection_update(other, ...)~
set &= other & ...
Update the set, keeping only elements found in it and all others.
.. versionchanged:: 2.6
Accepts multiple input iterables.
difference_update(other, ...)~
set -= other | ...
Update the set, removing elements found in others.
.. versionchanged:: 2.6
Accepts multiple input iterables.
symmetric_difference_update(other)~
set ^= other
Update the set, keeping only elements found in either set, but not in both.
add(elem)~
Add element {elem} to the set.
remove(elem)~
Remove element {elem} from the set. Raises KeyError if {elem} is
not contained in the set.
discard(elem)~
Remove element {elem} from the set if it is present.
pop()~
Remove and return an arbitrary element from the set. Raises
KeyError if the set is empty.
clear()~
Remove all elements from the set.
Note, the non-operator versions of the update,
intersection_update, difference_update, and
symmetric_difference_update methods will accept any iterable as an
argument.
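A step-by-step sketch of the in-place methods, with arbitrary values. The
``after_`` names only record the intermediate states:

```python
s = set([1, 2, 3])

s.update([3, 4], [5])                     # several iterables at once
after_update = set(s)                     # 1, 2, 3, 4, 5

s.intersection_update([1, 2, 3, 4, 9])    # keep only shared elements
after_intersection = set(s)               # 1, 2, 3, 4

s.difference_update([1], [2])             # drop elements found in others
after_difference = set(s)                 # 3, 4

s.symmetric_difference_update([4, 5])     # keep elements in exactly one
after_symmetric = set(s)                  # 3, 5

s.discard(42)                             # absent element: silently ignored
try:
    s.remove(42)                          # absent element: KeyError
    remove_raised = False
except KeyError:
    remove_raised = True
```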
Note, the {elem} argument to the __contains__, remove, and
discard methods may be a set. To support searching for an equivalent
frozenset, the {elem} set is temporarily mutated during the search and then
restored. During the search, the {elem} set should not be read or mutated
since it does not have a meaningful value.
.. seealso::
comparison-to-builtin-set
Differences between the sets (|py2stdlib-sets|) module and the built-in set types.
Mapping Types --- dict
===============================
A mapping object maps hashable values to arbitrary objects.
Mappings are mutable objects. There is currently only one standard mapping
type, the dictionary. (For other containers see the built-in
list, set, and tuple classes, and the
collections (|py2stdlib-collections|) module.)
A dictionary's keys are {almost} arbitrary values. Values that are not
hashable, that is, values containing lists, dictionaries or other
mutable types (that are compared by value rather than by object identity) may
not be used as keys. Numeric types used for keys obey the normal rules for
numeric comparison: if two numbers compare equal (such as ``1`` and ``1.0``)
then they can be used interchangeably to index the same dictionary entry. (Note
however, that since computers store floating-point numbers as approximations it
is usually unwise to use them as dictionary keys.)
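The key rules above can be illustrated with a short sketch. The dictionary
contents are made up:

```python
d = {1: "int"}
d[1.0] = "float"          # 1 == 1.0, so this updates the existing entry
size = len(d)             # 1
value = d[1]              # 'float'

# Lists are not hashable, so they cannot be keys.
try:
    bad = {[1, 2]: "list key"}
    list_key_rejected = False
except TypeError:
    list_key_rejected = True

# Floating-point approximation makes float keys fragile:
# 0.1 + 0.2 is not exactly 0.3, so the lookup misses.
prices = {0.3: "thirty cents"}
found = (0.1 + 0.2) in prices     # False
```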
Dictionaries can be created by placing a comma-separated list of ``key: value``
pairs within braces, for example: ``{'jack': 4098, 'sjoerd': 4127}`` or ``{4098:
'jack', 4127: 'sjoerd'}``, or by the dict constructor.
dict([arg])~
Return a new dictionary initialized from an optional positional argument or from
a set of keyword arguments. If no arguments are given, return a new empty
dictionary. If the positional argument {arg} is a mapping object, return a
dictionary mapping the same keys to the same values as does the mapping object.
Otherwise the positional argument must be a sequence, a container that supports
iteration, or an iterator object. The elements of the argument must each also
be of one of those kinds, and each must in turn contain exactly two objects.
The first is used as a key in the new dictionary, and the second as the key's
value. If a given key is seen more than once, the last value associated with it
is retained in the new dictionary.
If keyword arguments are given, the keywords themselves with their associated
values are added as items to the dictionary. If a key is specified both in the
positional argument and as a keyword argument, the value associated with the
keyword is retained in the dictionary. For example, these all return a
dictionary equal to ``{"one": 2, "two": 3}``:
* ``dict(one=2, two=3)``
* ``dict({'one': 2, 'two': 3})``
* ``dict(zip(('one', 'two'), (2, 3)))``
* ``dict([['two', 3], ['one', 2]])``
The first example only works for keys that are valid Python
identifiers; the others work with any valid keys.
.. versionadded:: 2.2
.. versionchanged:: 2.3
Support for building a dictionary from keyword arguments added.
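A minimal sketch of the precedence rules described above for repeated keys and for keyword arguments. The variable names are illustrative.

```python
# A repeated key in the positional argument keeps the last value seen.
a = dict([('one', 1), ('two', 3), ('one', 2)])
assert a == {'one': 2, 'two': 3}

# A keyword argument overrides the same key in the positional argument.
b = dict({'one': 1, 'two': 3}, one=2)
assert b == {'one': 2, 'two': 3}

# Keys that are not valid identifiers need one of the non-keyword forms.
c = dict([('not an identifier', 1), (42, 2)])
```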
These are the operations that dictionaries support (and therefore, custom
mapping types should support too):
.. describe:: len(d)
Return the number of items in the dictionary {d}.
.. describe:: d[key]
Return the item of {d} with key {key}. Raises a KeyError if {key}
is not in the map.
.. versionadded:: 2.5
If a subclass of dict defines a method __missing__ and {key} is not
present, the ``d[key]`` operation calls that method with {key} as its
argument, and then returns or raises whatever the ``__missing__(key)``
call returns or raises. No other operations or methods invoke
__missing__. If __missing__ is not defined, KeyError is raised.
__missing__ must be a method; it cannot be an instance variable. For an
example, see collections.defaultdict.
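As a sketch of the __missing__ hook, the hypothetical subclass below (``CountingDict`` is not part of the standard library) treats absent keys as zero when they are read through ``d[key]``.

```python
class CountingDict(dict):
    """Hypothetical dict subclass: keys that are absent read as 0."""

    def __missing__(self, key):
        # Called only by d[key] when key is not present.
        return 0


counts = CountingDict()
for word in ['spam', 'eggs', 'spam']:
    counts[word] += 1          # reads via __missing__, then stores

# get() and the "in" operator do not invoke __missing__.
absent_via_get = counts.get('bacon')
absent_via_in = 'bacon' in counts
```

Note that __missing__ in this sketch returns a value without storing it; the entry only appears because ``+=`` assigns the result back.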
.. describe:: d[key] = value
Set ``d[key]`` to {value}.
.. describe:: del d[key]
Remove ``d[key]`` from {d}. Raises a KeyError if {key} is not in the
map.
.. describe:: key in d
Return ``True`` if {d} has a key {key}, else ``False``.
.. versionadded:: 2.2
.. describe:: key not in d
Equivalent to ``not key in d``.
.. versionadded:: 2.2
.. describe:: iter(d)
Return an iterator over the keys of the dictionary. This is a shortcut
for iterkeys.
clear()~
Remove all items from the dictionary.
copy()~
Return a shallow copy of the dictionary.
fromkeys(seq[, value])~
Create a new dictionary with keys from {seq} and values set to {value}.
fromkeys is a class method that returns a new dictionary. {value}
defaults to ``None``.
.. versionadded:: 2.3
get(key[, default])~
Return the value for {key} if {key} is in the dictionary, else {default}.
If {default} is not given, it defaults to ``None``, so that this method
never raises a KeyError.
has_key(key)~
Test for the presence of {key} in the dictionary. has_key is
deprecated in favor of ``key in d``.
items()~
Return a copy of the dictionary's list of ``(key, value)`` pairs.
.. impl-detail:: >
Keys and values are listed in an arbitrary order which is non-random,
varies across Python implementations, and depends on the dictionary's
history of insertions and deletions.
<
If items, keys, values, iteritems,
iterkeys, and itervalues are called with no intervening
modifications to the dictionary, the lists will directly correspond. This
allows the creation of ``(value, key)`` pairs using zip: ``pairs =
zip(d.values(), d.keys())``. The same relationship holds for the
iterkeys and itervalues methods: ``pairs =
zip(d.itervalues(), d.iterkeys())`` provides the same value for
``pairs``. Another way to create the same list is ``pairs = [(v, k) for
(k, v) in d.iteritems()]``.
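The correspondence between the key and value lists can be checked directly. This sketch uses items rather than iteritems, and wraps zip in ``list()``, only so that it also runs on Python 3; the sample data is illustrative.

```python
d = {'jack': 4098, 'sjoerd': 4127, 'guido': 4127}

# Without intervening modifications, values() and keys() line up.
pairs = list(zip(d.values(), d.keys()))
same_pairs = [(v, k) for (k, v) in d.items()]
```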
iteritems()~
Return an iterator over the dictionary's ``(key, value)`` pairs. See the
note for dict.items.
Using iteritems while adding or deleting entries in the dictionary
may raise a RuntimeError or fail to iterate over all entries.
.. versionadded:: 2.2
iterkeys()~
Return an iterator over the dictionary's keys. See the note for
dict.items.
Using iterkeys while adding or deleting entries in the dictionary
may raise a RuntimeError or fail to iterate over all entries.
.. versionadded:: 2.2
itervalues()~
Return an iterator over the dictionary's values. See the note for
dict.items.
Using itervalues while adding or deleting entries in the
dictionary may raise a RuntimeError or fail to iterate over all
entries.
.. versionadded:: 2.2
keys()~
Return a copy of the dictionary's list of keys. See the note for
dict.items.
pop(key[, default])~
If {key} is in the dictionary, remove it and return its value, else return
{default}. If {default} is not given and {key} is not in the dictionary,
a KeyError is raised.
.. versionadded:: 2.3
popitem()~
Remove and return an arbitrary ``(key, value)`` pair from the dictionary.
popitem is useful to destructively iterate over a dictionary, as
often used in set algorithms. If the dictionary is empty, calling
popitem raises a KeyError.
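A minimal sketch of destructive iteration with popitem. The dictionary contents are illustrative, and the order in which pairs are removed should be treated as arbitrary.

```python
pending = {'a': 1, 'b': 2, 'c': 3}
seen = []

# Consume the dictionary one arbitrary pair at a time.
while pending:
    key, value = pending.popitem()
    seen.append((key, value))

# Once the dictionary is empty, popitem raises KeyError.
try:
    pending.popitem()
    raised = False
except KeyError:
    raised = True
```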
setdefault(key[, default])~
If {key} is in the dictionary, return its value. If not, insert {key}
with a value of {default} and return {default}. {default} defaults to
``None``.
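A common use of setdefault is grouping, sketched below with illustrative data. The contrast with get is that get never inserts anything into the dictionary.

```python
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

by_letter = {}
for word in words:
    # Insert an empty list the first time a letter is seen, then append.
    by_letter.setdefault(word[0], []).append(word)

# get() returns the default without modifying the dictionary.
missing = by_letter.get('z')
```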
update([other])~
Update the dictionary with the key/value pairs from {other}, overwriting
existing keys. Return ``None``.
update accepts either another dictionary object or an iterable of
key/value pairs (as a tuple or other iterable of length two). If keyword
arguments are specified, the dictionary is then updated with those
key/value pairs: ``d.update(red=1, blue=2)``.
.. versionchanged:: 2.4
Allowed the argument to be an iterable of key/value pairs and allowed
keyword arguments.
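The three argument forms accepted by update can be sketched as follows; the keys and values are illustrative.

```python
d = {'red': 0}

d.update({'red': 1, 'green': 2})          # another dictionary
d.update([('blue', 3), ('green', 4)])     # iterable of key/value pairs
d.update(yellow=5)                        # keyword arguments

# update always returns None; the dictionary is modified in place.
result = d.update()
```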
values()~
Return a copy of the dictionary's list of values. See the note for
dict.items.
viewitems()~
Return a new view of the dictionary's items (``(key, value)`` pairs). See
below for documentation of view objects.
.. versionadded:: 2.7
viewkeys()~
Return a new view of the dictionary's keys. See below for documentation of
view objects.
.. versionadded:: 2.7
viewvalues()~
Return a new view of the dictionary's values. See below for documentation of
view objects.
.. versionadded:: 2.7
Dictionary view objects
-----------------------
The objects returned by dict.viewkeys, dict.viewvalues and
dict.viewitems are {view objects}. They provide a dynamic view on the
dictionary's entries, which means that when the dictionary changes, the view
reflects these changes.
Dictionary views can be iterated over to yield their respective data, and
support membership tests:
.. describe:: len(dictview)
Return the number of entries in the dictionary.
.. describe:: iter(dictview)
Return an iterator over the keys, values or items (represented as tuples of
``(key, value)``) in the dictionary.
Keys and values are iterated over in an arbitrary order which is non-random,
varies across Python implementations, and depends on the dictionary's history
of insertions and deletions. If keys, values and items views are iterated
over with no intervening modifications to the dictionary, the order of items
will directly correspond. This allows the creation of ``(value, key)`` pairs
using zip: ``pairs = zip(d.values(), d.keys())``. Another way to
create the same list is ``pairs = [(v, k) for (k, v) in d.items()]``.
Iterating views while adding or deleting entries in the dictionary may raise
a RuntimeError or fail to iterate over all entries.
.. describe:: x in dictview
Return ``True`` if {x} is in the underlying dictionary's keys, values or
items (in the latter case, {x} should be a ``(key, value)`` tuple).
Keys views are set-like since their entries are unique and hashable. If all
values are hashable, so that (key, value) pairs are unique and hashable, then
the items view is also set-like. (Values views are not treated as set-like
since the entries are generally not unique.) Then these set operations are
available ("other" refers either to another view or a set):
.. describe:: dictview & other
Return the intersection of the dictview and the other object as a new set.
.. describe:: dictview | other
Return the union of the dictview and the other object as a new set.
.. describe:: dictview - other
Return the difference between the dictview and the other object (all elements
in {dictview} that aren't in {other}) as a new set.
.. describe:: dictview ^ other
Return the symmetric difference (all elements either in {dictview} or
{other}, but not in both) of the dictview and the other object as a new set.
An example of dictionary view usage:: >
>>> dishes = {'eggs': 2, 'sausage': 1, 'bacon': 1, 'spam': 500}
>>> keys = dishes.viewkeys()
>>> values = dishes.viewvalues()
>>> # iteration
>>> n = 0
>>> for val in values:
...     n += val
>>> print(n)
504
>>> # keys and values are iterated over in the same order
>>> list(keys)
['eggs', 'bacon', 'sausage', 'spam']
>>> list(values)
[2, 1, 1, 500]
>>> # view objects are dynamic and reflect dict changes
>>> del dishes['eggs']
>>> del dishes['sausage']
>>> list(keys)
['bacon', 'spam']
>>> # set operations
>>> keys & {'eggs', 'bacon', 'salad'}
set(['bacon'])
<
File Objects
============
File objects are implemented using C's ``stdio`` package and can be
created with the built-in open function. File
objects are also returned by some other built-in functions and methods,
such as os.popen and os.fdopen and the makefile
method of socket objects. Temporary files can be created using the
tempfile (|py2stdlib-tempfile|) module, and high-level file operations such as copying,
moving, and deleting files and directories can be achieved with the
shutil (|py2stdlib-shutil|) module.
When a file operation fails for an I/O-related reason, the exception
IOError is raised. This includes situations where the operation is not
defined for some reason, like seek on a tty device or writing a file
opened for reading.
Files have the following methods:
file.close()~
Close the file. A closed file cannot be read or written any more. Any operation
which requires that the file be open will raise a ValueError after the
file has been closed. Calling close more than once is allowed.
As of Python 2.5, you can avoid having to call this method explicitly if you use
the with statement. For example, the following code will
automatically close {f} when the with block is exited:: >
from __future__ import with_statement # This isn't required in Python 2.6
with open("hello.txt") as f:
    for line in f:
        print line
<
In older versions of Python, you would have needed to do this to get the same
effect:: >
f = open("hello.txt")
try:
    for line in f:
        print line
finally:
    f.close()
<
.. note::
Not all "file-like" types in Python support use as a context manager for the
with statement. If your code is intended to work with any file-like
object, you can use the function contextlib.closing instead of using
the object directly.
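A minimal sketch of contextlib.closing with a file-like object that has a close method but no context manager support. The ``Resource`` class is hypothetical and stands in for any such object.

```python
from contextlib import closing


class Resource(object):
    """Hypothetical file-like object: close() but no __enter__/__exit__."""

    def __init__(self):
        self.closed = False

    def read(self):
        return 'data'

    def close(self):
        self.closed = True


res = Resource()
with closing(res) as f:
    data = f.read()
# closing() called res.close() when the block was exited.
```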
file.flush()~
Flush the internal buffer, like ``stdio``'s fflush. This may be a
no-op on some file-like objects.
.. note:: >
flush does not necessarily write the file's data to disk. Use
flush followed by os.fsync to ensure this behavior.
<
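The flush-then-fsync sequence mentioned in the note can be sketched as below. The temporary file and its contents are illustrative only.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
f = os.fdopen(fd, 'w')
try:
    f.write('important data\n')
    f.flush()               # hand buffered data to the operating system
    os.fsync(f.fileno())    # ask the OS to write its buffers to disk
finally:
    f.close()

size = os.path.getsize(path)
os.remove(path)
```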
file.fileno()~
Return the integer "file descriptor" that is used by the underlying
implementation to request I/O operations from the operating system. This can be
useful for other, lower level interfaces that use file descriptors, such as the
fcntl (|py2stdlib-fcntl|) module or os.read and friends.
.. note:: >
File-like objects which do not have a real file descriptor should {not} provide
this method!
<
file.isatty()~
Return ``True`` if the file is connected to a tty(-like) device, else ``False``.
.. note:: >
If a file-like object is not associated with a real file, this method should
{not} be implemented.
<
file.next()~
A file object is its own iterator, for example ``iter(f)`` returns {f} (unless
{f} is closed). When a file is used as an iterator, typically in a
for loop (for example, ``for line in f: print line``), the
.next method is called repeatedly. This method returns the next input
line, or raises StopIteration when EOF is hit when the file is open for
reading (behavior is undefined when the file is open for writing). In order to
make a for loop the most efficient way of looping over the lines of a
file (a very common operation), the next method uses a hidden read-ahead
buffer. As a consequence of using a read-ahead buffer, combining next
with other file methods (like readline) does not work right. However,
using seek to reposition the file to an absolute position will flush the
read-ahead buffer.
.. versionadded:: 2.3
file.read([size])~
Read at most {size} bytes from the file (less if the read hits EOF before
obtaining {size} bytes). If the {size} argument is negative or omitted, read
all data until EOF is reached. The bytes are returned as a string object. An
empty string is returned when EOF is encountered immediately. (For certain
files, like ttys, it makes sense to continue reading after an EOF is hit.) Note
that this method may call the underlying C function fread more than
once in an effort to acquire as close to {size} bytes as possible. Also note
that when in non-blocking mode, less data than was requested may be
returned, even if no {size} parameter was given.
.. note::
This function is simply a wrapper for the underlying
fread C function, and will behave the same in corner cases,
such as whether the EOF value is cached.
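Reading with an explicit {size} is typically done in a loop that stops on the empty string. The sketch below assumes a regular file on disk (created here as a temporary file) and an arbitrary chunk size of 4096 bytes.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as out:
    out.write(b'x' * 10000)

chunks = []
with open(path, 'rb') as f:
    while True:
        chunk = f.read(4096)      # at most 4096 bytes per call
        if not chunk:             # empty result means EOF
            break
        chunks.append(chunk)
os.remove(path)

sizes = [len(c) for c in chunks]
```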
file.readline([size])~
Read one entire line from the file. A trailing newline character is kept in the
string (but may be absent when a file ends with an incomplete line). [#]_ If
the {size} argument is present and non-negative, it is a maximum byte count
(including the trailing newline) and an incomplete line may be returned. An
empty string is returned {only} when EOF is encountered immediately.
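Because the trailing newline is kept, an empty string from readline is an unambiguous EOF marker. The sketch below, using an illustrative temporary file, shows that a blank line comes back as ``'\n'`` and does not end the loop.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'w') as out:
    out.write('first\n\nlast without newline')

lines = []
with open(path) as f:
    while True:
        line = f.readline()
        if not line:          # '' only at EOF; a blank line is '\n'
            break
        lines.append(line)
os.remove(path)
```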
.. note:: >
Unlike ``stdio``'s fgets, the returned string contains null characters
(``'\0'``) if they occurred in the input.
<
file.readlines([sizehint])~
Read until EOF using readline and return a list containing the lines
thus read. If the optional {sizehint} argument is present, instead of
reading up to EOF, whole lines totalling approximately {sizehint} bytes
(possibly after rounding up to an internal buffer size) are read. Objects
implementing a file-like interface may choose to ignore {sizehint} if it
cannot be implemented, or cannot be implemented efficiently.
file.xreadlines()~
This method returns the same thing as ``iter(f)``.
.. versionadded:: 2.1
.. deprecated:: 2.3
Use ``for line in file`` instead.
file.seek(offset[, whence])~
Set the file's current position, like ``stdio``'s fseek. The {whence}
argument is optional and defaults to ``os.SEEK_SET`` or ``0`` (absolute file
positioning); other values are ``os.SEEK_CUR`` or ``1`` (seek relative to the
current position) and ``os.SEEK_END`` or ``2`` (seek relative to the file's
end). There is no return value.
For example, ``f.seek(2, os.SEEK_CUR)`` advances the position by two and
``f.seek(-3, os.SEEK_END)`` sets the position to the third to last.
Note that if the file is opened for appending
(mode ``'a'`` or ``'a+'``), any seek operations will be undone at the
next write. If the file is only opened for writing in append mode (mode
``'a'``), this method is essentially a no-op, but it remains useful for files
opened in append mode with reading enabled (mode ``'a+'``). If the file is
opened in text mode (without ``'b'``), only offsets returned by tell are
legal. Use of other offsets causes undefined behavior.
Note that not all file objects are seekable.
.. versionchanged:: 2.6
Passing float values as offset has been deprecated.
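The {whence} values can be sketched on a small binary file. The file contents are illustrative, and binary mode is used so that arbitrary offsets are legal.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as out:
    out.write(b'0123456789')

with open(path, 'rb') as f:
    first = f.read(2)            # position is now 2
    f.seek(2, os.SEEK_CUR)       # advance by two: position 4
    middle = f.read(1)
    f.seek(-3, os.SEEK_END)      # third byte from the end: position 7
    tail = f.read()
    f.seek(0)                    # absolute positioning (os.SEEK_SET)
    start = f.tell()
os.remove(path)
```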
file.tell()~
Return the file's current position, like ``stdio``'s ftell.
.. note:: >
On Windows, tell can return illegal values (after an fgets)
when reading files with Unix-style line-endings. Use binary mode (``'rb'``) to
circumvent this problem.
<
file.truncate([size])~
Truncate the file's size. If the optional {size} argument is present, the file
is truncated to (at most) that size. The size defaults to the current position.
The current file position is not changed. Note that if a specified size exceeds
the file's current size, the result is platform-dependent: possibilities
include that the file may remain unchanged, increase to the specified size as if
zero-filled, or increase to the specified size with undefined new content.
Availability: Windows, many Unix variants.
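A minimal sketch of truncating to a smaller size, using an illustrative temporary file. It only covers shrinking, since growing a file this way is platform-dependent as described above.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as out:
    out.write(b'0123456789')

with open(path, 'r+b') as f:
    f.seek(2)
    f.truncate(4)             # keep only the first four bytes
    position = f.tell()       # the file position is not changed
    f.seek(0)
    remaining = f.read()
os.remove(path)
```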
file.write(str)~
Write a string to the file. There is no return value. Due to buffering, the
string may not actually show up in the file until the flush or
close method is called.
file.writelines(sequence)~
Write a sequence of strings to the file. The sequence can be any iterable
object producing strings, typically a list of strings. There is no return value.
(The name is intended to match readlines; writelines does not
add line separators.)
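Since writelines does not add line separators, the caller supplies them. The sketch below writes the same illustrative list twice, once as-is and once with newlines appended.

```python
import os
import tempfile

lines = ['alpha', 'beta', 'gamma']
fd, path = tempfile.mkstemp()

# Written as-is, the strings run together.
with os.fdopen(fd, 'w') as f:
    f.writelines(lines)
with open(path) as f:
    joined = f.read()

# Any iterable of strings works, so separators can be added on the fly.
with open(path, 'w') as f:
    f.writelines(line + '\n' for line in lines)
with open(path) as f:
    separated = f.readlines()
os.remove(path)
```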
Files support the iterator protocol. Each iteration returns the same result as
``file.readline()``, and iteration ends when the readline method returns
an empty string.
File objects also offer a number of other interesting attributes. These are not
required for file-like objects, but should be implemented if they make sense for
the particular object.
file.closed~
bool indicating the current state of the file object. This is a read-only
attribute; the close method changes the value. It may not be available
on all file-like objects.
file.encoding~
The encoding that this file uses. When Unicode strings are written to a file,
they will be converted to byte strings using this encoding. In addition, when
the file is connected to a terminal, the attribute gives the encoding that the
terminal is likely to use (that information might be incorrect if the user has
misconfigured the terminal). The attribute is read-only and may not be present
on all file-like objects. It may also be ``None``, in which case the file uses
the system default encoding for converting Unicode strings.
.. versionadded:: 2.3
file.errors~
The Unicode error handler used along with the encoding.
.. versionadded:: 2.6
file.mode~
The I/O mode for the file. If the file was created using the open
built-in function, this will be the value of the {mode} parameter. This is a
read-only attribute and may not be present on all file-like objects.
file.name~
If the file object was created using open, the name of the file.
Otherwise, some string that indicates the source of the file object, of the
form ``<...>``. This is a read-only attribute and may not be present on all
file-like objects.
file.newlines~
If Python was built with the --with-universal-newlines option to
configure (the default) this read-only attribute exists, and for
files opened in universal newline read mode it keeps track of the types of
newlines encountered while reading the file. The values it can take are
``'\r'``, ``'\n'``, ``'\r\n'``, ``None`` (unknown, no newlines read yet) or a
tuple containing all the newline types seen, to indicate that multiple newline
conventions were encountered. For files not opened in universal newline read
mode the value of this attribute will be ``None``.
file.softspace~
Boolean that indicates whether a space character needs to be printed before
another value when using the print statement. Classes that are trying
to simulate a file object should also have a writable softspace
attribute, which should be initialized to zero. This will be automatic for most
classes implemented in Python (care may be needed for objects that override
attribute access); types implemented in C will have to provide a writable
softspace attribute.
.. note:: >
This attribute is not used to control the print statement, but to
allow the implementation of print to keep track of its internal
state.
<
memoryview type
===============
memoryview objects allow Python code to access the internal data
of an object that supports the buffer protocol without copying. Memory
is generally interpreted as simple bytes.
memoryview(obj)~
Create a memoryview that references {obj}. {obj} must support the
buffer protocol. Builtin objects that support the buffer protocol include
str and bytearray (but not unicode).
``len(view)`` returns the total number of bytes in the memoryview, {view}.
A memoryview supports slicing to expose its data. Taking a single
index will return a single byte. Full slicing will result in a subview:: >
>>> v = memoryview('abcefg')
>>> v[1]
'b'
>>> v[-1]
'g'
>>> v[1:4]
<memory at 0x77ab28>
>>> v[1:4].tobytes()
'bce'
>>> v[3:-1]
<memory at 0x744f18>
>>> v[4:-1].tobytes()
'f'
<
If the object the memory view is over supports changing its data, the
memoryview supports slice assignment:: >
>>> data = bytearray('abcefg')
>>> v = memoryview(data)
>>> v.readonly
False
>>> v[0] = 'z'
>>> data
bytearray(b'zbcefg')
>>> v[1:4] = '123'
>>> data
bytearray(b'z123fg')
>>> v[2] = 'spam'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: cannot modify size of memoryview object
<
Notice how the size of the memoryview object cannot be changed.
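The sketch below shows that a memoryview shares memory with its underlying object rather than copying it. It uses bytes literals and slice assignment so that it runs on both Python 2.7 and Python 3; the variable names are illustrative.

```python
data = bytearray(b'abcefg')
view = memoryview(data)

# Same-length slice assignment writes through to the bytearray.
view[1:4] = b'123'
after_write = bytes(data)

# A subview sees later changes made through the original object.
sub = view[2:5]
data[2] = ord('X')
sub_bytes = sub.tobytes()

# Assigning a slice of a different length is rejected.
try:
    view[1:4] = b'toolong'
    resized = True
except ValueError:
    resized = False
```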
memoryview has two methods:
tobytes()~
Return the data in the buffer as a bytestring (an object of class
str).
tolist()~
Return the data in the buffer as a list of integers. :: >
>>> memoryview(b'abc').tolist()
[97, 98, 99]
<
There are also several readonly attributes available:
format~
A string containing the format (in struct (|py2stdlib-struct|) module style) for each
element in the view. This defaults to ``'B'``, a simple bytestring.
itemsize~
The size in bytes of each element of the memoryview.
shape~
A tuple of integers the length of ndim giving the shape of the
memory as an N-dimensional array.
ndim~
An integer indicating how many dimensions of a multi-dimensional array the
memory represents.
strides~
A tuple of integers the length of ndim giving the size in bytes to
access each element for each dimension of the array.
Context Manager Types
=====================
.. versionadded:: 2.5
Python's with statement supports the concept of a runtime context
defined by a context manager. This is implemented using two separate methods
that allow user-defined classes to define a runtime context that is entered
before the statement body is executed and exited when the statement ends.
The context management protocol consists of a pair of methods that need
to be provided for a context manager object to define a runtime context:
contextmanager.__enter__()~
Enter the runtime context and return either this object or another object
related to the runtime context. The value returned by this method is bound to
the identifier in the as clause of with statements using
this context manager.
An example of a context manager that returns itself is a file object. File
objects return themselves from __enter__() to allow open to be used as
the context expression in a with statement.
An example of a context manager that returns a related object is the one
returned by decimal.localcontext. These managers set the active
decimal context to a copy of the original decimal context and then return the
copy. This allows changes to be made to the current decimal context in the body
of the with statement without affecting code outside the
with statement.
contextmanager.__exit__(exc_type, exc_val, exc_tb)~
Exit the runtime context and return a Boolean flag indicating if any exception
that occurred should be suppressed. If an exception occurred while executing the
body of the with statement, the arguments contain the exception type,
value and traceback information. Otherwise, all three arguments are ``None``.
Returning a true value from this method will cause the with statement
to suppress the exception and continue execution with the statement immediately
following the with statement. Otherwise the exception continues
propagating after this method has finished executing. Exceptions that occur
during execution of this method will replace any exception that occurred in the
body of the with statement.
The exception passed in should never be reraised explicitly - instead, this
method should return a false value to indicate that the method completed
successfully and does not want to suppress the raised exception. This allows
context management code (such as ``contextlib.nested``) to easily detect whether
or not an __exit__ method has actually failed.
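The protocol described above can be sketched with a small class. ``Suppress`` is a hypothetical name used for illustration; it returns itself from __enter__ and returns a true value from __exit__ only for the exception type it was asked to swallow.

```python
class Suppress(object):
    """Hypothetical context manager that swallows one exception type."""

    def __init__(self, exc_type):
        self.exc_type = exc_type
        self.caught = None

    def __enter__(self):
        return self                     # bound to the "as" target

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type is not None and issubclass(exc_type, self.exc_type):
            self.caught = exc_val
            return True                 # suppress the exception
        return False                    # let anything else propagate


with Suppress(KeyError) as guard:
    {}['missing']
reached = True                          # execution continues after the block

propagated = False
try:
    with Suppress(KeyError):
        raise ValueError('not suppressed')
except ValueError:
    propagated = True
```

Note that __exit__ never re-raises the exception it was passed; returning a false value is enough for it to keep propagating.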
Python defines several context managers to support easy thread synchronisation,
prompt closure of files or other objects, and simpler manipulation of the active
decimal arithmetic context. The specific types are not treated specially beyond
their implementation of the context management protocol. See the
contextlib (|py2stdlib-contextlib|) module for some examples.
Python's generators and the ``contextlib.contextmanager`` decorator
provide a convenient way to implement these protocols. If a generator function is
decorated with the ``contextlib.contextmanager`` decorator, it will return a
context manager implementing the necessary __enter__ and
__exit__ methods, rather than the iterator produced by an undecorated
generator function.
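A minimal sketch of the generator-based approach. The ``tracked`` function and the ``events`` list are illustrative names; the code before ``yield`` plays the role of __enter__ and the ``finally`` clause plays the role of __exit__.

```python
from contextlib import contextmanager

events = []


@contextmanager
def tracked(name):
    events.append('enter ' + name)
    try:
        yield name.upper()              # value bound by the "as" clause
    finally:
        events.append('exit ' + name)   # runs even if the body raises


with tracked('job') as value:
    events.append('body ' + value)
```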
Note that there is no specific slot for any of these methods in the type
structure for Python objects in the Python/C API. Extension types wanting to
define these methods must provide them as a normal Python accessible method.
Compared to the overhead of setting up the runtime context, the overhead of a
single class dictionary lookup is negligible.
Other Built-in Types
====================
The interpreter supports several other kinds of objects. Most of these support
only one or two operations.
Modules
-------
The only special operation on a module is attribute access: ``m.name``, where
{m} is a module and {name} accesses a name defined in {m}'s symbol table.
Module attributes can be assigned to. (Note that the import
statement is not, strictly speaking, an operation on a module object; ``import
foo`` does not require a module object named {foo} to exist, rather it requires
an (external) {definition} for a module named {foo} somewhere.)
A special member of every module is __dict__. This is the dictionary
containing the module's symbol table. Modifying this dictionary will actually
change the module's symbol table, but direct assignment to the __dict__
attribute is not possible (you can write ``m.__dict__['a'] = 1``, which defines
``m.a`` to be ``1``, but you can't write ``m.__dict__ = {}``). Modifying
__dict__ directly is not recommended.
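The relationship between a module and its __dict__ can be sketched with a throwaway module object. The module name ``demo`` is illustrative. The exception raised on rebinding __dict__ differs between Python versions, so the sketch catches both TypeError and AttributeError.

```python
import types

m = types.ModuleType('demo')

# Writing into the dictionary defines a module attribute, and vice versa.
m.__dict__['a'] = 1
m.b = 2
a_value = m.a
b_value = m.__dict__['b']

# Rebinding the __dict__ attribute itself is not allowed.
try:
    m.__dict__ = {}
    replaced = True
except (TypeError, AttributeError):
    replaced = False
```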
Modules built into the interpreter are written like this: ``<module 'sys'
(built-in)>``. If loaded from a file, they are written as ``<module 'os' from
'/usr/local/lib/pythonX.Y/os.pyc'>``.
Classes and Class Instances
---------------------------
See the language reference sections on objects, values and types and on
class definitions for these.
Functions
---------
Function objects are created by function definitions. The only operation on a
function object is to call it: ``func(argument-list)``.
There are really two flavors of function objects: built-in functions and
user-defined functions. Both support the same operation (to call the function),
but the implementation is different, hence the different object types.
See the language reference section on function definitions for more
information.
Methods
-------
Methods are functions that are called using the attribute notation. There are
two flavors: built-in methods (such as append on lists) and class
instance methods. Built-in methods are described with the types that support
them.
The implementation adds two special read-only attributes to class instance
methods: ``m.im_self`` is the object on which the method operates, and
``m.im_func`` is the function implementing the method. Calling ``m(arg-1,
arg-2, ..., arg-n)`` is completely equivalent to calling ``m.im_func(m.im_self,
arg-1, arg-2, ..., arg-n)``.
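The equivalence above can be checked on a bound method. This sketch uses ``__self__`` and ``__func__``, which Python 2.6 and later provide as aliases for ``im_self`` and ``im_func``, so that it also runs on Python 3; the ``Greeter`` class is illustrative.

```python
class Greeter(object):
    def greet(self, name):
        return 'hello ' + name


g = Greeter()
m = g.greet                       # bound method

# The method remembers the instance and the underlying function.
bound_to = m.__self__
direct = m('world')
via_func = m.__func__(m.__self__, 'world')
```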
Class instance methods are either {bound} or {unbound}, referring to whether the
method was accessed through an instance or a class, respectively. When a method
is unbound, its ``im_self`` attribute will be ``None`` and if called, an
explicit ``self`` object must be passed as the first argument. In this case,
``self`` must be an instance of the unbound method's class (or a subclass of
that class), otherwise a TypeError is raised.
Like function objects, method objects support getting arbitrary attributes.
However, since method attributes are actually stored on the underlying function
object (``meth.im_func``), setting method attributes on either bound or unbound
methods is disallowed. Attempting to set an attribute on a method results in
an AttributeError being raised. In order to set a method attribute, you need to
explicitly set it on the underlying function object:: >
class C:
    def method(self):
        pass
c = C()
c.method.im_func.whoami = 'my name is c'
<
See types (|py2stdlib-types|) for more information.
Code Objects
------------
Code objects are used by the implementation to represent "pseudo-compiled"
executable Python code such as a function body. They differ from function
objects because they don't contain a reference to their global execution
environment. Code objects are returned by the built-in compile function
and can be extracted from function objects through their func_code
attribute. See also the code (|py2stdlib-code|) module.
A code object can be executed or evaluated by passing it (instead of a source
string) to the exec statement or the built-in eval function.
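A minimal sketch of creating and evaluating code objects. It uses the ``__code__`` attribute, which Python 2.6 and later provide as an alias for ``func_code``, so that it also runs on Python 3; the names are illustrative.

```python
# compile() returns a code object; it carries no globals of its own.
expr = compile('a + b', '<example>', 'eval')

# The execution environment is supplied when the code is evaluated.
result = eval(expr, {'a': 1, 'b': 2})
other = eval(expr, {'a': 'x', 'b': 'y'})


def double(x):
    return x * 2


# A function body is represented by the same kind of object.
body = double.__code__
same_type = isinstance(body, type(expr))
```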
See types (|py2stdlib-types|) for more information.
Type Objects
------------
Type objects represent the various object types. An object's type is accessed
by the built-in function type. There are no special operations on
types. The standard module types (|py2stdlib-types|) defines names for all standard built-in
types.
Types are written like this: ``<type 'int'>``.
The Null Object
---------------
This object is returned by functions that don't explicitly return a value. It
supports no special operations. There is exactly one null object, named
``None`` (a built-in name).
It is written as ``None``.
The Ellipsis Object
-------------------
This object is used by extended slice notation (see slicings). It
supports no special operations. There is exactly one ellipsis object, named
Ellipsis (a built-in name).
It is written as ``Ellipsis``.
Boolean Values
--------------
Boolean values are the two constant objects ``False`` and ``True``. They are
used to represent truth values (although other values can also be considered
false or true). In numeric contexts (for example when used as the argument to
an arithmetic operator), they behave like the integers 0 and 1, respectively.
The built-in function bool can be used to cast any value to a Boolean,
if the value can be interpreted as a truth value (see section Truth Value
Testing above).
They are written as ``False`` and ``True``, respectively.
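The numeric behavior of Boolean values and the use of bool can be sketched briefly; the sample values are illustrative.

```python
# In arithmetic, True and False behave like 1 and 0.
total = True + True
count = sum([True, False, True])

# bool() converts any value using the usual truth-testing rules.
empty_list = bool([])
nonempty_string = bool('x')
zero = bool(0.0)
```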
Internal Objects
----------------
See types (|py2stdlib-types|) for this information. It describes stack frame objects,
traceback objects, and slice objects.
Special Attributes
==================
The implementation adds a few special read-only attributes to several object
types, where they are relevant. Some of these are not reported by the
dir built-in function.
object.__dict__~
A dictionary or other mapping object used to store an object's (writable)
attributes.
object.__methods__~
.. deprecated:: 2.2
Use the built-in function dir to get a list of an object's attributes.
This attribute is no longer available.
object.__members__~
.. deprecated:: 2.2
Use the built-in function dir to get a list of an object's attributes.
This attribute is no longer available.
instance.__class__~
The class to which a class instance belongs.
class.__bases__~
The tuple of base classes of a class object.
class.__name__~
The name of the class or type.
The following attributes are only supported by new-style classes.
class.__mro__~
This attribute is a tuple of classes that are considered when looking for
base classes during method resolution.
class.mro()~
This method can be overridden by a metaclass to customize the method
resolution order for its instances. It is called at class instantiation, and
its result is stored in __mro__.
class.__subclasses__~
Each new-style class keeps a list of weak references to its immediate
subclasses. This method returns a list of all those references still alive.
Example:: >
>>> int.__subclasses__()
[<type 'bool'>]
<
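As a small sketch of how these attributes relate, the following builds a diamond-shaped hierarchy and inspects it. The class names are invented for the example.

```python
class Base(object):
    pass

class Left(Base):
    pass

class Right(Base):
    pass

class Bottom(Left, Right):
    pass

# __mro__ is the tuple consulted during method resolution.
mro_names = [cls.__name__ for cls in Bottom.__mro__]

# mro() computes the same order; its result is what gets stored in __mro__.
same_order = Bottom.mro() == list(Bottom.__mro__)

# __subclasses__() lists only the immediate subclasses that are still alive.
direct = sorted(cls.__name__ for cls in Base.__subclasses__())

print(mro_names, same_order, direct)
```

``Bottom`` does not appear in ``Base.__subclasses__()`` because it is not an immediate subclass of ``Base``.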
.. rubric:: Footnotes
.. [#] Additional information on these special methods may be found in the Python
Reference Manual (customization).
.. [#] As a consequence, the list ``[1, 2]`` is considered equal to ``[1.0, 2.0]``, and
similarly for tuples.
.. [#] They must have since the parser can't tell the type of the operands.
.. [#] To format only a tuple you should therefore provide a singleton tuple whose only
element is the tuple to be formatted.
.. [#] The advantage of leaving the newline on is that returning an empty string is
then an unambiguous EOF indication. It is also possible (in cases where it
might matter, for example, if you want to make an exact copy of a file while
scanning its lines) to tell whether the last line of a file ended in a newline
or not (yes this happens!).
*py2stdlib-builtin:Exceptions*
Exceptions~
Built-in Exceptions
===================
==============================================================================
*py2stdlib-__future__*
__future__~
:synopsis: Future statement definitions
__future__ (|py2stdlib-__future__|) is a real module, and serves three purposes:
* To avoid confusing existing tools that analyze import statements and expect to
find the modules they're importing.
* To ensure that future statements run under releases prior to
2.1 at least yield runtime exceptions (the import of __future__ (|py2stdlib-__future__|) will
fail, because there was no module of that name prior to 2.1).
* To document when incompatible changes were introduced, and when they will be
--- or were --- made mandatory. This is a form of executable documentation, and
can be inspected programmatically via importing __future__ (|py2stdlib-__future__|) and examining
its contents.
Each statement in __future__.py is of the form:: >
FeatureName = _Feature(OptionalRelease, MandatoryRelease,
CompilerFlag)
<
where, normally, {OptionalRelease} is less than {MandatoryRelease}, and both are
5-tuples of the same form as ``sys.version_info``:: >
(PY_MAJOR_VERSION, # the 2 in 2.1.0a3; an int
PY_MINOR_VERSION, # the 1; an int
PY_MICRO_VERSION, # the 0; an int
PY_RELEASE_LEVEL, # "alpha", "beta", "candidate" or "final"; string
PY_RELEASE_SERIAL # the 3; an int
)
<
{OptionalRelease} records the first release in which the feature was accepted.
In the case of a {MandatoryRelease} that has not yet occurred,
{MandatoryRelease} predicts the release in which the feature will become part of
the language.
Otherwise, {MandatoryRelease} records when the feature became part of the language; in
releases at or after that, modules no longer need a future statement to use the
feature in question, but may continue to use such imports.
{MandatoryRelease} may also be ``None``, meaning that a planned feature got
dropped.
Instances of class _Feature have two corresponding methods,
getOptionalRelease and getMandatoryRelease.
{CompilerFlag} is the (bitfield) flag that should be passed in the fourth
argument to the built-in function compile to enable the feature in
dynamically compiled code. This flag is stored in the compiler_flag
attribute on _Feature instances.
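A minimal sketch of inspecting a feature record and passing its compiler flag to compile is shown below. It uses the ``division`` feature; the source string and file name are made up for the example.

```python
import __future__

feature = __future__.division

# Both methods return 5-tuples shaped like sys.version_info.
optional = feature.getOptionalRelease()
mandatory = feature.getMandatoryRelease()
print("division: optional in %r, mandatory in %r" % (optional, mandatory))

# Passing compiler_flag as the fourth argument of compile() enables the
# feature for dynamically compiled code, as if the source had started with
# "from __future__ import division".
code = compile("1 / 2", "<example>", "eval", feature.compiler_flag)
quotient = eval(code)
print(quotient)
```

With the flag set, ``1 / 2`` is evaluated as true division even on an interpreter where classic division is the default.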
No feature description will ever be deleted from __future__ (|py2stdlib-__future__|). Since its
introduction in Python 2.1 the following features have found their way into the
language using this mechanism:
+------------------+-------------+--------------+---------------------------------------------+
| feature          | optional in | mandatory in | effect                                      |
+==================+=============+==============+=============================================+
| nested_scopes    | 2.1.0b1     | 2.2          | PEP 227:                                    |
|                  |             |              | {Statically Nested Scopes}                  |
+------------------+-------------+--------------+---------------------------------------------+
| generators       | 2.2.0a1     | 2.3          | PEP 255:                                    |
|                  |             |              | {Simple Generators}                         |
+------------------+-------------+--------------+---------------------------------------------+
| division         | 2.2.0a2     | 3.0          | PEP 238:                                    |
|                  |             |              | {Changing the Division Operator}            |
+------------------+-------------+--------------+---------------------------------------------+
| absolute_import  | 2.5.0a1     | 3.0          | PEP 328:                                    |
|                  |             |              | {Imports: Multi-Line and Absolute/Relative} |
+------------------+-------------+--------------+---------------------------------------------+
| with_statement   | 2.5.0a1     | 2.6          | PEP 343:                                    |
|                  |             |              | {The "with" Statement}                      |
+------------------+-------------+--------------+---------------------------------------------+
| print_function   | 2.6.0a2     | 3.0          | PEP 3105:                                   |
|                  |             |              | {Make print a function}                     |
+------------------+-------------+--------------+---------------------------------------------+
| unicode_literals | 2.6.0a2     | 3.0          | PEP 3112:                                   |
|                  |             |              | {Bytes literals in Python 3000}             |
+------------------+-------------+--------------+---------------------------------------------+
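The same information can be read from the module itself. This sketch relies on the ``all_feature_names`` list that the module defines; the ``releases`` dictionary is just a local name for the example.

```python
import __future__

releases = {}
for name in __future__.all_feature_names:
    feature = getattr(__future__, name)
    # getMandatoryRelease() may return None for a feature that was dropped.
    releases[name] = (feature.getOptionalRelease(),
                      feature.getMandatoryRelease())

for name in sorted(releases):
    optional, mandatory = releases[name]
    print("%-18s optional=%s mandatory=%s" % (name, optional, mandatory))
```

Newer interpreters list more features than the table shows, so the output depends on the Python version in use.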
.. seealso::
future
How the compiler treats future imports.
==============================================================================
*py2stdlib-__main__*
__main__~
:synopsis: The environment where the top-level script is run.
This module represents the (otherwise anonymous) scope in which the
interpreter's main program executes --- commands read either from standard
input, from a script file, or from an interactive prompt. It is this
environment in which the idiomatic "conditional script" stanza causes a script
to run:: >
    if __name__ == "__main__":
        main()
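A slightly fuller sketch of the stanza follows. The ``main`` function and its argument handling are hypothetical; the point is that the guarded call runs only when the file is executed as the top-level script, not when it is imported.

```python
import sys


def main(argv=None):
    """Report the command-line arguments and return how many there were."""
    if argv is None:
        argv = sys.argv[1:]
    print("arguments: %r" % (argv,))
    return len(argv)


if __name__ == "__main__":
    # Reached only when this file is the top-level script.
    main()
```

Keeping the work inside ``main`` lets other modules import the file and call the function directly.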
==============================================================================
*py2stdlib-_winreg*
_winreg~
:platform: Windows
:synopsis: Routines and objects for manipulating the Windows registry.
.. note::
The _winreg (|py2stdlib-_winreg|) module has been renamed to winreg in Python 3.0.
The 2to3 tool will automatically adapt imports when converting your
sources to 3.0.
.. versionadded:: 2.0
These functions expose the Windows registry API to Python. Instead of using an
integer as the registry handle, a handle object is used to ensure that the
handles are closed correctly, even if the programmer neglects to explicitly
close them.
This module offers the following functions:
CloseKey(hkey)~
Closes a previously opened registry key. The {hkey} argument specifies a
previously opened key.
.. note::
If {hkey} is not closed using this method (or via hkey.Close()),
it is closed when the {hkey} object is destroyed by Python.
ConnectRegistry(computer_name, key)~
Establishes a connection to a predefined registry handle on another computer,
and returns a handle object.
{computer_name} is the name of the remote computer, of the form
``r"\\computername"``. If ``None``, the local computer is used.
{key} is the predefined handle to connect to.
The return value is the handle of the opened key. If the function fails, a
WindowsError exception is raised.
CreateKey(key, sub_key)~
Creates or opens the specified key, returning a handle object.
{key} is an already open key, or one of the predefined HKEY_* constants.
{sub_key} is a string that names the key this method opens or creates.
If {key} is one of the predefined keys, {sub_key} may be ``None``. In that
case, the handle returned is the same key handle passed in to the function.
If the key already exists, this function opens the existing key.
The return value is the handle of the opened key. If the function fails, a
WindowsError exception is raised.
CreateKeyEx(key, sub_key[, res[, sam]])~
Creates or opens the specified key, returning a handle object.
{key} is an already open key, or one of the predefined HKEY_* constants.
{sub_key} is a string that names the key this method opens or creates.
{res} is a reserved integer, and must be zero. The default is zero.
{sam} is an integer that specifies an access mask that describes the desired
security access for the key. Default is KEY_ALL_ACCESS. See Access Rights for
other allowed values.
If {key} is one of the predefined keys, {sub_key} may be ``None``. In that
case, the handle returned is the same key handle passed in to the function.
If the key already exists, this function opens the existing key.
The return value is the handle of the opened key. If the function fails, a
WindowsError exception is raised.
.. versionadded:: 2.7
DeleteKey(key, sub_key)~
Deletes the specified key.
{key} is an already open key, or any one of the predefined HKEY_* constants.
{sub_key} is a string that must be a subkey of the key identified by the {key}
parameter. This value must not be ``None``, and the key may not have subkeys.
{This method cannot delete keys with subkeys.}
If the method succeeds, the entire key, including all of its values, is removed.
If the method fails, a WindowsError exception is raised.
DeleteKeyEx(key, sub_key[, sam[, res]])~
Deletes the specified key.
.. note::
The DeleteKeyEx function is implemented with the RegDeleteKeyEx
Windows API function, which is specific to 64-bit versions of Windows.
See the `RegDeleteKeyEx documentation
<http://msdn.microsoft.com/en-us/library/ms724847%28VS.85%29.aspx>`__.
{key} is an already open key, or any one of the predefined HKEY_* constants.
{sub_key} is a string that must be a subkey of the key identified by the
{key} parameter. This value must not be ``None``, and the key may not have
subkeys.
{res} is a reserved integer, and must be zero. The default is zero.
{sam} is an integer that specifies an access mask that describes the desired
security access for the key. Default is KEY_WOW64_64KEY. See Access Rights for
other allowed values.
{This method cannot delete keys with subkeys.}
If the method succeeds, the entire key, including all of its values, is
removed. If the method fails, a WindowsError exception is raised.
On unsupported Windows versions, NotImplementedError is raised.
.. versionadded:: 2.7
DeleteValue(key, value)~
Removes a named value from a registry key.
{key} is an already open key, or one of the predefined HKEY_* constants.
{value} is a string that identifies the value to remove.
EnumKey(key, index)~
Enumerates subkeys of an open registry key, returning a string.
{key} is an already open key, or any one of the predefined HKEY_* constants.
{index} is an integer that identifies the index of the key to retrieve.
The function retrieves the name of one subkey each time it is called. It is
typically called repeatedly until a WindowsError exception is raised,
indicating that no more subkeys are available.
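The call-until-error pattern can be wrapped in a generator. The sketch below is an illustration rather than part of the module: ``enumerate_registry`` and ``fake_enum_key`` are hypothetical names, and the stand-in function replaces EnumKey so the loop can be tried on any platform. On Windows, EnumKey or EnumValue would be passed together with an open key.

```python
def enumerate_registry(enum_func, key):
    """Yield enum_func(key, 0), enum_func(key, 1), ... until it fails."""
    index = 0
    while True:
        try:
            item = enum_func(key, index)
        except OSError:
            # WindowsError derives from OSError, so running past the last
            # entry ends the iteration here.
            return
        yield item
        index += 1


def fake_enum_key(key, index):
    """Stand-in for EnumKey that serves three invented subkey names."""
    names = ["Classes", "Microsoft", "Python"]
    if index >= len(names):
        raise OSError("no more data is available")
    return names[index]


subkeys = list(enumerate_registry(fake_enum_key, None))
print(subkeys)
```

Catching the exception inside the helper keeps the calling code a plain ``for`` loop.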
EnumValue(key, index)~
Enumerates values of an open registry key, returning a tuple.
{key} is an already open key, or any one of the predefined HKEY_* constants.
{index} is an integer that identifies the index of the value to retrieve.
The function retrieves one value each time it is called. It is typically
called repeatedly until a WindowsError exception is raised, indicating that no
more values are available.
The result is a tuple of 3 items:
+-------+--------------------------------------------+
| Index | Meaning                                    |
+=======+============================================+
| ``0`` | A string that identifies the value name    |
+-------+--------------------------------------------+
| ``1`` | An object that holds the value data, and   |
|       | whose type depends on the underlying       |
|       | registry type                              |
+-------+--------------------------------------------+
| ``2`` | An integer that identifies the type of the |
|       | value data (see table in docs for          |
|       | SetValueEx)                                |
+-------+--------------------------------------------+
ExpandEnvironmentStrings(unicode)~
Expands environment variable placeholders ``%NAME%`` in unicode strings like
REG_EXPAND_SZ:: >
>>> ExpandEnvironmentStrings(u"%windir%")
u"C:\\Windows"
<
.. versionadded:: 2.6
FlushKey(key)~
Writes all the attributes of a key to the registry.
{key} is an already open key, or one of the predefined HKEY_* constants.
It is not necessary to call FlushKey to change a key. Registry changes are
flushed to disk by the registry using its lazy flusher. Registry changes are
also flushed to disk at system shutdown. Unlike CloseKey, the
FlushKey method returns only when all the data has been written to the
registry. An application should only call FlushKey if it requires
absolute certainty that registry changes are on disk.
.. note::
If you don't know whether a FlushKey call is required, it probably isn't.
LoadKey(key, sub_key, file_name)~
Creates a subkey under the specified key and stores registration information
from a specified file into that subkey.
{key} is a handle returned by ConnectRegistry or one of the constants
HKEY_USERS or HKEY_LOCAL_MACHINE.
{sub_key} is a string that identifies the subkey to load.
{file_name} is the name of the file to load registry data from. This file must
have been created with the SaveKey function. Under the file allocation
table (FAT) file system, the filename may not have an extension.
A call to LoadKey fails if the calling process does not have the
SE_RESTORE_PRIVILEGE privilege. Note that privileges are different
from permissions -- see the `RegLoadKey documentation
<http://msdn.microsoft.com/en-us/library/ms724889%28v=VS.85%29.aspx>`__ for
more details.
If {key} is a handle returned by ConnectRegistry, then the path
specified in {file_name} is relative to the remote computer.
OpenKey(key, sub_key[, res[, sam]])~
Opens the specified key, returning a handle object.
{key} is an already open key, or any one of the predefined HKEY_* constants.
{sub_key} is a string that identifies the subkey to open.
{res} is a reserved integer, and must be zero. The default is zero.
{sam} is an integer that specifies an access mask that describes the desired
security access for the key. Default is KEY_READ. See Access Rights for other
allowed values.
The result is a new handle to the specified key.
If the function fails, WindowsError is raised.
OpenKeyEx()~
The functionality of OpenKeyEx is provided via OpenKey,
by the use of default arguments.
QueryInfoKey(key)~
Returns information about a key, as a tuple.
{key} is an already open key, or one of the predefined HKEY_* constants.
The result is a tuple of 3 items:
+-------+---------------------------------------------+
| Index | Meaning                                     |
+=======+=============================================+
| ``0`` | An integer giving the number of sub keys    |
|       | this key has.                               |
+-------+---------------------------------------------+
| ``1`` | An integer giving the number of values this |
|       | key has.                                    |
+-------+---------------------------------------------+
| ``2`` | A long integer giving when the key was last |
|       | modified (if available), in units of 100    |
|       | nanoseconds since Jan 1, 1601.              |
+-------+---------------------------------------------+
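The third item is a Windows FILETIME count, whose epoch is 1 January 1601 (UTC). A rough sketch of turning it into a datetime is given below. ``filetime_to_datetime`` is a hypothetical helper, not part of the module, and the sample number is the well-known FILETIME value of the Unix epoch.

```python
import datetime

# Windows FILETIME values count 100-nanosecond intervals from this instant.
FILETIME_EPOCH = datetime.datetime(1601, 1, 1)


def filetime_to_datetime(filetime):
    """Convert a count of 100 ns intervals into a naive UTC datetime."""
    # Ten 100 ns intervals make one microsecond; finer detail is dropped.
    return FILETIME_EPOCH + datetime.timedelta(microseconds=filetime // 10)


unix_epoch = filetime_to_datetime(116444736000000000)
print(unix_epoch.isoformat())
```

In real use, the argument would be ``QueryInfoKey(key)[2]``.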
QueryValue(key, sub_key)~
Retrieves the unnamed value for a key, as a string.
{key} is an already open key, or one of the predefined HKEY_* constants.
{sub_key} is a string that holds the name of the subkey with which the value is
associated. If this parameter is ``None`` or empty, the function retrieves the
value set by the SetValue method for the key identified by {key}.
Values in the registry have name, type, and data components. This method
retrieves the data for a key's first value that has a NULL name. But the
underlying API call doesn't return the type, so always use
QueryValueEx if possible.
QueryValueEx(key, value_name)~
Retrieves the type and data for a specified value name associated with
an open registry key.
{key} is an already open key, or one of the predefined HKEY_* constants.
{value_name} is a string indicating the value to query.
The result is a tuple of 2 items:
+-------+-----------------------------------------+
| Index | Meaning                                 |
+=======+=========================================+
| ``0`` | The value of the registry item.         |
+-------+-----------------------------------------+
| ``1`` | An integer giving the registry type for |
|       | this value (see table in docs for       |
|       | SetValueEx)                             |
+-------+-----------------------------------------+
SaveKey(key, file_name)~
Saves the specified key, and all its subkeys to the specified file.
{key} is an already open key, or one of the predefined HKEY_* constants.
{file_name} is the name of the file to save registry data to. This file
cannot already exist. If this filename includes an extension, it cannot be
used on file allocation table (FAT) file systems by the LoadKey
method.
If {key} represents a key on a remote computer, the path described by
{file_name} is relative to the remote computer. The caller of this method must
possess the SeBackupPrivilege security privilege. Note that
privileges are different from permissions -- see the
`Conflicts Between User Rights and Permissions documentation
<http://msdn.microsoft.com/en-us/library/ms724878%28v=VS.85%29.aspx>`__
for more details.
This function passes NULL for {security_attributes} to the API.
SetValue(key, sub_key, type, value)~
Associates a value with a specified key.
{key} is an already open key, or one of the predefined HKEY_* constants.
{sub_key} is a string that names the subkey with which the value is associated.
{type} is an integer that specifies the type of the data. Currently this must be
REG_SZ, meaning only strings are supported. Use the SetValueEx
function for support for other data types.
{value} is a string that specifies the new value.
If the key specified by the {sub_key} parameter does not exist, the SetValue
function creates it.
Value lengths are limited by available memory. Long values (more than 2048
bytes) should be stored as files with the filenames stored in the configuration
registry. This helps the registry perform efficiently.
The key identified by the {key} parameter must have been opened with
KEY_SET_VALUE access.
SetValueEx(key, value_name, reserved, type, value)~
Stores data in the value field of an open registry key.
{key} is an already open key, or one of the predefined HKEY_* constants.
{value_name} is a string that names the value to set.
{type} is an integer that specifies the type of the data. See Value Types for
the available types.
{reserved} can be anything -- zero is always passed to the API.
{value} is a string that specifies the new value.
This method can also set additional value and type information for the specified
key. The key identified by the key parameter must have been opened with
KEY_SET_VALUE access.
To open the key, use the CreateKey or OpenKey methods.
Value lengths are limited by available memory. Long values (more than 2048
bytes) should be stored as files with the filenames stored in the configuration
registry. This helps the registry perform efficiently.
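A hedged sketch of a write/read round trip is given below. The subkey path, value name and the ``store_and_read`` helper are hypothetical, and the import guard is an assumption added so that the file can at least be loaded where the module is missing (the function then returns ``None``). On Windows it creates a key under HKEY_CURRENT_USER, stores a string, reads it back and removes what it created.

```python
try:
    import _winreg as winreg      # Python 2 name
except ImportError:
    try:
        import winreg             # name used from Python 3.0 on
    except ImportError:
        winreg = None             # not running on Windows

# Hypothetical location, used only for this illustration.
SUB_KEY = r"Software\ExampleVendor\ExampleApp"


def store_and_read(name, text):
    """Store a REG_SZ value, read it back, then clean up again."""
    if winreg is None:
        return None
    key = winreg.CreateKey(winreg.HKEY_CURRENT_USER, SUB_KEY)
    try:
        # The third argument is the reserved parameter.
        winreg.SetValueEx(key, name, 0, winreg.REG_SZ, text)
        data, value_type = winreg.QueryValueEx(key, name)
        winreg.DeleteValue(key, name)
    finally:
        winreg.CloseKey(key)
    # Removes only the innermost key; the parent key is left in place.
    winreg.DeleteKey(winreg.HKEY_CURRENT_USER, SUB_KEY)
    return data, value_type


result = store_and_read("greeting", "hello")
print(result)
```

The ``try``/``finally`` block closes the handle even if one of the registry calls raises WindowsError.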
DisableReflectionKey(key)~
Disables registry reflection for 32-bit processes running on a 64-bit
operating system.
{key} is an already open key, or one of the predefined HKEY_* constants.
Will generally raise NotImplementedError if executed on a 32-bit
operating system.
If the key is not on the reflection list, the function succeeds but has no
effect. Disabling reflection for a key does not affect reflection of any
subkeys.
EnableReflectionKey(key)~
Restores registry reflection for the specified disabled key.
{key} is an already open key, or one of the predefined HKEY_* constants.
Will generally raise NotImplementedError if executed on a 32-bit
operating system.
Restoring reflection for a key does not affect reflection of any subkeys.
QueryReflectionKey(key)~
Determines the reflection state for the specified key.
{key} is an already open key, or one of the predefined HKEY_* constants.
Returns ``True`` if reflection is disabled.
Will generally raise NotImplementedError if executed on a 32-bit
operating system.
Constants
---------
The following constants are defined for use in many _winreg (|py2stdlib-_winreg|) functions.
HKEY_* Constants
++++++++++++++++
HKEY_CLASSES_ROOT~
Registry entries subordinate to this key define types (or classes) of
documents and the properties associated with those types. Shell and
COM applications use the information stored under this key.
HKEY_CURRENT_USER~
Registry entries subordinate to this key define the preferences of
the current user. These preferences include the settings of
environment variables, data about program groups, colors, printers,
network connections, and application preferences.
HKEY_LOCAL_MACHINE~
Registry entries subordinate to this key define the physical state
of the computer, including data about the bus type, system memory,
and installed hardware and software.
HKEY_USERS~
Registry entries subordinate to this key define the default user
configuration for new users on the local computer and the user
configuration for the current user.
HKEY_PERFORMANCE_DATA~
Registry entries subordinate to this key allow you to access
performance data. The data is not actually stored in the registry;
the registry functions cause the system to collect the data from
its source.
HKEY_CURRENT_CONFIG~
Contains information about the current hardware profile of the
local computer system.
HKEY_DYN_DATA~
This key is not used in versions of Windows after 98.
Access Rights
+++++++++++++
For more information, see `Registry Key Security and Access
<http://msdn.microsoft.com/en-us/library/ms724878%28v=VS.85%29.aspx>`__.
KEY_ALL_ACCESS~
Combines the STANDARD_RIGHTS_REQUIRED, KEY_QUERY_VALUE,
KEY_SET_VALUE, KEY_CREATE_SUB_KEY,
KEY_ENUMERATE_SUB_KEYS, KEY_NOTIFY,
and KEY_CREATE_LINK access rights.
KEY_WRITE~
Combines the STANDARD_RIGHTS_WRITE, KEY_SET_VALUE, and
KEY_CREATE_SUB_KEY access rights.
KEY_READ~
Combines the STANDARD_RIGHTS_READ, KEY_QUERY_VALUE,
KEY_ENUMERATE_SUB_KEYS, and KEY_NOTIFY values.
KEY_EXECUTE~
Equivalent to KEY_READ.
KEY_QUERY_VALUE~
Required to query the values of a registry key.
KEY_SET_VALUE~
Required to create, delete, or set a registry value.
KEY_CREATE_SUB_KEY~
Required to create a subkey of a registry key.
KEY_ENUMERATE_SUB_KEYS~
Required to enumerate the subkeys of a registry key.
KEY_NOTIFY~
Required to request change notifications for a registry key or for
subkeys of a registry key.
KEY_CREATE_LINK~
Reserved for system use.
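The combined rights are plain bit masks. The sketch below rebuilds them from the individual rights. The numeric values are copied from the Windows SDK headers as an assumption so that the arithmetic can be followed on any platform (real code should use the constants exported by the module), and ``grants`` is a hypothetical helper.

```python
# Values as defined in the Windows SDK (winnt.h); assumed here, not imported.
STANDARD_RIGHTS_REQUIRED = 0x000F0000
STANDARD_RIGHTS_READ = 0x00020000
STANDARD_RIGHTS_WRITE = 0x00020000

KEY_QUERY_VALUE = 0x0001
KEY_SET_VALUE = 0x0002
KEY_CREATE_SUB_KEY = 0x0004
KEY_ENUMERATE_SUB_KEYS = 0x0008
KEY_NOTIFY = 0x0010
KEY_CREATE_LINK = 0x0020

KEY_READ = (STANDARD_RIGHTS_READ | KEY_QUERY_VALUE |
            KEY_ENUMERATE_SUB_KEYS | KEY_NOTIFY)
KEY_WRITE = STANDARD_RIGHTS_WRITE | KEY_SET_VALUE | KEY_CREATE_SUB_KEY
KEY_ALL_ACCESS = (STANDARD_RIGHTS_REQUIRED | KEY_QUERY_VALUE |
                  KEY_SET_VALUE | KEY_CREATE_SUB_KEY |
                  KEY_ENUMERATE_SUB_KEYS | KEY_NOTIFY | KEY_CREATE_LINK)


def grants(mask, right):
    """Return True if every bit of right is present in mask."""
    return mask & right == right


print(hex(KEY_READ), hex(KEY_WRITE), hex(KEY_ALL_ACCESS))
print(grants(KEY_READ, KEY_SET_VALUE))
```

A handle opened with KEY_READ therefore cannot be used to set values, which is why SetValue and SetValueEx ask for KEY_SET_VALUE access.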
64-bit Specific
***************
For more information, see `Accessing an Alternate Registry View
<http://msdn.microsoft.com/en-us/library/aa384129(v=VS.85).aspx>`__.
KEY_WOW64_64KEY~
Indicates that an application on 64-bit Windows should operate on
the 64-bit registry view.
KEY_WOW64_32KEY~
Indicates that an application on 64-bit Windows should operate on
the 32-bit registry view.
Value Types
+++++++++++
For more information, see `Registry Value Types
<http://msdn.microsoft.com/en-us/library/ms724884%28v=VS.85%29.aspx>`__.
REG_BINARY~
Binary data in any form.
REG_DWORD~
32-bit number.
REG_DWORD_LITTLE_ENDIAN~
A 32-bit number in little-endian format.
REG_DWORD_BIG_ENDIAN~
A 32-bit number in big-endian format.
REG_EXPAND_SZ~
Null-terminated string containing references to environment
variables (``%PATH%``).
REG_LINK~
A Unicode symbolic link.
REG_MULTI_SZ~
A sequence of null-terminated strings, terminated by two null characters.
(Python handles this termination automatically.)
REG_NONE~
No defined value type.
REG_RESOURCE_LIST~
A device-driver resource list.
REG_FULL_RESOURCE_DESCRIPTOR~
A hardware setting.
REG_RESOURCE_REQUIREMENTS_LIST~
A hardware resource list.
REG_SZ~
A null-terminated string.
Registry Handle Objects
-----------------------
This object wraps a Windows HKEY object, automatically closing it when the
object is destroyed. To guarantee cleanup, you can call either the
PyHKEY.Close method on the object, or the CloseKey function.
All registry functions in this module return one of these objects.
All registry functions in this module which accept a handle object also accept
an integer; however, use of the handle object is encouraged.
Handle objects provide semantics for __nonzero__ -- thus:: >
    if handle:
        print "Yes"
<
will print ``Yes`` if the handle is currently valid (has not been closed or
detached).
The object also supports comparison semantics, so handle objects will compare
true if they both reference the same underlying Windows handle value.
Handle objects can be converted to an integer (e.g., using the built-in
int function), in which case the underlying Windows handle value is
returned. You can also use the PyHKEY.Detach method to return the
integer handle, and also disconnect the Windows handle from the handle object.
PyHKEY.Close()~
Closes the underlying Windows handle.
If the handle is already closed, no error is raised.
PyHKEY.Detach()~
Detaches the Windows handle from the handle object.
The result is an integer (or long on 64 bit Windows) that holds the value of the
handle before it is detached. If the handle is already detached or closed, this
will return zero.
After calling this function, the handle is effectively invalidated, but the
handle is not closed. You would call this function when you need the
underlying Win32 handle to exist beyond the lifetime of the handle object.
PyHKEY.__enter__()~
PyHKEY.__exit__(*exc_info)~
The HKEY object implements object.__enter__ and
object.__exit__ and thus supports the context protocol for the
with statement:: >
    with OpenKey(HKEY_LOCAL_MACHINE, "foo") as key:
        ...  # work with key
<
will automatically close {key} when control leaves the with block.
.. versionadded:: 2.6
==============================================================================
*py2stdlib-abc*
abc~
:synopsis: Abstract base classes according to PEP 3119.
.. versionadded:: 2.6
This module provides the infrastructure for defining abstract base classes
(ABCs) in Python, as outlined in PEP 3119; see the PEP for why this was added
to Python. (See also PEP 3141 and the numbers (|py2stdlib-numbers|) module
regarding a type hierarchy for numbers based on ABCs.)
The collections (|py2stdlib-collections|) module has some concrete classes that derive from
ABCs; these can, of course, be further derived. In addition the
collections (|py2stdlib-collections|) module has some ABCs that can be used to test whether
a class or instance provides a particular interface, for example, is it
hashable or a mapping.
This module provides the following class:
ABCMeta~
Metaclass for defining Abstract Base Classes (ABCs).
Use this metaclass to create an ABC. An ABC can be subclassed directly, and
then acts as a mix-in class. You can also register unrelated concrete
classes (even built-in classes) and unrelated ABCs as "virtual subclasses" --
these and their descendants will be considered subclasses of the registering
ABC by the built-in issubclass function, but the registering ABC
won't show up in their MRO (Method Resolution Order) nor will method
implementations defined by the registering ABC be callable (not even via
super). [#]_
Classes created with a metaclass of ABCMeta have the following method:
register(subclass)~
Register {subclass} as a "virtual subclass" of this ABC. For
example:: >
    from abc import ABCMeta

    class MyABC:
        __metaclass__ = ABCMeta

    MyABC.register(tuple)

    assert issubclass(tuple, MyABC)
    assert isinstance((), MyABC)
<
You can also override this method in an abstract base class:
__subclasshook__(subclass)~
(Must be defined as a class method.)
Check whether {subclass} is considered a subclass of this ABC. This means
that you can customize the behavior of ``issubclass`` further without the
need to call register on every class you want to consider a
subclass of the ABC. (This class method is called from the
__subclasscheck__ method of the ABC.)
This method should return ``True``, ``False`` or ``NotImplemented``. If
it returns ``True``, the {subclass} is considered a subclass of this ABC.
If it returns ``False``, the {subclass} is not considered a subclass of
this ABC, even if it would normally be one. If it returns
``NotImplemented``, the subclass check is continued with the usual
mechanism.
For a demonstration of these concepts, look at this example ABC definition:: >
    from abc import ABCMeta, abstractmethod

    class Foo(object):
        # Old-style iterable protocol: __getitem__ plus __len__.
        def __getitem__(self, index):
            ...
        def __len__(self):
            ...
        def get_iterator(self):
            return iter(self)

    class MyIterable:
        __metaclass__ = ABCMeta

        @abstractmethod
        def __iter__(self):
            # An empty generator; subclasses may still call it.
            while False:
                yield None

        def get_iterator(self):
            return self.__iter__()

        @classmethod
        def __subclasshook__(cls, C):
            if cls is MyIterable:
                if any("__iter__" in B.__dict__ for B in C.__mro__):
                    return True
            return NotImplemented

    MyIterable.register(Foo)
<
The ABC ``MyIterable`` defines the standard iterable method,
__iter__, as an abstract method. The implementation given here can
still be called from subclasses. The get_iterator method is also
part of the ``MyIterable`` abstract base class, but it does not have to be
overridden in non-abstract derived classes.
The __subclasshook__ class method defined here says that any class
that has an __iter__ method in its __dict__ (or in that of
one of its base classes, accessed via the __mro__ list) is
considered a ``MyIterable`` too.
Finally, the last line makes ``Foo`` a virtual subclass of ``MyIterable``,
even though it does not define an __iter__ method (it uses the
old-style iterable protocol, defined in terms of __len__ and
__getitem__). Note that this will not make ``get_iterator``
available as a method of ``Foo``, so it is provided separately.
It also provides the following decorators:
abstractmethod(function)~
A decorator indicating abstract methods.
Using this decorator requires that the class's metaclass is ABCMeta or
is derived from it.
A class that has a metaclass derived from ABCMeta
cannot be instantiated unless all of its abstract methods and
properties are overridden.
The abstract methods can be called using any of the normal 'super' call
mechanisms.
Dynamically adding abstract methods to a class, or attempting to modify the
abstraction status of a method or class once it is created, are not
supported. The abstractmethod only affects subclasses derived using
regular inheritance; "virtual subclasses" registered with the ABC's
register method are not affected.
Usage:: >
    from abc import ABCMeta, abstractmethod

    class C:
        __metaclass__ = ABCMeta

        @abstractmethod
        def my_abstract_method(self, ...):
            ...
<
.. note::
Unlike Java abstract methods, these abstract
methods may have an implementation. This implementation can be
called via the super mechanism from the class that
overrides it. This could be useful as an end-point for a
super-call in a framework that uses cooperative
multiple-inheritance.
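The following sketch shows both effects: an abstract class refuses to be instantiated, and an overriding method can still reach the abstract implementation through super. The class names are invented. As an assumption made for portability, the base class is created by calling ABCMeta directly so that the example behaves the same on Python 2 and Python 3; in 2.x-only code, setting ``__metaclass__ = ABCMeta`` in the class body has the same effect.

```python
from abc import ABCMeta, abstractmethod

# Calling the metaclass creates an empty ABC; subclasses inherit ABCMeta.
_Base = ABCMeta("_Base", (object,), {})


class Greeter(_Base):
    @abstractmethod
    def greet(self):
        # An abstract method may still carry a usable implementation.
        return "hello"


class LoudGreeter(Greeter):
    def greet(self):
        # Overrides the abstract method and reuses it through super().
        return super(LoudGreeter, self).greet().upper() + "!"


error = None
try:
    Greeter()                      # greet is still abstract here
except TypeError as exc:
    error = str(exc)

loud = LoudGreeter().greet()
print(error)
print(loud)
```

``LoudGreeter`` can be instantiated because it overrides every abstract method of its base.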
abstractproperty([fget[, fset[, fdel[, doc]]]])~
A subclass of the built-in property, indicating an abstract property.
Using this function requires that the class's metaclass is ABCMeta or
is derived from it.
A class that has a metaclass derived from ABCMeta cannot be
instantiated unless all of its abstract methods and properties are overridden.
The abstract properties can be called using any of the normal
'super' call mechanisms.
Usage:: >
    from abc import ABCMeta, abstractproperty

    class C:
        __metaclass__ = ABCMeta

        @abstractproperty
        def my_abstract_property(self):
            ...
<
This defines a read-only property; you can also define a read-write abstract
property using the 'long' form of property declaration:: >
    class C:
        __metaclass__ = ABCMeta

        def getx(self): ...
        def setx(self, value): ...
        x = abstractproperty(getx, setx)
<
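A concrete subclass satisfies an abstract property by overriding it with an ordinary property. The sketch below uses invented class names and, as an assumption made for portability, creates the base class by calling ABCMeta directly rather than setting ``__metaclass__``.

```python
from abc import ABCMeta, abstractproperty

# Calling the metaclass creates an empty ABC; subclasses inherit ABCMeta.
_Base = ABCMeta("_Base", (object,), {})


class Named(_Base):
    @abstractproperty
    def name(self):
        """Concrete subclasses must supply a name."""


class Person(Named):
    def __init__(self, name):
        self._name = name

    @property
    def name(self):
        # An ordinary property overrides the abstract one.
        return self._name


try:
    Named()
    instantiated = True
except TypeError:
    instantiated = False

person_name = Person("Ada").name
print(instantiated, person_name)
```

Until ``name`` is overridden, it stays listed in the class's ``__abstractmethods__`` set and instantiation fails.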
.. rubric:: Footnotes
.. [#] C++ programmers should note that Python's virtual base class
concept is not the same as C++'s.
==============================================================================
*py2stdlib-aepack*
aepack~
:platform: Mac
:synopsis: Conversion between Python variables and AppleEvent data containers.
:deprecated:
The aepack (|py2stdlib-aepack|) module defines functions for converting (packing) Python
variables to AppleEvent descriptors and back (unpacking). Within Python the
AppleEvent descriptor is handled by Python objects of built-in type
AEDesc, defined in module Carbon.AE (|py2stdlib-carbon.ae|).
.. note::
This module has been removed in Python 3.x.
The aepack (|py2stdlib-aepack|) module defines the following functions:
pack(x[, forcetype])~
Returns an AEDesc object containing a conversion of Python value {x}. If
{forcetype} is provided it specifies the descriptor type of the result.
Otherwise, a default mapping of Python types to Apple Event descriptor types is
used, as follows:
+-----------------+-----------------------------------+
| Python type | descriptor type |
+=================+===================================+
| FSSpec | typeFSS |
+-----------------+-----------------------------------+
| FSRef | typeFSRef |
+-----------------+-----------------------------------+
| Alias | typeAlias |
+-----------------+-----------------------------------+
| integer | typeLong (32 bit integer) |
+-----------------+-----------------------------------+
| float | typeFloat (64 bit floating point) |
+-----------------+-----------------------------------+
| string | typeText |
+-----------------+-----------------------------------+
| unicode | typeUnicodeText |
+-----------------+-----------------------------------+
| list | typeAEList |
+-----------------+-----------------------------------+
| dictionary | typeAERecord |
+-----------------+-----------------------------------+
| instance | {see below} |
+-----------------+-----------------------------------+
If {x} is a Python instance then this function attempts to call an
__aepack__ method. This method should return an AEDesc object.
If the conversion of {x} is not defined above, this function returns the Python
string representation of a value (the repr() function) encoded as a text
descriptor.
unpack(x[, formodulename])~
{x} must be an object of type AEDesc. This function returns a Python
object representation of the data in the Apple Event descriptor {x}. Simple
AppleEvent data types (integer, text, float) are returned as their obvious
Python counterparts. Apple Event lists are returned as Python lists, and the
list elements are recursively unpacked. Object references (e.g. ``line 3 of
document 1``) are returned as instances of aetypes.ObjectSpecifier,
unless ``formodulename`` is specified. AppleEvent descriptors with descriptor
type typeFSS are returned as FSSpec objects. AppleEvent record
descriptors are returned as Python dictionaries, with 4-character string keys
and elements recursively unpacked.
The optional ``formodulename`` argument is used by the stub packages generated
by gensuitemodule (|py2stdlib-gensuitemodule|), and ensures that the OSA classes for object specifiers
are looked up in the correct module. This ensures that if, say, the Finder
returns an object specifier for a window you get an instance of
``Finder.Window`` and not a generic ``aetypes.Window``. The former knows about
all the properties and elements a window has in the Finder, while the latter
knows no such things.
.. seealso::
Module Carbon.AE (|py2stdlib-carbon.ae|)
Built-in access to Apple Event Manager routines.
Module aetypes (|py2stdlib-aetypes|)
Python definitions of codes for Apple Event descriptor types.
==============================================================================
*py2stdlib-aetools*
aetools~
:platform: Mac
:synopsis: Basic support for sending Apple Events
:deprecated:
The aetools (|py2stdlib-aetools|) module contains the basic functionality on which Python
AppleScript client support is built. It also imports and re-exports the core
functionality of the aetypes (|py2stdlib-aetypes|) and aepack (|py2stdlib-aepack|) modules. The stub packages
generated by gensuitemodule (|py2stdlib-gensuitemodule|) import the relevant portions of
aetools (|py2stdlib-aetools|), so usually you do not need to import it yourself. The exception
to this is when you cannot use a generated suite package and need lower-level
access to scripting.
The aetools (|py2stdlib-aetools|) module itself uses the AppleEvent support provided by the
Carbon.AE (|py2stdlib-carbon.ae|) module. This has one drawback: you need access to the window
manager, see section osx-gui-scripts for details. This restriction may be
lifted in future releases.
.. note::
This module has been removed in Python 3.x.
The aetools (|py2stdlib-aetools|) module defines the following functions:
packevent(ae, parameters, attributes)~
Stores parameters and attributes in a pre-created ``Carbon.AE.AEDesc`` object.
``parameters`` and ``attributes`` are dictionaries mapping 4-character OSA
parameter keys to Python objects. The objects are packed using
``aepack.pack()``.
unpackevent(ae[, formodulename])~
Recursively unpacks a ``Carbon.AE.AEDesc`` event to Python objects. The function
returns the parameter dictionary and the attribute dictionary. The
``formodulename`` argument is used by generated stub packages to control where
AppleScript classes are looked up.
keysubst(arguments, keydict)~
Converts a Python keyword argument dictionary ``arguments`` to the format
required by ``packevent`` by replacing the keys, which are Python identifiers,
by the four-character OSA keys according to the mapping specified in
``keydict``. Used by the generated suite packages.
enumsubst(arguments, key, edict)~
If the ``arguments`` dictionary contains an entry for ``key`` convert the value
for that entry according to dictionary ``edict``. This converts human-readable
Python enumeration names to the OSA 4-character codes. Used by the generated
suite packages.
The aetools (|py2stdlib-aetools|) module defines the following class:
TalkTo([signature=None, start=0, timeout=0])~
Base class for the proxy used to talk to an application. ``signature`` overrides
the class attribute ``_signature`` (which is usually set by subclasses) and is
the 4-char creator code defining the application to talk to. ``start`` can be
set to true to enable running the application on class instantiation.
``timeout`` can be specified to change the default timeout used while waiting
for an AppleEvent reply.
TalkTo._start()~
Test whether the application is running, and attempt to start it if not.
TalkTo.send(code, subcode[, parameters, attributes])~
Create the AppleEvent ``Carbon.AE.AEDesc`` for the verb with the OSA designation
``code, subcode`` (which are the usual 4-character strings), pack the
``parameters`` and ``attributes`` into it, send it to the target application,
wait for the reply, unpack the reply with ``unpackevent`` and return the reply
appleevent, the unpacked return values as a dictionary and the return
attributes.
==============================================================================
*py2stdlib-aetypes*
aetypes~
:platform: Mac
:synopsis: Python representation of the Apple Event Object Model.
:deprecated:
The aetypes (|py2stdlib-aetypes|) module defines classes used to represent Apple Event data
descriptors and Apple Event object specifiers.
Apple Event data is contained in descriptors, and these descriptors are typed.
For many descriptors the Python representation is simply the corresponding
Python type: ``typeText`` in OSA is a Python string, ``typeFloat`` is a float,
etc. For OSA types that have no direct Python counterpart this module declares
classes. Packing and unpacking instances of these classes is handled
automatically by aepack (|py2stdlib-aepack|).
An object specifier is essentially an address of an object implemented in an
Apple Event server. An Apple Event specifier is used as the direct object for an
Apple Event or as the argument of an optional parameter. The aetypes (|py2stdlib-aetypes|)
module contains the base classes for OSA classes and properties, which are used
by the packages generated by gensuitemodule (|py2stdlib-gensuitemodule|) to populate the classes and
properties in a given suite.
For reasons of backward compatibility, and for cases where you need to script an
application for which you have not generated the stub package, this module also
contains object specifiers for a number of common OSA classes such as
``Document``, ``Window``, ``Character``, etc.
.. note::
This module has been removed in Python 3.x.
The aetypes (|py2stdlib-aetypes|) module defines the following classes to represent Apple
Event descriptor data:
Unknown(type, data)~
The representation of OSA descriptor data for which the aepack (|py2stdlib-aepack|) and
aetypes (|py2stdlib-aetypes|) modules have no support, i.e. anything that is not represented by
the other classes here and that is not equivalent to a simple Python value.
Enum(enum)~
An enumeration value with the given 4-character string value.
InsertionLoc(of, pos)~
Position ``pos`` in object ``of``.
Boolean(bool)~
A boolean.
StyledText(style, text)~
Text with style information (font, face, etc.) included.
AEText(script, style, text)~
Text with script system and style information included.
IntlText(script, language, text)~
Text with script system and language information included.
IntlWritingCode(script, language)~
Script system and language information.
QDPoint(v, h)~
A QuickDraw point.
QDRectangle(v0, h0, v1, h1)~
A QuickDraw rectangle.
RGBColor(r, g, b)~
A color.
Type(type)~
An OSA type value with the given 4-character name.
Keyword(name)~
An OSA keyword with the given 4-character name.
Range(start, stop)~
A range.
Ordinal(abso)~
Non-numeric absolute positions, such as ``"firs"``, first, or ``"midd"``,
middle.
Logical(logc, term)~
The logical expression of applying operator ``logc`` to ``term``.
Comparison(obj1, relo, obj2)~
The comparison ``relo`` of ``obj1`` to ``obj2``.
The following classes are used as base classes by the generated stub packages to
represent AppleScript classes and properties in Python:
ComponentItem(which[, fr])~
Abstract base class for an OSA class. The subclass should set the class attribute
``want`` to the 4-character OSA class code. Instances of subclasses of this
class are equivalent to AppleScript Object Specifiers. Upon instantiation you
should pass a selector in ``which``, and optionally a parent object in ``fr``.
NProperty(fr)~
Abstract base class for an OSA property. The subclass should set the class
attributes ``want`` and ``which`` to designate which property we are talking
about. Instances of subclasses of this class are Object Specifiers.
ObjectSpecifier(want, form, seld[, fr])~
Base class of ``ComponentItem`` and ``NProperty``, a general OSA Object
Specifier. See the Apple Open Scripting Architecture documentation for the
parameters. Note that this class is not abstract.
==============================================================================
*py2stdlib-aifc*
aifc~
:synopsis: Read and write audio files in AIFF or AIFC format.
.. index::
single: Audio Interchange File Format
single: AIFF
single: AIFF-C
This module provides support for reading and writing AIFF and AIFF-C files.
AIFF is Audio Interchange File Format, a format for storing digital audio
samples in a file. AIFF-C is a newer version of the format that includes the
ability to compress the audio data.
.. note::
Some operations may only work under IRIX; these will raise ImportError
when attempting to import the cl module, which is only available on
IRIX.
Audio files have a number of parameters that describe the audio data. The
sampling rate or frame rate is the number of times per second the sound is
sampled. The number of channels indicates whether the audio is mono, stereo, or
quadro. Each frame consists of one sample per channel. The sample size is the
size in bytes of each sample. Thus a frame consists of
{nchannels} * {samplesize} bytes, and a second's worth of audio consists of
{nchannels} * {samplesize} * {framerate} bytes.
For example, CD quality audio has a sample size of two bytes (16 bits), uses two
channels (stereo) and has a frame rate of 44,100 frames/second. This gives a
frame size of 4 bytes (2 * 2), and a second's worth occupies 2 * 2 * 44100 bytes
(176,400 bytes).
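The arithmetic above can be checked with a few lines of plain Python. The
helper names frame_size and bytes_per_second are hypothetical and are not part
of the aifc module; this is only a sketch of the calculation.

```python
def frame_size(nchannels, sampwidth):
    """Bytes per frame: one sample of sampwidth bytes for each channel."""
    return nchannels * sampwidth


def bytes_per_second(nchannels, sampwidth, framerate):
    """Bytes needed to store one second of uncompressed audio."""
    return frame_size(nchannels, sampwidth) * framerate


# CD quality: stereo, 2 bytes (16 bits) per sample, 44,100 frames/second
cd_frame = frame_size(2, 2)
cd_second = bytes_per_second(2, 2, 44100)
```

With these inputs cd_frame is 4 and cd_second is 176400, matching the figures
in the text.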
Module aifc (|py2stdlib-aifc|) defines the following function:
open(file[, mode])~
Open an AIFF or AIFF-C file and return an object instance with methods that are
described below. The argument {file} is either a string naming a file or a file
object. {mode} must be ``'r'`` or ``'rb'`` when the file must be opened for
reading, or ``'w'`` or ``'wb'`` when the file must be opened for writing. If
omitted, ``file.mode`` is used if it exists, otherwise ``'rb'`` is used. When
used for writing, the file object should be seekable, unless you know ahead of
time how many samples you are going to write in total and use
writeframesraw and setnframes.
Objects returned by open() when a file is opened for reading have the
following methods:
aifc.getnchannels()~
Return the number of audio channels (1 for mono, 2 for stereo).
aifc.getsampwidth()~
Return the size in bytes of individual samples.
aifc.getframerate()~
Return the sampling rate (number of audio frames per second).
aifc.getnframes()~
Return the number of audio frames in the file.
aifc.getcomptype()~
Return a four-character string describing the type of compression used in the
audio file. For AIFF files, the returned value is ``'NONE'``.
aifc.getcompname()~
Return a human-readable description of the type of compression used in the audio
file. For AIFF files, the returned value is ``'not compressed'``.
aifc.getparams()~
Return a tuple consisting of all of the above values in the above order.
aifc.getmarkers()~
Return a list of markers in the audio file. A marker consists of a tuple of
three elements. The first is the mark ID (an integer), the second is the mark
position in frames from the beginning of the data (an integer), the third is the
name of the mark (a string).
aifc.getmark(id)~
Return the tuple as described in getmarkers for the mark with the given
{id}.
aifc.readframes(nframes)~
Read and return the next {nframes} frames from the audio file. The returned
data is a string containing for each frame the uncompressed samples of all
channels.
aifc.rewind()~
Rewind the read pointer. The next readframes will start from the
beginning.
aifc.setpos(pos)~
Seek to the specified frame number.
aifc.tell()~
Return the current frame number.
aifc.close()~
Close the AIFF file. After calling this method, the object can no longer be
used.
Objects returned by open() when a file is opened for writing have all the
above methods, except for readframes and setpos. In addition
the following methods exist. The get* methods can only be called after
the corresponding set* methods have been called. Before the first
writeframes or writeframesraw, all parameters except for the
number of frames must be filled in.
aifc.aiff()~
Create an AIFF file. The default is that an AIFF-C file is created, unless the
name of the file ends in ``'.aiff'`` in which case the default is an AIFF file.
aifc.aifc()~
Create an AIFF-C file. The default is that an AIFF-C file is created, unless
the name of the file ends in ``'.aiff'`` in which case the default is an AIFF
file.
aifc.setnchannels(nchannels)~
Specify the number of channels in the audio file.
aifc.setsampwidth(width)~
Specify the size in bytes of audio samples.
aifc.setframerate(rate)~
Specify the sampling frequency in frames per second.
aifc.setnframes(nframes)~
Specify the number of frames that are to be written to the audio file. If this
parameter is not set, or not set correctly, the file needs to support seeking.
aifc.setcomptype(type, name)~
.. index::
single: u-LAW
single: A-LAW
single: G.722
Specify the compression type. If not specified, the audio data will not be
compressed. In AIFF files, compression is not possible. The name parameter
should be a human-readable description of the compression type, the type
parameter should be a four-character string. Currently the following
compression types are supported: NONE, ULAW, ALAW, G722.
aifc.setparams(nchannels, sampwidth, framerate, comptype, compname)~
Set all the above parameters at once. The argument is a tuple consisting of the
various parameters. This means that it is possible to use the result of a
getparams call as argument to setparams.
aifc.setmark(id, pos, name)~
Add a mark with the given id (larger than 0), and the given name at the given
position. This method can be called at any time before close.
aifc.tell()~
Return the current write position in the output file. Useful in combination
with setmark.
aifc.writeframes(data)~
Write data to the output file. This method can only be called after the audio
file parameters have been set.
aifc.writeframesraw(data)~
Like writeframes, except that the header of the audio file is not
updated.
aifc.close()~
Close the AIFF file. The header of the file is updated to reflect the actual
size of the audio data. After calling this method, the object can no longer be
used.
==============================================================================
*py2stdlib-al*
al~
:platform: IRIX
:synopsis: Audio functions on the SGI.
:deprecated:
2.6~
The al (|py2stdlib-al|) module has been deprecated for removal in Python 3.0.
This module provides access to the audio facilities of the SGI Indy and Indigo
workstations. See section 3A of the IRIX man pages for details. You'll need to
read those man pages to understand what these functions do! Some of the
functions are not available in IRIX releases before 4.0.5. Again, see the
manual to check whether a specific function is available on your platform.
All functions and methods defined in this module are equivalent to the C
functions with ``AL`` prefixed to their name.
.. index:: module: AL
Symbolic constants from the C header file ``<audio.h>`` are defined in the
standard module AL (|py2stdlib-al^|), see below.
.. warning::
The current version of the audio library may dump core when bad argument values
are passed rather than returning an error status. Unfortunately, since the
precise circumstances under which this may happen are undocumented and hard to
check, the Python interface can provide no protection against this kind of
problem. (One example is specifying an excessive queue size --- there is no
documented upper limit.)
The module defines the following functions:
openport(name, direction[, config])~
The {name} and {direction} arguments are strings. The optional {config} argument is
a configuration object as returned by newconfig. The return value is an
audio port object; methods of audio port objects are described below.
newconfig()~
The return value is a new audio configuration object; methods of audio
configuration objects are described below.
queryparams(device)~
The {device} argument is an integer. The return value is a list of integers
containing the data returned by ALqueryparams.
getparams(device, list)~
The {device} argument is an integer. The {list} argument is a list such as
returned by queryparams; it is modified in place (!).
setparams(device, list)~
The {device} argument is an integer. The {list} argument is a list such as
returned by queryparams.
Configuration Objects
---------------------
Configuration objects returned by newconfig have the following methods:
audio configuration.getqueuesize()~
Return the queue size.
audio configuration.setqueuesize(size)~
Set the queue size.
audio configuration.getwidth()~
Get the sample width.
audio configuration.setwidth(width)~
Set the sample width.
audio configuration.getchannels()~
Get the channel count.
audio configuration.setchannels(nchannels)~
Set the channel count.
audio configuration.getsampfmt()~
Get the sample format.
audio configuration.setsampfmt(sampfmt)~
Set the sample format.
audio configuration.getfloatmax()~
Get the maximum value for floating sample formats.
audio configuration.setfloatmax(floatmax)~
Set the maximum value for floating sample formats.
Port Objects
------------
Port objects, as returned by openport, have the following methods:
audio port.closeport()~
Close the port.
audio port.getfd()~
Return the file descriptor as an int.
audio port.getfilled()~
Return the number of filled samples.
audio port.getfillable()~
Return the number of fillable samples.
audio port.readsamps(nsamples)~
Read a number of samples from the queue, blocking if necessary. Return the data
as a string containing the raw data (e.g., 2 bytes per sample in big-endian
byte order (high byte, low byte) if you have set the sample width to 2 bytes).
audio port.writesamps(samples)~
Write samples into the queue, blocking if necessary. The samples are encoded as
described for the readsamps return value.
audio port.getfillpoint()~
Return the 'fill point'.
audio port.setfillpoint(fillpoint)~
Set the 'fill point'.
audio port.getconfig()~
Return a configuration object containing the current configuration of the port.
audio port.setconfig(config)~
Set the configuration from the argument, a configuration object.
audio port.getstatus(list)~
Get status information on last error.
AL (|py2stdlib-al^|) --- Constants used with the al (|py2stdlib-al|) module
======================================================
==============================================================================
*py2stdlib-al^*
AL~
:platform: IRIX
:synopsis: Constants used with the al module.
:deprecated:
2.6~
The AL (|py2stdlib-al^|) module has been deprecated for removal in Python 3.0.
This module defines symbolic constants needed to use the built-in module
al (|py2stdlib-al|) (see above); they are equivalent to those defined in the C header file
``<audio.h>`` except that the name prefix ``AL_`` is omitted. Read the module
source for a complete list of the defined names. Suggested use:: >
import al
from AL import *
==============================================================================
*py2stdlib-anydbm*
anydbm~
:synopsis: Generic interface to DBM-style database modules.
.. note::
The anydbm (|py2stdlib-anydbm|) module has been renamed to dbm (|py2stdlib-dbm|) in Python 3.0. The
2to3 tool will automatically adapt imports when converting your
sources to 3.0.
.. index::
module: dbhash
module: bsddb
module: gdbm
module: dbm
module: dumbdbm
anydbm (|py2stdlib-anydbm|) is a generic interface to variants of the DBM database ---
dbhash (|py2stdlib-dbhash|) (requires bsddb (|py2stdlib-bsddb|)), gdbm (|py2stdlib-gdbm|), or dbm (|py2stdlib-dbm|). If none of
these modules is installed, the slow-but-simple implementation in module
dumbdbm (|py2stdlib-dumbdbm|) will be used.
open(filename[, flag[, mode]])~
Open the database file {filename} and return a corresponding object.
If the database file already exists, the whichdb (|py2stdlib-whichdb|) module is used to
determine its type and the appropriate module is used; if it does not exist,
the first module listed above that can be imported is used.
The optional {flag} argument must be one of these values:
+---------+-------------------------------------------+
| Value | Meaning |
+=========+===========================================+
| ``'r'`` | Open existing database for reading only |
| | (default) |
+---------+-------------------------------------------+
| ``'w'`` | Open existing database for reading and |
| | writing |
+---------+-------------------------------------------+
| ``'c'`` | Open database for reading and writing, |
| | creating it if it doesn't exist |
+---------+-------------------------------------------+
| ``'n'`` | Always create a new, empty database, open |
| | for reading and writing |
+---------+-------------------------------------------+
If not specified, the default value is ``'r'``.
The optional {mode} argument is the Unix mode of the file, used only when the
database has to be created. It defaults to octal ``0666`` (and will be
modified by the prevailing umask).
error~
A tuple containing the exceptions that can be raised by each of the supported
modules, with a unique exception also named anydbm.error as the first
item --- the latter is used when anydbm.error is raised.
The object returned by open() supports most of the same functionality as
dictionaries; keys and their corresponding values can be stored, retrieved, and
deleted, and the has_key and keys methods are available. Keys
and values must always be strings.
The following example records some hostnames and a corresponding title, and
then prints out the contents of the database:: >
import anydbm
# Open database, creating it if necessary.
db = anydbm.open('cache', 'c')
# Record some values
db['www.python.org'] = 'Python Website'
db['www.cnn.com'] = 'Cable News Network'
# Loop through contents. Other dictionary methods
# such as .keys(), .values() also work.
for k, v in db.iteritems():
    print k, '\t', v
# Storing a non-string key or value will raise an exception (most
# likely a TypeError).
db['www.yahoo.com'] = 4
# Close when done.
db.close()
<
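As a further sketch of the {flag} argument, the snippet below creates a
database with ``'c'`` and reopens it read-only with ``'r'``. It works in a
temporary directory so no file is left behind. The fallback import of dbm under
the alias dbm_module is an assumption made so the sketch also runs where the
module has been renamed; on Python 2.7 the first import is the one used.

```python
import os
import shutil
import tempfile

try:
    import anydbm as dbm_module     # Python 2 name, as documented here
except ImportError:
    import dbm as dbm_module        # assumption: the renamed module in 3.x

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'cache')
try:
    # 'c' opens for reading and writing, creating the database if needed.
    db = dbm_module.open(path, 'c')
    db['www.python.org'] = 'Python Website'
    db.close()

    # 'r' reopens the existing database for reading only.
    db = dbm_module.open(path, 'r')
    stored = db['www.python.org']
    db.close()
finally:
    # Remove the temporary directory and whatever files the backend created.
    shutil.rmtree(workdir)
```

The value read back is a string on Python 2; on Python 3 the renamed module
returns bytes.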
.. seealso::
Module dbhash (|py2stdlib-dbhash|)
BSD ``db`` database interface.
Module dbm (|py2stdlib-dbm|)
Standard Unix database interface.
Module dumbdbm (|py2stdlib-dumbdbm|)
Portable implementation of the ``dbm`` interface.
Module gdbm (|py2stdlib-gdbm|)
GNU database interface, based on the ``dbm`` interface.
Module shelve (|py2stdlib-shelve|)
General object persistence built on top of the Python ``dbm`` interface.
Module whichdb (|py2stdlib-whichdb|)
Utility module used to determine the type of an existing database.
==============================================================================
*py2stdlib-argparse*
argparse~
:synopsis: Command-line option and argument parsing library.
.. versionadded:: 2.7
The argparse (|py2stdlib-argparse|) module makes it easy to write user-friendly command-line
interfaces. The program defines what arguments it requires, and argparse (|py2stdlib-argparse|)
will figure out how to parse those out of sys.argv. The argparse (|py2stdlib-argparse|)
module also automatically generates help and usage messages and issues errors
when users give the program invalid arguments.
Example
-------
The following code is a Python program that takes a list of integers and
produces either the sum or the max:: >
import argparse
parser = argparse.ArgumentParser(description='Process some integers.')
parser.add_argument('integers', metavar='N', type=int, nargs='+',
                    help='an integer for the accumulator')
parser.add_argument('--sum', dest='accumulate', action='store_const',
                    const=sum, default=max,
                    help='sum the integers (default: find the max)')
args = parser.parse_args()
print args.accumulate(args.integers)
<
Assuming the Python code above is saved into a file called ``prog.py``, it can
be run at the command line and provides useful help messages:: >
$ prog.py -h
usage: prog.py [-h] [--sum] N [N ...]
Process some integers.
positional arguments:
N an integer for the accumulator
optional arguments:
-h, --help show this help message and exit
--sum sum the integers (default: find the max)
<
When run with the appropriate arguments, it prints either the sum or the max of
the command-line integers:: >
$ prog.py 1 2 3 4
4
$ prog.py 1 2 3 4 --sum
10
<
If invalid arguments are passed in, it will issue an error:: >
$ prog.py a b c
usage: prog.py [-h] [--sum] N [N ...]
prog.py: error: argument N: invalid int value: 'a'
<
The following sections walk you through this example.
Creating a parser
^^^^^^^^^^^^^^^^^
The first step in using the argparse (|py2stdlib-argparse|) module is creating an
ArgumentParser object:: >
>>> parser = argparse.ArgumentParser(description='Process some integers.')
<
The ArgumentParser object will hold all the information necessary to
parse the command line into Python data types.
Adding arguments
^^^^^^^^^^^^^^^^
Filling an ArgumentParser with information about program arguments is
done by making calls to the ArgumentParser.add_argument method.
Generally, these calls tell the ArgumentParser how to take the strings
on the command line and turn them into objects. This information is stored and
used when ArgumentParser.parse_args is called. For example:: >
>>> parser.add_argument('integers', metavar='N', type=int, nargs='+',
... help='an integer for the accumulator')
>>> parser.add_argument('--sum', dest='accumulate', action='store_const',
... const=sum, default=max,
... help='sum the integers (default: find the max)')
<
Later, calling parse_args will return an object with
two attributes, ``integers`` and ``accumulate``. The ``integers`` attribute
will be a list of one or more ints, and the ``accumulate`` attribute will be
either the sum function, if ``--sum`` was specified at the command line,
or the max function if it was not.
Parsing arguments
^^^^^^^^^^^^^^^^^
ArgumentParser parses args through the
ArgumentParser.parse_args method. This will inspect the command-line,
convert each arg to the appropriate type and then invoke the appropriate action.
In most cases, this means a simple namespace object will be built up from
attributes parsed out of the command-line:: >
>>> parser.parse_args(['--sum', '7', '-1', '42'])
Namespace(accumulate=<built-in function sum>, integers=[7, -1, 42])
<
In a script, ArgumentParser.parse_args will typically be called with no
arguments, and the ArgumentParser will automatically determine the
command-line args from sys.argv.
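The three steps above can be put together in one short sketch. It passes an
explicit list to parse_args instead of relying on sys.argv, which is convenient
when experimenting or testing; the variable names are illustrative only.

```python
import argparse

# Step 1: create the parser
parser = argparse.ArgumentParser(description='Process some integers.')

# Step 2: describe the arguments
parser.add_argument('integers', metavar='N', type=int, nargs='+',
                    help='an integer for the accumulator')
parser.add_argument('--sum', dest='accumulate', action='store_const',
                    const=sum, default=max,
                    help='sum the integers (default: find the max)')

# Step 3: parse explicit argument lists rather than sys.argv
with_sum = parser.parse_args(['1', '2', '3', '4', '--sum'])
without_sum = parser.parse_args(['1', '2', '3', '4'])

total = with_sum.accumulate(with_sum.integers)
largest = without_sum.accumulate(without_sum.integers)
```

Here total is 10 and largest is 4, the same results as the two prog.py
invocations shown earlier.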
ArgumentParser objects
----------------------
ArgumentParser([description], [epilog], [prog], [usage], [add_help], [argument_default], [parents], [prefix_chars], [fromfile_prefix_chars], [conflict_handler], [formatter_class])~
Create a new ArgumentParser object. Each parameter has its own more
detailed description below, but in short they are:
* description - Text to display before the argument help.
* epilog - Text to display after the argument help.
* add_help - Add a -h/--help option to the parser. (default: ``True``)
* argument_default - Set the global default value for arguments.
(default: ``None``)
* parents - A list of ArgumentParser objects whose arguments should
also be included.
* prefix_chars - The set of characters that prefix optional arguments.
(default: ``'-'``)
* fromfile_prefix_chars - The set of characters that prefix files from
which additional arguments should be read. (default: ``None``)
* formatter_class - A class for customizing the help output.
* conflict_handler - Usually unnecessary; defines the strategy for resolving
conflicting optionals.
* prog - The name of the program (default:
sys.argv[0])
* usage - The string describing the program usage (default: generated)
The following sections describe how each of these is used.
description
^^^^^^^^^^^
Most calls to the ArgumentParser constructor will use the
``description=`` keyword argument. This argument gives a brief description of
what the program does and how it works. In help messages, the description is
displayed between the command-line usage string and the help messages for the
various arguments:: >
>>> parser = argparse.ArgumentParser(description='A foo that bars')
>>> parser.print_help()
usage: argparse.py [-h]
A foo that bars
optional arguments:
-h, --help show this help message and exit
<
By default, the description will be line-wrapped so that it fits within the
given space. To change this behavior, see the formatter_class argument.
epilog
^^^^^^
Some programs like to display additional description of the program after the
description of the arguments. Such text can be specified using the ``epilog=``
argument to ArgumentParser:: >
>>> parser = argparse.ArgumentParser(
... description='A foo that bars',
... epilog="And that's how you'd foo a bar")
>>> parser.print_help()
usage: argparse.py [-h]
A foo that bars
optional arguments:
-h, --help show this help message and exit
And that's how you'd foo a bar
<
As with the description argument, the ``epilog=`` text is by default
line-wrapped, but this behavior can be adjusted with the formatter_class
argument to ArgumentParser.
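The placement of the description and the epilog can also be checked
programmatically. This minimal sketch uses format_help(), which returns the
help text as a string instead of printing it, and sets ``prog`` explicitly so
the output does not depend on how the script was started.

```python
import argparse

parser = argparse.ArgumentParser(
    prog='PROG',
    description='A foo that bars',
    epilog="And that's how you'd foo a bar")

# format_help() returns the same text that print_help() would print.
help_text = parser.format_help()

description_at = help_text.index('A foo that bars')
option_at = help_text.index('--help')
epilog_at = help_text.index("And that's how you'd foo a bar")

# description comes before the argument help, epilog comes after it
in_order = description_at < option_at < epilog_at
```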
add_help
^^^^^^^^
By default, ArgumentParser objects add a ``-h/--help`` option which simply
displays the parser's help message. For example, consider a file named
``myprogram.py`` containing the following code:: >
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--foo', help='foo help')
args = parser.parse_args()
<
If ``-h`` or ``--help`` is supplied at the command-line, the ArgumentParser
help will be printed:: >
$ python myprogram.py --help
usage: myprogram.py [-h] [--foo FOO]
optional arguments:
-h, --help show this help message and exit
--foo FOO foo help
<
Occasionally, it may be useful to disable the addition of this help option.
This can be achieved by passing ``False`` as the ``add_help=`` argument to
ArgumentParser:: >
>>> parser = argparse.ArgumentParser(prog='PROG', add_help=False)
>>> parser.add_argument('--foo', help='foo help')
>>> parser.print_help()
usage: PROG [--foo FOO]
optional arguments:
--foo FOO foo help
<
prefix_chars
^^^^^^^^^^^^
Most command-line options will use ``'-'`` as the prefix, e.g. ``-f/--foo``.
Parsers that need to support additional prefix characters, e.g. for options
like ``+f`` or ``/foo``, may specify them using the ``prefix_chars=`` argument
to the ArgumentParser constructor:: >
>>> parser = argparse.ArgumentParser(prog='PROG', prefix_chars='-+')
>>> parser.add_argument('+f')
>>> parser.add_argument('++bar')
>>> parser.parse_args('+f X ++bar Y'.split())
Namespace(bar='Y', f='X')
<
The ``prefix_chars=`` argument defaults to ``'-'``. Supplying a set of
characters that does not include ``'-'`` will cause ``-f/--foo`` options to be
disallowed.
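As a minimal sketch of the last point, assume a parser whose prefix set is
``'/+'`` (the option names here are made up for illustration). Options then
have to start with one of those characters, and a dash-prefixed flag list is
rejected when it is defined:

```python
import argparse

# Only '/' and '+' are registered as prefix characters; '-' is not.
parser = argparse.ArgumentParser(prog='PROG', prefix_chars='/+', add_help=False)
parser.add_argument('/foo')
parser.add_argument('+v', action='store_true')
args = parser.parse_args(['/foo', 'X', '+v'])

# Defining dash-prefixed flags on this parser fails at definition time.
try:
    parser.add_argument('-f', '--file')
    dash_flags_accepted = True
except ValueError:
    dash_flags_accepted = False
```

Here ``args.foo`` holds ``'X'`` and ``args.v`` is ``True``, while the attempt to
add ``-f/--file`` raises ``ValueError``.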
fromfile_prefix_chars
^^^^^^^^^^^^^^^^^^^^^
Sometimes, for example when dealing with particularly long argument lists, it
may make sense to keep the list of arguments in a file rather than typing it out
at the command line. If the ``fromfile_prefix_chars=`` argument is given to the
ArgumentParser constructor, then arguments that start with any of the
specified characters will be treated as files, and will be replaced by the
arguments they contain. For example:: >
>>> with open('args.txt', 'w') as fp:
...     fp.write('-f\nbar')
>>> parser = argparse.ArgumentParser(fromfile_prefix_chars='@')
>>> parser.add_argument('-f')
>>> parser.parse_args(['-f', 'foo', '@args.txt'])
Namespace(f='bar')
<
Arguments read from a file must by default be one per line (but see also
convert_arg_line_to_args) and are treated as if they were in the same
place as the original file referencing argument on the command line. So in the
example above, the expression ``['-f', 'foo', '@args.txt']`` is considered
equivalent to the expression ``['-f', 'foo', '-f', 'bar']``.
The ``fromfile_prefix_chars=`` argument defaults to ``None``, meaning that
arguments will never be treated as file references.
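The one-argument-per-line rule can be relaxed by overriding
convert_arg_line_to_args in a subclass. The sketch below is only an
illustration: the class name ``WordSplittingParser`` and the temporary file are
assumptions, not part of argparse. It splits each line of the file on
whitespace, so an option and its value may share a line:

```python
import argparse
import os
import tempfile


class WordSplittingParser(argparse.ArgumentParser):
    """Hypothetical subclass: each whitespace-separated word is one argument."""

    def convert_arg_line_to_args(self, arg_line):
        return arg_line.split()


# Write an argument file that puts an option and its value on one line.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as fp:
    fp.write('-f bar\n--verbose\n')

parser = WordSplittingParser(fromfile_prefix_chars='@')
parser.add_argument('-f')
parser.add_argument('--verbose', action='store_true')
try:
    args = parser.parse_args(['@' + path])
finally:
    os.remove(path)
```

With the default implementation the line ``-f bar`` would be passed on as one
single argument string; with the override it becomes ``-f`` followed by ``bar``.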
argument_default
^^^^^^^^^^^^^^^^
Generally, argument defaults are specified either by passing a default to
add_argument or by calling the set_defaults method with a
specific set of name-value pairs. Sometimes however, it may be useful to
specify a single parser-wide default for arguments. This can be accomplished by
passing the ``argument_default=`` keyword argument to ArgumentParser.
For example, to globally suppress attribute creation on parse_args
calls, we supply ``argument_default=SUPPRESS``:: >
>>> parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
>>> parser.add_argument('--foo')
>>> parser.add_argument('bar', nargs='?')
>>> parser.parse_args(['--foo', '1', 'BAR'])
Namespace(bar='BAR', foo='1')
>>> parser.parse_args([])
Namespace()
<
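When attributes are suppressed this way, code that reads the result has to
cope with missing attributes. The following sketch (argument names are
illustrative) shows two common ways to do that, and also that a default given
explicitly to add_argument takes precedence over the parser-wide one:

```python
import argparse

parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
parser.add_argument('--foo')
parser.add_argument('--bar', default='B')  # explicit default wins over SUPPRESS

args = parser.parse_args([])
present = vars(args)                     # only the attributes that were set
foo = getattr(args, 'foo', 'fallback')   # safe lookup for a suppressed attribute
```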
parents
^^^^^^^
Sometimes, several parsers share a common set of arguments. Rather than
repeating the definitions of these arguments, a single parser holding all the
shared arguments can be defined once and passed to the ``parents=`` argument of
ArgumentParser. The ``parents=`` argument takes a list of ArgumentParser
objects, collects all the positional and optional actions from them, and adds
these actions to the ArgumentParser object being constructed:: >
>>> parent_parser = argparse.ArgumentParser(add_help=False)
>>> parent_parser.add_argument('--parent', type=int)
>>> foo_parser = argparse.ArgumentParser(parents=[parent_parser])
>>> foo_parser.add_argument('foo')
>>> foo_parser.parse_args(['--parent', '2', 'XXX'])
Namespace(foo='XXX', parent=2)
>>> bar_parser = argparse.ArgumentParser(parents=[parent_parser])
>>> bar_parser.add_argument('--bar')
>>> bar_parser.parse_args(['--bar', 'YYY'])
Namespace(bar='YYY', parent=None)
<
Note that most parent parsers will specify ``add_help=False``. Otherwise, the
ArgumentParser will see two ``-h/--help`` options (one in the parent
and one in the child) and raise an error.
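The conflict described in the note can be reproduced with a short sketch. The
variable names are illustrative; the error is raised while the child parser is
being constructed:

```python
import argparse

# A parent that keeps its own -h/--help option (add_help defaults to True).
parent_with_help = argparse.ArgumentParser()
parent_with_help.add_argument('--parent', type=int)

try:
    argparse.ArgumentParser(parents=[parent_with_help])
    conflict = None
except argparse.ArgumentError as exc:
    conflict = str(exc)

# The usual fix: build the parent with add_help=False.
parent = argparse.ArgumentParser(add_help=False)
parent.add_argument('--parent', type=int)
child = argparse.ArgumentParser(parents=[parent])
args = child.parse_args(['--parent', '2'])
```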
formatter_class
^^^^^^^^^^^^^^^
ArgumentParser objects allow the help formatting to be customized by
specifying an alternate formatting class. Currently, there are three such
classes: argparse.RawDescriptionHelpFormatter,
argparse.RawTextHelpFormatter and
argparse.ArgumentDefaultsHelpFormatter. The first two allow more
control over how textual descriptions are displayed, while the last
automatically adds information about argument default values.
By default, ArgumentParser objects line-wrap the description_ and
epilog_ texts in command-line help messages:: >
>>> parser = argparse.ArgumentParser(
...     prog='PROG',
...     description='''this description
...         was indented weird
...             but that is okay''',
...     epilog='''
...             likewise for this epilog whose whitespace will
...         be cleaned up and whose words will be wrapped
...         across a couple lines''')
>>> parser.print_help()
usage: PROG [-h]
this description was indented weird but that is okay
optional arguments:
-h, --help show this help message and exit
likewise for this epilog whose whitespace will be cleaned up and whose words
will be wrapped across a couple lines
<
Passing argparse.RawDescriptionHelpFormatter as ``formatter_class=``
indicates that description_ and epilog_ are already correctly formatted and
should not be line-wrapped:: >
>>> import textwrap
>>> parser = argparse.ArgumentParser(
...     prog='PROG',
...     formatter_class=argparse.RawDescriptionHelpFormatter,
...     description=textwrap.dedent('''\
...         Please do not mess up this text!
...         --------------------------------
...             I have indented it
...             exactly the way
...             I want it
...         '''))
>>> parser.print_help()
usage: PROG [-h]
Please do not mess up this text!
--------------------------------
    I have indented it
    exactly the way
    I want it
optional arguments:
-h, --help show this help message and exit
<
RawTextHelpFormatter maintains whitespace for all sorts of help text
including argument descriptions.
The other formatter class available, ArgumentDefaultsHelpFormatter,
will add information about the default value of each of the arguments:: >
>>> parser = argparse.ArgumentParser(
... prog='PROG',
... formatter_class=argparse.ArgumentDefaultsHelpFormatter)
>>> parser.add_argument('--foo', type=int, default=42, help='FOO!')
>>> parser.add_argument('bar', nargs='*', default=[1, 2, 3], help='BAR!')
>>> parser.print_help()
usage: PROG [-h] [--foo FOO] [bar [bar ...]]
positional arguments:
bar BAR! (default: [1, 2, 3])
optional arguments:
-h, --help show this help message and exit
--foo FOO FOO! (default: 42)
<
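The formatter classes are ordinary Python classes, so one idiom that is often
seen (it is not prescribed by the text above) is to combine two of them through
multiple inheritance. The class name below is hypothetical; the sketch keeps
the description layout untouched and also appends default values:

```python
import argparse


class RawDescriptionDefaultsFormatter(argparse.RawDescriptionHelpFormatter,
                                      argparse.ArgumentDefaultsHelpFormatter):
    """Hypothetical helper: keep description layout and also show defaults."""


parser = argparse.ArgumentParser(
    prog='PROG',
    formatter_class=RawDescriptionDefaultsFormatter,
    description='first line\n  second line, indented on purpose')
parser.add_argument('--foo', type=int, default=42, help='FOO!')
help_text = parser.format_help()
```

``format_help()`` returns the help message as a string instead of printing it,
which makes the result easy to inspect.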
conflict_handler
^^^^^^^^^^^^^^^^
ArgumentParser objects do not allow two actions with the same option
string. By default, ArgumentParser objects raise an exception if an
attempt is made to create an argument with an option string that is already in
use:: >
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('-f', '--foo', help='old foo help')
>>> parser.add_argument('--foo', help='new foo help')
Traceback (most recent call last):
...
ArgumentError: argument --foo: conflicting option string(s): --foo
<
Sometimes (e.g. when using parents_) it may be useful to simply override any
older arguments with the same option string. To get this behavior, the value
``'resolve'`` can be supplied to the ``conflict_handler=`` argument of
ArgumentParser:: >
>>> parser = argparse.ArgumentParser(prog='PROG', conflict_handler='resolve')
>>> parser.add_argument('-f', '--foo', help='old foo help')
>>> parser.add_argument('--foo', help='new foo help')
>>> parser.print_help()
usage: PROG [-h] [-f FOO] [--foo FOO]
optional arguments:
-h, --help show this help message and exit
-f FOO old foo help
--foo FOO new foo help
<
Note that ArgumentParser objects only remove an action if all of its
option strings are overridden. So, in the example above, the old ``-f/--foo``
action is retained as the ``-f`` action, because only the ``--foo`` option
string was overridden.
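The two handlers can be compared side by side. In this sketch (the ``--level``
option is an illustrative assumption) the older action has a single option
string, so ``'resolve'`` removes it completely and only the newer definition
remains:

```python
import argparse

# Default handler: redefining --level is an error.
strict = argparse.ArgumentParser(prog='PROG')
strict.add_argument('--level', type=int, default=1)
try:
    strict.add_argument('--level', default='low')
    redefinition_allowed = True
except argparse.ArgumentError:
    redefinition_allowed = False

# 'resolve' handler: the newer definition replaces the older one entirely,
# because the old action's only option string was overridden.
lenient = argparse.ArgumentParser(prog='PROG', conflict_handler='resolve')
lenient.add_argument('--level', type=int, default=1)
lenient.add_argument('--level', default='low')
args = lenient.parse_args([])
```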
prog
^^^^
By default, ArgumentParser objects use ``sys.argv[0]`` to determine
how to display the name of the program in help messages. This default is almost
always desirable because it will make the help messages match how the program was
invoked on the command line. For example, consider a file named
``myprogram.py`` with the following code:: >
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--foo', help='foo help')
args = parser.parse_args()
<
The help for this program will display ``myprogram.py`` as the program name
(regardless of where the program was invoked from):: >
$ python myprogram.py --help
usage: myprogram.py [-h] [--foo FOO]
optional arguments:
-h, --help show this help message and exit
--foo FOO foo help
$ cd ..
$ python subdir\myprogram.py --help
usage: myprogram.py [-h] [--foo FOO]
optional arguments:
-h, --help show this help message and exit
--foo FOO foo help
<
To change this default behavior, another value can be supplied using the
``prog=`` argument to ArgumentParser:: >
>>> parser = argparse.ArgumentParser(prog='myprogram')
>>> parser.print_help()
usage: myprogram [-h]
optional arguments:
-h, --help show this help message and exit
<
Note that the program name, whether determined from ``sys.argv[0]`` or from the
``prog=`` argument, is available to help messages using the ``%(prog)s`` format
specifier.
:: >
>>> parser = argparse.ArgumentParser(prog='myprogram')
>>> parser.add_argument('--foo', help='foo of the %(prog)s program')
>>> parser.print_help()
usage: myprogram [-h] [--foo FOO]
optional arguments:
-h, --help show this help message and exit
--foo FOO foo of the myprogram program
<
usage
^^^^^
By default, ArgumentParser calculates the usage message from the
arguments it contains:: >
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('--foo', nargs='?', help='foo help')
>>> parser.add_argument('bar', nargs='+', help='bar help')
>>> parser.print_help()
usage: PROG [-h] [--foo [FOO]] bar [bar ...]
positional arguments:
bar bar help
optional arguments:
-h, --help show this help message and exit
--foo [FOO] foo help
<
The default message can be overridden with the ``usage=`` keyword argument:: >
>>> parser = argparse.ArgumentParser(prog='PROG', usage='%(prog)s [options]')
>>> parser.add_argument('--foo', nargs='?', help='foo help')
>>> parser.add_argument('bar', nargs='+', help='bar help')
>>> parser.print_help()
usage: PROG [options]
positional arguments:
bar bar help
optional arguments:
-h, --help show this help message and exit
--foo [FOO] foo help
<
The ``%(prog)s`` format specifier is available to fill in the program name in
your usage messages.
The add_argument() method
-------------------------
ArgumentParser.add_argument(name or flags..., [action], [nargs], [const], [default], [type], [choices], [required], [help], [metavar], [dest])~
Define how a single command line argument should be parsed. Each parameter
has its own more detailed description below, but in short they are:
* `name or flags`_ - Either a name or a list of option strings, e.g. ``foo``
or ``-f, --foo``
* action_ - The basic type of action to be taken when this argument is
encountered at the command-line.
* nargs_ - The number of command-line arguments that should be consumed.
* const_ - A constant value required by some action_ and nargs_ selections.
* default_ - The value produced if the argument is absent from the
command-line.
* type_ - The type to which the command-line arg should be converted.
* choices_ - A container of the allowable values for the argument.
* required_ - Whether or not the command-line option may be omitted
(optionals only).
* help_ - A brief description of what the argument does.
* metavar_ - A name for the argument in usage messages.
* dest_ - The name of the attribute to be added to the object returned by
parse_args.
The following sections describe how each of these is used.
name or flags
^^^^^^^^^^^^^
The add_argument method must know whether an optional argument, like
``-f`` or ``--foo``, or a positional argument, like a list of filenames, is
expected. The first arguments passed to add_argument must therefore be
either a series of flags, or a simple argument name. For example, an optional
argument could be created like:: >
>>> parser.add_argument('-f', '--foo')
<
while a positional argument could be created like:: >
>>> parser.add_argument('bar')
<
When parse_args is called, optional arguments will be identified by the
``-`` prefix, and the remaining arguments will be assumed to be positional:: >
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('-f', '--foo')
>>> parser.add_argument('bar')
>>> parser.parse_args(['BAR'])
Namespace(bar='BAR', foo=None)
>>> parser.parse_args(['BAR', '--foo', 'FOO'])
Namespace(bar='BAR', foo='FOO')
>>> parser.parse_args(['--foo', 'FOO'])
usage: PROG [-h] [-f FOO] bar
PROG: error: too few arguments
<
action
^^^^^^
ArgumentParser objects associate command-line args with actions. These
actions can do just about anything with the command-line args associated with
them, though most actions simply add an attribute to the object returned by
parse_args. The ``action`` keyword argument specifies how the
command-line args should be handled. The supported actions are:
* ``'store'`` - This just stores the argument's value. This is the default
action. For example:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo')
>>> parser.parse_args('--foo 1'.split())
Namespace(foo='1')
<
* ``'store_const'`` - This stores the value specified by the const_ keyword
argument. (Note that the const_ keyword argument defaults to the rather
unhelpful ``None``.) The ``'store_const'`` action is most commonly used with
optional arguments that specify some sort of flag. For example:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo', action='store_const', const=42)
>>> parser.parse_args('--foo'.split())
Namespace(foo=42)
<
* ``'store_true'`` and ``'store_false'`` - These store the values ``True`` and
``False`` respectively. These are special cases of ``'store_const'``. For
example:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo', action='store_true')
>>> parser.add_argument('--bar', action='store_false')
>>> parser.parse_args('--foo --bar'.split())
Namespace(bar=False, foo=True)
<
* ``'append'`` - This stores a list, and appends each argument value to the
list. This is useful to allow an option to be specified multiple times.
Example usage:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo', action='append')
>>> parser.parse_args('--foo 1 --foo 2'.split())
Namespace(foo=['1', '2'])
<
* ``'append_const'`` - This stores a list, and appends the value specified by
the const_ keyword argument to the list. (Note that the const_ keyword
argument defaults to ``None``.) The ``'append_const'`` action is typically
useful when multiple arguments need to store constants to the same list. For
example:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--str', dest='types', action='append_const', const=str)
>>> parser.add_argument('--int', dest='types', action='append_const', const=int)
>>> parser.parse_args('--str --int'.split())
Namespace(types=[<type 'str'>, <type 'int'>])
<
* ``'version'`` - This expects a ``version=`` keyword argument in the
add_argument call, and prints version information and exits when
invoked. For example:: >
>>> import argparse
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('--version', action='version', version='%(prog)s 2.0')
>>> parser.parse_args(['--version'])
PROG 2.0
<
You can also specify an arbitrary action by passing an object that implements
the Action API. The easiest way to do this is to extend
argparse.Action, supplying an appropriate ``__call__`` method. The
``__call__`` method should accept four parameters:
* ``parser`` - The ArgumentParser object which contains this action.
* ``namespace`` - The namespace object that will be returned by
parse_args. Most actions add an attribute to this object.
* ``values`` - The associated command-line args, with any type-conversions
applied. (Type-conversions are specified with the type_ keyword argument to
add_argument.)
* ``option_string`` - The option string that was used to invoke this action.
The ``option_string`` argument is optional, and will be absent if the action
is associated with a positional argument.
An example of a custom action:: >
>>> class FooAction(argparse.Action):
...     def __call__(self, parser, namespace, values, option_string=None):
...         print '%r %r %r' % (namespace, values, option_string)
...         setattr(namespace, self.dest, values)
...
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo', action=FooAction)
>>> parser.add_argument('bar', action=FooAction)
>>> args = parser.parse_args('1 --foo 2'.split())
Namespace(bar=None, foo=None) '1' None
Namespace(bar='1', foo=None) '2' '--foo'
>>> args
Namespace(bar='1', foo='2')
<
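A custom action does not have to store the value unchanged. As a further
sketch, the hypothetical ``UpperCaseAction`` below transforms the value before
storing it under ``self.dest``. The class and argument names are assumptions
made for illustration:

```python
import argparse


class UpperCaseAction(argparse.Action):
    """Hypothetical action that stores its value upper-cased."""

    def __call__(self, parser, namespace, values, option_string=None):
        # 'values' is a single string here because nargs was not given.
        setattr(namespace, self.dest, values.upper())


parser = argparse.ArgumentParser()
parser.add_argument('--name', action=UpperCaseAction)
parser.add_argument('city', action=UpperCaseAction)
args = parser.parse_args(['paris', '--name', 'alice'])
```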
nargs
^^^^^
ArgumentParser objects usually associate a single command-line argument with a
single action to be taken. The ``nargs`` keyword argument associates a
different number of command-line arguments with a single action. The supported
values are:
* N (an integer). N args from the command-line will be gathered together into a
list. For example:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo', nargs=2)
>>> parser.add_argument('bar', nargs=1)
>>> parser.parse_args('c --foo a b'.split())
Namespace(bar=['c'], foo=['a', 'b'])
<
Note that ``nargs=1`` produces a list of one item. This is different from
the default, in which the item is produced by itself.
* ``'?'``. One arg will be consumed from the command-line if possible, and
produced as a single item. If no command-line arg is present, the value from
default_ will be produced. Note that for optional arguments, there is an
additional case - the option string is present but not followed by a
command-line arg. In this case the value from const_ will be produced. Some
examples to illustrate this:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo', nargs='?', const='c', default='d')
>>> parser.add_argument('bar', nargs='?', default='d')
>>> parser.parse_args('XX --foo YY'.split())
Namespace(bar='XX', foo='YY')
>>> parser.parse_args('XX --foo'.split())
Namespace(bar='XX', foo='c')
>>> parser.parse_args(''.split())
Namespace(bar='d', foo='d')
<
One of the more common uses of ``nargs='?'`` is to allow optional input and
output files:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('infile', nargs='?', type=argparse.FileType('r'), default=sys.stdin)
>>> parser.add_argument('outfile', nargs='?', type=argparse.FileType('w'), default=sys.stdout)
>>> parser.parse_args(['input.txt', 'output.txt'])
Namespace(infile=<open file 'input.txt', mode 'r' at 0x...>, outfile=<open file 'output.txt', mode 'w' at 0x...>)
>>> parser.parse_args([])
Namespace(infile=<open file '<stdin>', mode 'r' at 0x...>, outfile=<open file '<stdout>', mode 'w' at 0x...>)
<
* ``'*'``. All command-line args present are gathered into a list. Note that
it generally doesn't make much sense to have more than one positional argument
with ``nargs='*'``, but multiple optional arguments with ``nargs='*'`` are
possible. For example:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo', nargs='*')
>>> parser.add_argument('--bar', nargs='*')
>>> parser.add_argument('baz', nargs='*')
>>> parser.parse_args('a b --foo x y --bar 1 2'.split())
Namespace(bar=['1', '2'], baz=['a', 'b'], foo=['x', 'y'])
<
* ``'+'``. Just like ``'*'``, all command-line args present are gathered into a
list. Additionally, an error message will be generated if there wasn't at
least one command-line arg present. For example:: >
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('foo', nargs='+')
>>> parser.parse_args('a b'.split())
Namespace(foo=['a', 'b'])
>>> parser.parse_args(''.split())
usage: PROG [-h] foo [foo ...]
PROG: error: too few arguments
<
If the ``nargs`` keyword argument is not provided, the number of args consumed
is determined by the action_. Generally this means a single command-line arg
will be consumed and a single item (not a list) will be produced.
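The forms described above can be combined in one parser. The sketch below uses
made-up argument names and parses a single command line that exercises each
``nargs`` value once:

```python
import argparse

parser = argparse.ArgumentParser(prog='PROG')
parser.add_argument('--pair', nargs=2)       # exactly two values, as a list
parser.add_argument('--maybe', nargs='?', const='c', default='d')
parser.add_argument('--many', nargs='*')     # zero or more values, as a list
parser.add_argument('files', nargs='+')      # one or more values, as a list

args = parser.parse_args(
    ['a.txt', 'b.txt', '--pair', 'x', 'y', '--maybe', '--many'])
```

Here ``--maybe`` is present without a value, so it receives its ``const``, and
``--many`` is present without values, so it receives an empty list.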
const
^^^^^
The ``const`` argument of add_argument is used to hold constant values
that are not read from the command line but are required for the various
ArgumentParser actions. The two most common uses of it are:
* When add_argument is called with ``action='store_const'`` or
``action='append_const'``. These actions add the ``const`` value to one of
the attributes of the object returned by parse_args. See the action_
description for examples.
* When add_argument is called with option strings (like ``-f`` or
``--foo``) and ``nargs='?'``. This creates an optional argument that can be
followed by zero or one command-line args. When parsing the command-line, if
the option string is encountered with no command-line arg following it, the
value of ``const`` will be assumed instead. See the nargs_ description for
examples.
The ``const`` keyword argument defaults to ``None``.
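Both uses can be seen together in a short sketch. The option names and the
``'app.log'`` value are illustrative assumptions:

```python
import argparse

parser = argparse.ArgumentParser()
# Use 1: the value stored by a flag that takes no command-line arg.
parser.add_argument('--fast', dest='mode', action='store_const', const='fast')
# Use 2: the value assumed when an option with nargs='?' is given bare.
parser.add_argument('--log', nargs='?', const='app.log', default=None)

bare = parser.parse_args(['--fast', '--log'])
explicit = parser.parse_args(['--log', 'other.log'])
```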
default
^^^^^^^
All optional arguments and some positional arguments may be omitted at the
command-line. The ``default`` keyword argument of add_argument, whose
value defaults to ``None``, specifies what value should be used if the
command-line arg is not present. For optional arguments, the ``default`` value
is used when the option string was not present at the command line:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo', default=42)
>>> parser.parse_args('--foo 2'.split())
Namespace(foo='2')
>>> parser.parse_args(''.split())
Namespace(foo=42)
<
For positional arguments with nargs_ equal to ``'?'`` or ``'*'``, the ``default`` value
is used when no command-line arg was present:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('foo', nargs='?', default=42)
>>> parser.parse_args('a'.split())
Namespace(foo='a')
>>> parser.parse_args(''.split())
Namespace(foo=42)
<
Providing ``default=argparse.SUPPRESS`` causes no attribute to be added if the
command-line argument was not present:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo', default=argparse.SUPPRESS)
>>> parser.parse_args([])
Namespace()
>>> parser.parse_args(['--foo', '1'])
Namespace(foo='1')
<
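One detail that the text above does not spell out, but that argparse
implements, is how ``default`` interacts with ``type``: a default given as a
string is converted as if it had come from the command line, while a default
of any other type is used unchanged. A minimal sketch with illustrative option
names:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--length', default='10', type=int)  # string default: converted
parser.add_argument('--width', default=10.5, type=int)   # non-string default: used as is
args = parser.parse_args([])
```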
type
^^^^
By default, ArgumentParser objects read command-line args in as simple strings.
However, quite often the command-line string should instead be interpreted as
another type, like a float, int or file. The
``type`` keyword argument of add_argument allows any necessary
type-checking and type-conversions to be performed. Many common built-in types
can be used directly as the value of the ``type`` argument:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('foo', type=int)
>>> parser.add_argument('bar', type=file)
>>> parser.parse_args('2 temp.txt'.split())
Namespace(bar=<open file 'temp.txt', mode 'r' at 0x...>, foo=2)
<
To ease the use of various types of files, the argparse module provides the
factory FileType which takes the ``mode=`` and ``bufsize=`` arguments of the
``file`` object. For example, ``FileType('w')`` can be used to create a
writable file:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('bar', type=argparse.FileType('w'))
>>> parser.parse_args(['out.txt'])
Namespace(bar=<open file 'out.txt', mode 'w' at 0x...>)
<
``type=`` can take any callable that takes a single string argument and returns
the type-converted value:: >
>>> import math
>>> def perfect_square(string):
...     value = int(string)
...     sqrt = math.sqrt(value)
...     if sqrt != int(sqrt):
...         msg = "%r is not a perfect square" % string
...         raise argparse.ArgumentTypeError(msg)
...     return value
...
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('foo', type=perfect_square)
>>> parser.parse_args('9'.split())
Namespace(foo=9)
>>> parser.parse_args('7'.split())
usage: PROG [-h] foo
PROG: error: argument foo: '7' is not a perfect square
<
The choices_ keyword argument may be more convenient for type checkers that
simply check against a range of values:: >
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('foo', type=int, choices=xrange(5, 10))
>>> parser.parse_args('7'.split())
Namespace(foo=7)
>>> parser.parse_args('11'.split())
usage: PROG [-h] {5,6,7,8,9}
PROG: error: argument foo: invalid choice: 11 (choose from 5, 6, 7, 8, 9)
<
See the choices_ section for more details.
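Because a ``type=`` callable is a plain function, it can also be exercised on
its own, which is convenient when testing. The converter below is a
hypothetical example; calling it directly shows the exception that parse_args
would otherwise turn into a usage error:

```python
import argparse


def positive_int(string):
    """Hypothetical converter: accept only integers greater than zero."""
    value = int(string)
    if value <= 0:
        raise argparse.ArgumentTypeError('%r is not a positive integer' % string)
    return value


parser = argparse.ArgumentParser(prog='PROG')
parser.add_argument('count', type=positive_int)
args = parser.parse_args(['3'])

# Direct call: no parser involved, the exception propagates as usual.
try:
    positive_int('0')
    rejected = False
except argparse.ArgumentTypeError:
    rejected = True
```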
choices
^^^^^^^
Some command-line args should be selected from a restricted set of values.
These can be handled by passing a container object as the ``choices`` keyword
argument to add_argument. When the command-line is parsed, arg values
will be checked, and an error message will be displayed if the arg was not one
of the acceptable values:: >
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('foo', choices='abc')
>>> parser.parse_args('c'.split())
Namespace(foo='c')
>>> parser.parse_args('X'.split())
usage: PROG [-h] {a,b,c}
PROG: error: argument foo: invalid choice: 'X' (choose from 'a', 'b', 'c')
<
Note that inclusion in the ``choices`` container is checked after any type_
conversions have been performed, so the type of the objects in the ``choices``
container should match the type_ specified:: >
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('foo', type=complex, choices=[1, 1j])
>>> parser.parse_args('1j'.split())
Namespace(foo=1j)
>>> parser.parse_args('-- -4'.split())
usage: PROG [-h] {1,1j}
PROG: error: argument foo: invalid choice: (-4+0j) (choose from 1, 1j)
<
Any object that supports the ``in`` operator can be passed as the ``choices``
value, so dict objects, set objects, custom containers,
etc. are all supported.
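As a sketch of non-list containers, the example below passes a dict and a set
as ``choices``. The ``SPEEDS`` mapping and the argument names are assumptions
made for illustration. For a dict, membership is tested against the keys, and
the parsed key can then be used to look up the associated value:

```python
import argparse

# Hypothetical mapping from a user-facing name to an internal setting.
SPEEDS = {'slow': 1, 'fast': 10}

parser = argparse.ArgumentParser(prog='PROG')
parser.add_argument('speed', choices=SPEEDS)   # checked against the dict keys
parser.add_argument('--level', type=int, choices=set([1, 2, 3]))
args = parser.parse_args(['fast', '--level', '2'])
delay = SPEEDS[args.speed]
```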
required
^^^^^^^^
In general, the argparse module assumes that flags like ``-f`` and ``--bar``
indicate {optional} arguments, which can always be omitted at the command-line.
To make an option {required}, ``True`` can be specified for the ``required=``
keyword argument to add_argument:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo', required=True)
>>> parser.parse_args(['--foo', 'BAR'])
Namespace(foo='BAR')
>>> parser.parse_args([])
usage: argparse.py [-h] [--foo FOO]
argparse.py: error: option --foo is required
<
As the example shows, if an option is marked as ``required``, parse_args
will report an error if that option is not present at the command line.
Note: Required options are generally considered bad form because users expect
{options} to be {optional}, and thus they should be avoided when possible.
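When the error is reported, parse_args prints the usage message to standard
error and then exits the program by raising ``SystemExit``. A program that
wants to handle this itself can catch that exception, as in this sketch:

```python
import argparse

parser = argparse.ArgumentParser(prog='PROG')
parser.add_argument('--foo', required=True)

ok = parser.parse_args(['--foo', 'BAR'])

# A missing required option makes parse_args print a usage message to
# stderr and raise SystemExit; the exit status is captured here.
try:
    parser.parse_args([])
    exit_status = None
except SystemExit as exc:
    exit_status = exc.code
```

The captured exit status is ``2``, the status argparse uses for command-line
errors.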
help
^^^^
The ``help`` value is a string containing a brief description of the argument.
When a user requests help (usually by using ``-h`` or ``--help`` at the
command-line), these ``help`` descriptions will be displayed with each
argument:: >
>>> parser = argparse.ArgumentParser(prog='frobble')
>>> parser.add_argument('--foo', action='store_true',
... help='foo the bars before frobbling')
>>> parser.add_argument('bar', nargs='+',
... help='one of the bars to be frobbled')
>>> parser.parse_args('-h'.split())
usage: frobble [-h] [--foo] bar [bar ...]
positional arguments:
bar one of the bars to be frobbled
optional arguments:
-h, --help show this help message and exit
--foo foo the bars before frobbling
<
The ``help`` strings can include various format specifiers to avoid repetition
of things like the program name or the argument default_. The available
specifiers include the program name, ``%(prog)s`` and most keyword arguments to
add_argument, e.g. ``%(default)s``, ``%(type)s``, etc.:: >
>>> parser = argparse.ArgumentParser(prog='frobble')
>>> parser.add_argument('bar', nargs='?', type=int, default=42,
... help='the bar to %(prog)s (default: %(default)s)')
>>> parser.print_help()
usage: frobble [-h] [bar]
positional arguments:
bar the bar to frobble (default: 42)
optional arguments:
-h, --help show this help message and exit
<
metavar
^^^^^^^
When ArgumentParser generates help messages, it needs some way to refer
to each expected argument. By default, ArgumentParser objects use the dest_
value as the "name" of each object. By default, for positional argument
actions, the dest_ value is used directly, and for optional argument actions,
the dest_ value is uppercased. So, a single positional argument with
``dest='bar'`` will be referred to as ``bar``. A single
optional argument ``--foo`` that should be followed by a single command-line arg
will be referred to as ``FOO``. An example:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo')
>>> parser.add_argument('bar')
>>> parser.parse_args('X --foo Y'.split())
Namespace(bar='X', foo='Y')
>>> parser.print_help()
usage: [-h] [--foo FOO] bar
positional arguments:
bar
optional arguments:
-h, --help show this help message and exit
--foo FOO
<
An alternative name can be specified with ``metavar``:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo', metavar='YYY')
>>> parser.add_argument('bar', metavar='XXX')
>>> parser.parse_args('X --foo Y'.split())
Namespace(bar='X', foo='Y')
>>> parser.print_help()
usage: [-h] [--foo YYY] XXX
positional arguments:
XXX
optional arguments:
-h, --help show this help message and exit
--foo YYY
<
Note that ``metavar`` only changes the {displayed} name - the name of the
attribute on the parse_args object is still determined by the dest_
value.
Different values of ``nargs`` may cause the metavar to be used multiple times.
Providing a tuple to ``metavar`` specifies a different display for each of the
arguments:: >
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('-x', nargs=2)
>>> parser.add_argument('--foo', nargs=2, metavar=('bar', 'baz'))
>>> parser.print_help()
usage: PROG [-h] [-x X X] [--foo bar baz]
optional arguments:
-h, --help show this help message and exit
-x X X
--foo bar baz
<
dest
^^^^
Most ArgumentParser actions add some value as an attribute of the
object returned by parse_args. The name of this attribute is determined
by the ``dest`` keyword argument of add_argument. For positional
argument actions, ``dest`` is normally supplied as the first argument to
add_argument:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('bar')
>>> parser.parse_args('XXX'.split())
Namespace(bar='XXX')
<
For optional argument actions, the value of ``dest`` is normally inferred from
the option strings. ArgumentParser generates the value of ``dest`` by
taking the first long option string and stripping away the initial ``'--'``
string. If no long option strings were supplied, ``dest`` will be derived from
the first short option string by stripping the initial ``'-'`` character. Any
internal ``'-'`` characters will be converted to ``'_'`` characters to make sure
the string is a valid attribute name. The examples below illustrate this
behavior:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('-f', '--foo-bar', '--foo')
>>> parser.add_argument('-x', '-y')
>>> parser.parse_args('-f 1 -x 2'.split())
Namespace(foo_bar='1', x='2')
>>> parser.parse_args('--foo 1 -y 2'.split())
Namespace(foo_bar='1', x='2')
<
``dest`` allows a custom attribute name to be provided:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo', dest='bar')
>>> parser.parse_args('--foo XXX'.split())
Namespace(bar='XXX')
<
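The inference rules and the explicit override can be put side by side. In this
sketch the option names are illustrative. The long option ``--dry-run`` yields
the attribute ``dry_run``, while ``-o`` is given an explicit ``dest``:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-n', '--dry-run', action='store_true')  # dest inferred: dry_run
parser.add_argument('-o', dest='output')                     # dest given explicitly
args = parser.parse_args(['--dry-run', '-o', 'out.txt'])
```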
The parse_args() method
-----------------------
ArgumentParser.parse_args([args], [namespace])~
Convert argument strings to objects and assign them as attributes of the
namespace. Return the populated namespace.
Previous calls to add_argument determine exactly what objects are
created and how they are assigned. See the documentation for
add_argument for details.
By default, the arg strings are taken from sys.argv, and a new empty
Namespace object is created for the attributes.
Option value syntax
^^^^^^^^^^^^^^^^^^^
The parse_args method supports several ways of specifying the value of
an option (if it takes one). In the simplest case, the option and its value are
passed as two separate arguments:: >
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('-x')
>>> parser.add_argument('--foo')
>>> parser.parse_args('-x X'.split())
Namespace(foo=None, x='X')
>>> parser.parse_args('--foo FOO'.split())
Namespace(foo='FOO', x=None)
<
For long options (options with names longer than a single character), the option
and value can also be passed as a single command line argument, using ``=`` to
separate them:: >
>>> parser.parse_args('--foo=FOO'.split())
Namespace(foo='FOO', x=None)
<
For short options (options only one character long), the option and its value
can be concatenated:: >
>>> parser.parse_args('-xX'.split())
Namespace(foo=None, x='X')
<
Several short options can be joined together, using only a single ``-`` prefix,
as long as only the last option (or none of them) requires a value:: >
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('-x', action='store_true')
>>> parser.add_argument('-y', action='store_true')
>>> parser.add_argument('-z')
>>> parser.parse_args('-xyzZ'.split())
Namespace(x=True, y=True, z='Z')
<
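As a quick check that the spellings above are interchangeable, this small sketch parses the same values written in two different ways and compares the results. It reuses the ``-x`` and ``--foo`` options from the examples.

```python
import argparse

parser = argparse.ArgumentParser(prog='PROG')
parser.add_argument('-x')
parser.add_argument('--foo')

# Option and value as separate arguments
separate = parser.parse_args(['-x', 'X', '--foo', 'FOO'])
# Short option concatenated with its value, long option joined with '='
joined = parser.parse_args(['-xX', '--foo=FOO'])
same_result = vars(separate) == vars(joined)
```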
Invalid arguments
^^^^^^^^^^^^^^^^^
While parsing the command-line, ``parse_args`` checks for a variety of errors,
including ambiguous options, invalid types, invalid options, wrong number of
positional arguments, etc. When it encounters such an error, it exits and
prints the error along with a usage message:: >
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('--foo', type=int)
>>> parser.add_argument('bar', nargs='?')
>>> # invalid type
>>> parser.parse_args(['--foo', 'spam'])
usage: PROG [-h] [--foo FOO] [bar]
PROG: error: argument --foo: invalid int value: 'spam'
>>> # invalid option
>>> parser.parse_args(['--bar'])
usage: PROG [-h] [--foo FOO] [bar]
PROG: error: no such option: --bar
>>> # wrong number of arguments
>>> parser.parse_args(['spam', 'badger'])
usage: PROG [-h] [--foo FOO] [bar]
PROG: error: extra arguments found: badger
<
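Because ``parse_args`` exits on error, code that needs to survive a bad command line (a test suite, for instance) can catch ``SystemExit``. The helper name ``try_parse`` below is hypothetical and only a sketch of that idea.

```python
import argparse

parser = argparse.ArgumentParser(prog='PROG')
parser.add_argument('--foo', type=int)

def try_parse(argv):
    """Return (namespace, None) on success or (None, exit_status) on failure."""
    try:
        return parser.parse_args(argv), None
    except SystemExit as exc:
        # parse_args has already printed the usage and error message to stderr
        return None, exc.code

ok, _ = try_parse(['--foo', '3'])
_, status = try_parse(['--foo', 'spam'])
```

For routine scripts it is usually simpler to let the exit happen; catching it is mainly useful when the parser is embedded in a larger program.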
Arguments containing ``"-"``
The ``parse_args`` method attempts to give errors whenever the user has clearly
made a mistake, but some situations are inherently ambiguous. For example, the
command-line arg ``'-1'`` could either be an attempt to specify an option or an
attempt to provide a positional argument. The ``parse_args`` method is cautious
here: positional arguments may only begin with ``'-'`` if they look like
negative numbers and there are no options in the parser that look like negative
numbers:: >
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('-x')
>>> parser.add_argument('foo', nargs='?')
>>> # no negative number options, so -1 is a positional argument
>>> parser.parse_args(['-x', '-1'])
Namespace(foo=None, x='-1')
>>> # no negative number options, so -1 and -5 are positional arguments
>>> parser.parse_args(['-x', '-1', '-5'])
Namespace(foo='-5', x='-1')
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('-1', dest='one')
>>> parser.add_argument('foo', nargs='?')
>>> # negative number options present, so -1 is an option
>>> parser.parse_args(['-1', 'X'])
Namespace(foo=None, one='X')
>>> # negative number options present, so -2 is an option
>>> parser.parse_args(['-2'])
usage: PROG [-h] [-1 ONE] [foo]
PROG: error: no such option: -2
>>> # negative number options present, so both -1s are options
>>> parser.parse_args(['-1', '-1'])
usage: PROG [-h] [-1 ONE] [foo]
PROG: error: argument -1: expected one argument
<
If you have positional arguments that must begin with ``'-'`` and don't look
like negative numbers, you can insert the pseudo-argument ``'--'`` which tells
``parse_args`` that everything after that is a positional argument:: >
>>> parser.parse_args(['--', '-f'])
Namespace(foo='-f', one=None)
<
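The two cases that succeed can be condensed into one short sketch. It assumes a parser with no options that look like negative numbers, as in the first example above.

```python
import argparse

parser = argparse.ArgumentParser(prog='PROG')
parser.add_argument('-x')
parser.add_argument('foo', nargs='?')

# No option looks like a negative number, so '-1' is accepted as the value of -x
negative = parser.parse_args(['-x', '-1'])
# '-f' would normally be read as an option; '--' forces it to be positional
forced = parser.parse_args(['--', '-f'])
```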
Argument abbreviations
The parse_args method allows long options to be abbreviated if the
abbreviation is unambiguous:: >
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('-bacon')
>>> parser.add_argument('-badger')
>>> parser.parse_args('-bac MMM'.split())
Namespace(bacon='MMM', badger=None)
>>> parser.parse_args('-bad WOOD'.split())
Namespace(bacon=None, badger='WOOD')
>>> parser.parse_args('-ba BA'.split())
usage: PROG [-h] [-bacon BACON] [-badger BADGER]
PROG: error: ambiguous option: -ba could match -badger, -bacon
<
An error is produced for arguments that could match more than one option.
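The same behaviour can be observed with conventional double-dash options. This sketch uses ``--bacon`` and ``--badger`` (an assumption; the example above uses single-dash spellings) and records whether the ambiguous prefix was rejected.

```python
import argparse

parser = argparse.ArgumentParser(prog='PROG')
parser.add_argument('--bacon')
parser.add_argument('--badger')

# '--bac' matches only --bacon, so it is accepted as an abbreviation
args = parser.parse_args(['--bac', 'MMM'])

# '--ba' matches both options; parse_args reports the ambiguity and exits
try:
    parser.parse_args(['--ba', 'BA'])
    ambiguous_rejected = False
except SystemExit:
    ambiguous_rejected = True
```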
Beyond ``sys.argv``
^^^^^^^^^^^^^^^^^^^
Sometimes it may be useful to have an ArgumentParser parse args other than those
of sys.argv. This can be accomplished by passing a list of strings to
``parse_args``. This is useful for testing at the interactive prompt:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument(
... 'integers', metavar='int', type=int, choices=xrange(10),
... nargs='+', help='an integer in the range 0..9')
>>> parser.add_argument(
... '--sum', dest='accumulate', action='store_const', const=sum,
... default=max, help='sum the integers (default: find the max)')
>>> parser.parse_args(['1', '2', '3', '4'])
Namespace(accumulate=<built-in function max>, integers=[1, 2, 3, 4])
>>> parser.parse_args('1 2 3 4 --sum'.split())
Namespace(accumulate=<built-in function sum>, integers=[1, 2, 3, 4])
<
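One common way to use this in a script is to give the entry point an optional argument list, so that tests can call it directly. The function names ``build_parser`` and ``main`` are illustrative choices, not part of argparse.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(prog='PROG')
    parser.add_argument('integers', metavar='int', type=int, nargs='+')
    parser.add_argument('--sum', dest='accumulate', action='store_const',
                        const=sum, default=max)
    return parser

def main(argv=None):
    # argv=None makes parse_args fall back to the arguments in sys.argv
    args = build_parser().parse_args(argv)
    return args.accumulate(args.integers)

largest = main(['1', '2', '3', '4'])
total = main(['1', '2', '3', '4', '--sum'])
```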
Custom namespaces
^^^^^^^^^^^^^^^^^
It may also be useful to have an ArgumentParser assign attributes to an
already existing object, rather than the newly-created Namespace object
that is normally used. This can be achieved by specifying the ``namespace=``
keyword argument:: >
>>> class C(object):
...     pass
...
>>> c = C()
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo')
>>> parser.parse_args(args=['--foo', 'BAR'], namespace=c)
>>> c.foo
'BAR'
<
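A slightly fuller sketch of the same idea follows. The ``Settings`` class is hypothetical; it stands for any existing object that already carries some attributes of its own.

```python
import argparse

class Settings(object):
    """Hypothetical pre-existing object holding program settings."""
    verbose = False

settings = Settings()
parser = argparse.ArgumentParser()
parser.add_argument('--foo')

# parse_args returns the object it was given after setting attributes on it
result = parser.parse_args(['--foo', 'BAR'], namespace=settings)
```

Attributes that already exist on the object and are unrelated to any argument, such as ``verbose`` here, are left alone.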
Other utilities
---------------
Sub-commands
^^^^^^^^^^^^
ArgumentParser.add_subparsers()~
Many programs split up their functionality into a number of sub-commands,
for example, the ``svn`` program can invoke sub-commands like ``svn
checkout``, ``svn update``, and ``svn commit``. Splitting up functionality
this way can be a particularly good idea when a program performs several
different functions which require different kinds of command-line arguments.
ArgumentParser supports the creation of such sub-commands with the
add_subparsers method. The add_subparsers method is normally
called with no arguments and returns a special action object. This object
has a single method, ``add_parser``, which takes a command name and any
ArgumentParser constructor arguments, and returns an
ArgumentParser object that can be modified as usual.
Some example usage:: >
>>> # create the top-level parser
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('--foo', action='store_true', help='foo help')
>>> subparsers = parser.add_subparsers(help='sub-command help')
>>>
>>> # create the parser for the "a" command
>>> parser_a = subparsers.add_parser('a', help='a help')
>>> parser_a.add_argument('bar', type=int, help='bar help')
>>>
>>> # create the parser for the "b" command
>>> parser_b = subparsers.add_parser('b', help='b help')
>>> parser_b.add_argument('--baz', choices='XYZ', help='baz help')
>>>
>>> # parse some arg lists
>>> parser.parse_args(['a', '12'])
Namespace(bar=12, foo=False)
>>> parser.parse_args(['--foo', 'b', '--baz', 'Z'])
Namespace(baz='Z', foo=True)
<
Note that the object returned by parse_args will only contain
attributes for the main parser and the subparser that was selected by the
command line (and not any other subparsers). So in the example above, when
the ``"a"`` command is specified, only the ``foo`` and ``bar`` attributes are
present, and when the ``"b"`` command is specified, only the ``foo`` and
``baz`` attributes are present.
Similarly, when a help message is requested from a subparser, only the help
for that particular parser will be printed. The help message will not
include parent parser or sibling parser messages. (A help message for each
subparser command, however, can be given by supplying the ``help=`` argument
to ``add_parser`` as above.)
:: >
>>> parser.parse_args(['--help'])
usage: PROG [-h] [--foo] {a,b} ...
positional arguments:
{a,b} sub-command help
a a help
b b help
optional arguments:
-h, --help show this help message and exit
--foo foo help
>>> parser.parse_args(['a', '--help'])
usage: PROG a [-h] bar
positional arguments:
bar bar help
optional arguments:
-h, --help show this help message and exit
>>> parser.parse_args(['b', '--help'])
usage: PROG b [-h] [--baz {X,Y,Z}]
optional arguments:
-h, --help show this help message and exit
--baz {X,Y,Z} baz help
<
The add_subparsers method also supports ``title`` and ``description``
keyword arguments. When either is present, the subparser's commands will
appear in their own group in the help output. For example:: >
>>> parser = argparse.ArgumentParser()
>>> subparsers = parser.add_subparsers(title='subcommands',
... description='valid subcommands',
... help='additional help')
>>> subparsers.add_parser('foo')
>>> subparsers.add_parser('bar')
>>> parser.parse_args(['-h'])
usage: [-h] {foo,bar} ...
optional arguments:
-h, --help show this help message and exit
subcommands:
valid subcommands
{foo,bar} additional help
<
One particularly effective way of handling sub-commands is to combine the use
of the add_subparsers method with calls to set_defaults so
that each subparser knows which Python function it should execute. For
example:: >
>>> # sub-command functions
>>> def foo(args):
...     print args.x * args.y
...
>>> def bar(args):
...     print '((%s))' % args.z
...
>>> # create the top-level parser
>>> parser = argparse.ArgumentParser()
>>> subparsers = parser.add_subparsers()
>>>
>>> # create the parser for the "foo" command
>>> parser_foo = subparsers.add_parser('foo')
>>> parser_foo.add_argument('-x', type=int, default=1)
>>> parser_foo.add_argument('y', type=float)
>>> parser_foo.set_defaults(func=foo)
>>>
>>> # create the parser for the "bar" command
>>> parser_bar = subparsers.add_parser('bar')
>>> parser_bar.add_argument('z')
>>> parser_bar.set_defaults(func=bar)
>>>
>>> # parse the args and call whatever function was selected
>>> args = parser.parse_args('foo 1 -x 2'.split())
>>> args.func(args)
2.0
>>>
>>> # parse the args and call whatever function was selected
>>> args = parser.parse_args('bar XYZYX'.split())
>>> args.func(args)
((XYZYX))
<
This way, you can let parse_args do the job of calling the
appropriate function after argument parsing is complete. Associating
functions with actions like this is typically the easiest way to handle the
different actions for each of your subparsers. However, if it is necessary
to check the name of the subparser that was invoked, the ``dest`` keyword
argument to the add_subparsers call will work:: >
>>> parser = argparse.ArgumentParser()
>>> subparsers = parser.add_subparsers(dest='subparser_name')
>>> subparser1 = subparsers.add_parser('1')
>>> subparser1.add_argument('-x')
>>> subparser2 = subparsers.add_parser('2')
>>> subparser2.add_argument('y')
>>> parser.parse_args(['2', 'frobble'])
Namespace(subparser_name='2', y='frobble')
<
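The two techniques, ``set_defaults(func=...)`` and ``dest``, can also be combined. The sketch below uses made-up command names (``scale`` and ``wrap``) and returns values instead of printing them so the result is easy to inspect.

```python
import argparse

def scale(args):
    return args.x * args.y

def wrap(args):
    return '((%s))' % args.z

parser = argparse.ArgumentParser(prog='PROG')
subparsers = parser.add_subparsers(dest='command')

parser_scale = subparsers.add_parser('scale')
parser_scale.add_argument('-x', type=int, default=1)
parser_scale.add_argument('y', type=float)
parser_scale.set_defaults(func=scale)

parser_wrap = subparsers.add_parser('wrap')
parser_wrap.add_argument('z')
parser_wrap.set_defaults(func=wrap)

def run(argv):
    args = parser.parse_args(argv)
    # args.command holds the sub-command name, args.func the selected function
    return args.command, args.func(args)

first = run(['scale', '1', '-x', '2'])
second = run(['wrap', 'XYZYX'])
```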
FileType objects
^^^^^^^^^^^^^^^^
FileType(mode='r', bufsize=None)~
The FileType factory creates objects that can be passed to the type
argument of ArgumentParser.add_argument. Arguments that have
FileType objects as their type will open command-line args as files
with the requested modes and buffer sizes:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--output', type=argparse.FileType('wb', 0))
>>> parser.parse_args(['--output', 'out'])
Namespace(output=<open file 'out', mode 'wb' at 0x...>)
<
FileType objects understand the pseudo-argument ``'-'`` and automatically
convert this into ``sys.stdin`` for readable FileType objects and
``sys.stdout`` for writable FileType objects:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('infile', type=argparse.FileType('r'))
>>> parser.parse_args(['-'])
Namespace(infile=<open file '<stdin>', mode 'r' at 0x...>)
<
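To make the point that the parsed value is an open file rather than a name, here is a minimal sketch that writes through the returned object. The temporary directory and the file name ``out.txt`` are assumptions made only so the example leaves nothing behind in the working directory.

```python
import argparse
import os
import tempfile

parser = argparse.ArgumentParser()
parser.add_argument('--output', type=argparse.FileType('w'))

path = os.path.join(tempfile.mkdtemp(), 'out.txt')  # throwaway location
args = parser.parse_args(['--output', path])
# args.output is an already opened file object, not the file name
args.output.write('hello\n')
args.output.close()

with open(path) as handle:
    written = handle.read()
```

The program remains responsible for closing the file, as the explicit ``close()`` call shows.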
Argument groups
^^^^^^^^^^^^^^^
ArgumentParser.add_argument_group([title], [description])~
By default, ArgumentParser groups command-line arguments into
"positional arguments" and "optional arguments" when displaying help
messages. When there is a better conceptual grouping of arguments than this
default one, appropriate groups can be created using the
add_argument_group method:: >
>>> parser = argparse.ArgumentParser(prog='PROG', add_help=False)
>>> group = parser.add_argument_group('group')
>>> group.add_argument('--foo', help='foo help')
>>> group.add_argument('bar', help='bar help')
>>> parser.print_help()
usage: PROG [--foo FOO] bar
group:
bar bar help
--foo FOO foo help
<
The add_argument_group method returns an argument group object which
has an ArgumentParser.add_argument method just like a regular
ArgumentParser. When an argument is added to the group, the parser
treats it just like a normal argument, but displays the argument in a
separate group for help messages. The add_argument_group method
accepts ``title`` and ``description`` arguments which can be used to
customize this display:: >
>>> parser = argparse.ArgumentParser(prog='PROG', add_help=False)
>>> group1 = parser.add_argument_group('group1', 'group1 description')
>>> group1.add_argument('foo', help='foo help')
>>> group2 = parser.add_argument_group('group2', 'group2 description')
>>> group2.add_argument('--bar', help='bar help')
>>> parser.print_help()
usage: PROG [--bar BAR] foo
group1:
group1 description
foo foo help
group2:
group2 description
--bar BAR bar help
<
Note that any arguments not in your user-defined groups will end up back in the
usual "positional arguments" and "optional arguments" sections.
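The effect of grouping can be inspected without printing anything by using ``format_help``. In this sketch the group title, its description and the option names are invented for illustration.

```python
import argparse

parser = argparse.ArgumentParser(prog='PROG')
io_group = parser.add_argument_group('input/output',
                                     'where data comes from and goes')
io_group.add_argument('source', help='file to read')
io_group.add_argument('--out', help='file to write')
# not added to any group, so it is listed in the default section for options
parser.add_argument('--verbose', action='store_true', help='print more detail')

help_text = parser.format_help()
```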
Mutual exclusion
^^^^^^^^^^^^^^^^
add_mutually_exclusive_group([required=False])~
Create a mutually exclusive group. argparse will make sure that only one of
the arguments in the mutually exclusive group was present on the command
line:: >
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> group = parser.add_mutually_exclusive_group()
>>> group.add_argument('--foo', action='store_true')
>>> group.add_argument('--bar', action='store_false')
>>> parser.parse_args(['--foo'])
Namespace(bar=True, foo=True)
>>> parser.parse_args(['--bar'])
Namespace(bar=False, foo=False)
>>> parser.parse_args(['--foo', '--bar'])
usage: PROG [-h] [--foo | --bar]
PROG: error: argument --bar: not allowed with argument --foo
<
The add_mutually_exclusive_group method also accepts a ``required``
argument, to indicate that at least one of the mutually exclusive arguments
is required:: >
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> group = parser.add_mutually_exclusive_group(required=True)
>>> group.add_argument('--foo', action='store_true')
>>> group.add_argument('--bar', action='store_false')
>>> parser.parse_args([])
usage: PROG [-h] (--foo | --bar)
PROG: error: one of the arguments --foo --bar is required
<
Note that currently mutually exclusive argument groups do not support the
``title`` and ``description`` arguments of add_argument_group.
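Both failure modes of a required mutually exclusive group can be exercised in a few lines. The option names ``--json`` and ``--csv`` and the helper ``exits`` are hypothetical.

```python
import argparse

parser = argparse.ArgumentParser(prog='PROG')
group = parser.add_mutually_exclusive_group(required=True)
group.add_argument('--json', action='store_true')
group.add_argument('--csv', action='store_true')

def exits(argv):
    """Return True if parse_args rejects argv by raising SystemExit."""
    try:
        parser.parse_args(argv)
    except SystemExit:
        return True
    return False

chosen = parser.parse_args(['--csv'])
both_rejected = exits(['--json', '--csv'])   # two members of the group
none_rejected = exits([])                    # required=True, none given
```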
Parser defaults
^^^^^^^^^^^^^^^
ArgumentParser.set_defaults(**kwargs)~
Most of the time, the attributes of the object returned by parse_args
will be fully determined by inspecting the command-line args and the argument
actions. ArgumentParser.set_defaults allows some additional
attributes that are determined without any inspection of the command-line to
be added:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('foo', type=int)
>>> parser.set_defaults(bar=42, baz='badger')
>>> parser.parse_args(['736'])
Namespace(bar=42, baz='badger', foo=736)
<
Note that parser-level defaults always override argument-level defaults:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo', default='bar')
>>> parser.set_defaults(foo='spam')
>>> parser.parse_args([])
Namespace(foo='spam')
<
Parser-level defaults can be particularly useful when working with multiple
parsers. See the ArgumentParser.add_subparsers method for an
example of this type.
ArgumentParser.get_default(dest)~
Get the default value for a namespace attribute, as set by either
ArgumentParser.add_argument or by
ArgumentParser.set_defaults:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo', default='badger')
>>> parser.get_default('foo')
'badger'
<
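The interaction between ``set_defaults``, ``get_default`` and values given on the command line can be seen in one place. The attribute names ``level`` and ``mode`` below are illustrative only.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--level', type=int, default=1)
# 'level' overrides the argument-level default; 'mode' has no matching argument
parser.set_defaults(level=5, mode='fast')

defaults = (parser.get_default('level'), parser.get_default('mode'))
untouched = parser.parse_args([])
explicit = parser.parse_args(['--level', '9'])
```

A value supplied on the command line still wins over either kind of default.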
Printing help
^^^^^^^^^^^^^
In most typical applications, parse_args will take care of formatting
and printing any usage or error messages. However, several formatting methods
are available:
ArgumentParser.print_usage([file])~
Print a brief description of how the ArgumentParser should be
invoked on the command line. If ``file`` is not present, ``sys.stdout`` is
assumed.
ArgumentParser.print_help([file])~
Print a help message, including the program usage and information about the
arguments registered with the ArgumentParser. If ``file`` is not
present, ``sys.stdout`` is assumed.
There are also variants of these methods that simply return a string instead of
printing it:
ArgumentParser.format_usage()~
Return a string containing a brief description of how the
ArgumentParser should be invoked on the command line.
ArgumentParser.format_help()~
Return a string containing a help message, including the program usage and
information about the arguments registered with the ArgumentParser.
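A short sketch of the difference between the ``format_*`` and ``print_*`` families follows. The program name and description are made up.

```python
import argparse
import sys

parser = argparse.ArgumentParser(prog='PROG', description='Demo parser.')
parser.add_argument('--foo', help='foo help')

usage = parser.format_usage()     # short summary, returned as a string
help_text = parser.format_help()  # usage plus description and argument help

# The print_* methods write to the file object they are given
parser.print_usage(sys.stderr)
```

The string-returning variants are handy when the text needs to be logged or embedded in other output instead of being written to a stream directly.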
Partial parsing
^^^^^^^^^^^^^^^
ArgumentParser.parse_known_args([args], [namespace])~
Sometimes a script may only parse a few of the command line arguments, passing
the remaining arguments on to another script or program. In these cases, the
parse_known_args method can be useful. It works much like
ArgumentParser.parse_args except that it does not produce an error when
extra arguments are present. Instead, it returns a two item tuple containing
the populated namespace and the list of remaining argument strings.
:: >
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo', action='store_true')
>>> parser.add_argument('bar')
>>> parser.parse_known_args(['--foo', '--badger', 'BAR', 'spam'])
(Namespace(bar='BAR', foo=True), ['--badger', 'spam'])
<
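A typical use is a wrapper script that handles its own flags and forwards the rest. In the sketch below the wrapper, the tool name ``compiler`` and its ``--opt-level`` option are all hypothetical.

```python
import argparse

parser = argparse.ArgumentParser(prog='wrapper')
parser.add_argument('--verbose', action='store_true')
parser.add_argument('tool')

# Hypothetical command line: our own flag, a tool name, then the tool's arguments
argv = ['--verbose', 'compiler', '--opt-level', 'main.c']
known, remaining = parser.parse_known_args(argv)
# 'remaining' could now be forwarded to the named tool, e.g. via subprocess
```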
Customizing file parsing
ArgumentParser.convert_arg_line_to_args(arg_line)~
Arguments that are read from a file (see the ``fromfile_prefix_chars``
keyword argument to the ArgumentParser constructor) are read one
argument per line. convert_arg_line_to_args can be overridden for
fancier reading.
This method takes a single argument ``arg_line`` which is a string read from
the argument file. It returns a list of arguments parsed from this string.
The method is called once per line read from the argument file, in order.
A useful override of this method is one that treats each space-separated word
as an argument:: >
def convert_arg_line_to_args(self, arg_line):
    # split() without arguments never yields empty strings
    return arg_line.split()
<
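For completeness, here is a sketch of how such an override might be wired up end to end. The subclass name ``WordArgumentParser``, the ``@`` prefix character and the contents of the argument file are assumptions chosen for the example.

```python
import argparse
import os
import tempfile

class WordArgumentParser(argparse.ArgumentParser):
    """Treat every whitespace-separated word in an argument file as an argument."""
    def convert_arg_line_to_args(self, arg_line):
        return arg_line.split()

parser = WordArgumentParser(fromfile_prefix_chars='@')
parser.add_argument('--foo')
parser.add_argument('names', nargs='*')

# Write a throwaway argument file with several arguments per line
path = os.path.join(tempfile.mkdtemp(), 'args.txt')
with open(path, 'w') as handle:
    handle.write('--foo 1\nalpha beta\n')

args = parser.parse_args(['@' + path])
```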
Upgrading optparse code
-----------------------
Originally, the argparse module had attempted to maintain compatibility with
optparse. However, optparse was difficult to extend transparently, particularly
with the changes required to support the new ``nargs=`` specifiers and better
usage messages. When most everything in optparse had either been copy-pasted
over or monkey-patched, it no longer seemed practical to try to maintain the
backwards compatibility.
A partial upgrade path from optparse to argparse:
* Replace all ``add_option()`` calls with ArgumentParser.add_argument calls.
* Replace ``options, args = parser.parse_args()`` with ``args =
parser.parse_args()`` and add additional ArgumentParser.add_argument calls for the
positional arguments.
* Replace callback actions and the ``callback_*`` keyword arguments with
``type`` or ``action`` arguments.
* Replace string names for ``type`` keyword arguments with the corresponding
type objects (e.g. int, float, complex, etc).
* Replace optparse.Values with Namespace and
optparse.OptionError and optparse.OptionValueError with
ArgumentError.
* Replace strings with implicit arguments such as ``%default`` or ``%prog`` with
the standard python syntax to use dictionaries to format strings, that is,
``%(default)s`` and ``%(prog)s``.
* Replace the OptionParser constructor ``version`` argument with a call to
``parser.add_argument('--version', action='version', version='<the version>')``
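Several of the steps above can be seen in a small before-and-after sketch. The ``--count`` option and the file names are invented; both parsers are given the same argument list so the results can be compared.

```python
import argparse
import optparse

argv = ['--count', '3', 'a.txt', 'b.txt']

# Before: optparse returns options and leftover positional args separately
old_parser = optparse.OptionParser()
old_parser.add_option('--count', type='int', default=1,
                      help='repeat count [default: %default]')
options, positional = old_parser.parse_args(argv)

# After: a type object instead of a type name, %(default)s instead of %default,
# and an explicit add_argument call for the positional arguments
new_parser = argparse.ArgumentParser()
new_parser.add_argument('--count', type=int, default=1,
                        help='repeat count [default: %(default)s]')
new_parser.add_argument('files', nargs='*')
args = new_parser.parse_args(argv)
```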
==============================================================================
*py2stdlib-array*
array~
:synopsis: Space efficient arrays of uniformly typed numeric values.
This module defines an object type which can compactly represent an array of
basic values: characters, integers, floating point numbers. Arrays are sequence
types and behave very much like lists, except that the type of objects stored in
them is constrained. The type is specified at object creation time by using a
type code, which is a single character. The following type codes are
defined:
+-----------+----------------+-------------------+-----------------------+
| Type code | C Type | Python Type | Minimum size in bytes |
+===========+================+===================+=======================+
| ``'c'`` | char | character | 1 |
+-----------+----------------+-------------------+-----------------------+
| ``'b'`` | signed char | int | 1 |
+-----------+----------------+-------------------+-----------------------+
| ``'B'`` | unsigned char | int | 1 |
+-----------+----------------+-------------------+-----------------------+
| ``'u'`` | Py_UNICODE | Unicode character | 2 (see note) |
+-----------+----------------+-------------------+-----------------------+
| ``'h'`` | signed short | int | 2 |
+-----------+----------------+-------------------+-----------------------+
| ``'H'`` | unsigned short | int | 2 |
+-----------+----------------+-------------------+-----------------------+
| ``'i'`` | signed int | int | 2 |
+-----------+----------------+-------------------+-----------------------+
| ``'I'`` | unsigned int | long | 2 |
+-----------+----------------+-------------------+-----------------------+
| ``'l'`` | signed long | int | 4 |
+-----------+----------------+-------------------+-----------------------+
| ``'L'`` | unsigned long | long | 4 |
+-----------+----------------+-------------------+-----------------------+
| ``'f'`` | float | float | 4 |
+-----------+----------------+-------------------+-----------------------+
| ``'d'`` | double | float | 8 |
+-----------+----------------+-------------------+-----------------------+
.. note::
The ``'u'`` typecode corresponds to Python's unicode character. On narrow
Unicode builds this is 2-bytes, on wide builds this is 4-bytes.
The actual representation of values is determined by the machine architecture
(strictly speaking, by the C implementation). The actual size can be accessed
through the itemsize attribute. The values stored for ``'L'`` and
``'I'`` items will be represented as Python long integers when retrieved,
because Python's plain integer type cannot represent the full range of C's
unsigned (long) integers.
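Since the table only gives minimum sizes, the actual sizes on a given machine can be queried through itemsize. A minimal sketch, using a handful of the type codes listed above:

```python
from array import array

# (type code, minimum size in bytes) pairs taken from the table above
minimums = (('b', 1), ('h', 2), ('i', 2), ('l', 4), ('f', 4), ('d', 8))

# Actual item size per type code on this platform
sizes = dict((code, array(code).itemsize) for code, _ in minimums)

# The table is a lower bound; the platform may use a larger size
all_at_least_minimum = all(sizes[code] >= minimum for code, minimum in minimums)
```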
The module defines the following type:
array(typecode[, initializer])~
A new array whose items are restricted by {typecode}, and initialized
from the optional {initializer} value, which must be a list, string, or iterable
over elements of the appropriate type.
.. versionchanged:: 2.4
Formerly, only lists or strings were accepted.
If given a list or string, the initializer is passed to the new array's
fromlist, fromstring, or fromunicode method (see below)
to add initial items to the array. Otherwise, the iterable initializer is
passed to the extend method.
ArrayType~
Obsolete alias for array (|py2stdlib-array|).
Array objects support the ordinary sequence operations of indexing, slicing,
concatenation, and multiplication. When using slice assignment, the assigned
value must be an array object with the same type code; in all other cases,
TypeError is raised. Array objects also implement the buffer interface,
and may be used wherever buffer objects are supported.
The following data items and methods are also supported:
array.typecode~
The typecode character used to create the array.
array.itemsize~
The length in bytes of one array item in the internal representation.
array.append(x)~
Append a new item with value {x} to the end of the array.
array.buffer_info()~
Return a tuple ``(address, length)`` giving the current memory address and the
length in elements of the buffer used to hold array's contents. The size of the
memory buffer in bytes can be computed as ``array.buffer_info()[1] *
array.itemsize``. This is occasionally useful when working with low-level (and
inherently unsafe) I/O interfaces that require memory addresses, such as certain
ioctl operations. The returned numbers are valid as long as the array
exists and no length-changing operations are applied to it.
.. note:: >
When using array objects from code written in C or C++ (the only way to
effectively make use of this information), it makes more sense to use the buffer
interface supported by array objects. This method is maintained for backward
compatibility and should be avoided in new code. The buffer interface is
documented in bufferobjects.
<
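The byte-size computation mentioned above looks like this in practice (a sketch; the array contents are arbitrary):

```python
from array import array

a = array('d', [1.0, 2.0, 3.14])
address, length = a.buffer_info()
# length counts elements, so multiply by itemsize to get the size in bytes
size_in_bytes = length * a.itemsize
```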
array.byteswap()~
"Byteswap" all items of the array. This is only supported for values which are
1, 2, 4, or 8 bytes in size; for other types of values, RuntimeError is
raised. It is useful when reading data from a file written on a machine with a
different byte order.
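A small sketch of the effect, using unsigned short items. It assumes nothing about the machine's byte order, only that swapping twice is a round trip.

```python
from array import array

a = array('H', [1, 258])      # unsigned short items
original = a.tolist()
a.byteswap()
swapped = a.tolist()
a.byteswap()                  # swapping twice restores the original values
restored = a.tolist()
```

On platforms where an unsigned short is two bytes, 1 (0x0001) becomes 256 (0x0100) and 258 (0x0102) becomes 513 (0x0201) after the first swap.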
array.count(x)~
Return the number of occurrences of {x} in the array.
array.extend(iterable)~
Append items from {iterable} to the end of the array. If {iterable} is another
array, it must have {exactly} the same type code; if not, TypeError will
be raised. If {iterable} is not an array, it must be iterable and its elements
must be the right type to be appended to the array.
.. versionchanged:: 2.4
Formerly, the argument could only be another array.
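The type-code rule can be demonstrated as follows (a minimal sketch with arbitrary values):

```python
from array import array

a = array('i', [1, 2])
a.extend(array('i', [3, 4]))   # same type code: accepted
a.extend([5, 6])               # any iterable of suitable items is accepted too

try:
    a.extend(array('d', [7.0]))  # different type code
    mismatch_rejected = False
except TypeError:
    mismatch_rejected = True
```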
array.fromfile(f, n)~
Read {n} items (as machine values) from the file object {f} and append them to
the end of the array. If less than {n} items are available, EOFError is
raised, but the items that were available are still inserted into the array.
{f} must be a real built-in file object; something else with a read
method won't do.
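The EOFError behaviour is easiest to see with a round trip through a real file. This sketch writes three items with the tofile method (documented further down) and then asks fromfile for five; the temporary file location is an assumption made for the example.

```python
import os
import tempfile
from array import array

path = os.path.join(tempfile.mkdtemp(), 'values.bin')  # throwaway file
with open(path, 'wb') as out:
    array('d', [1.0, 2.0, 3.0]).tofile(out)

loaded = array('d')
with open(path, 'rb') as src:
    try:
        loaded.fromfile(src, 5)    # ask for more items than the file holds
        hit_eof = False
    except EOFError:
        hit_eof = True             # the available items were still appended
```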
array.fromlist(list)~
Append items from the list. This is equivalent to ``for x in list:
a.append(x)`` except that if there is a type error, the array is unchanged.
array.fromstring(s)~
Appends items from the string, interpreting the string as an array of machine
values (as if it had been read from a file using the fromfile method).
array.fromunicode(s)~
Extends this array with data from the given unicode string. The array must
be a type ``'u'`` array; otherwise a ValueError is raised. Use
``array.fromstring(unicodestring.encode(enc))`` to append Unicode data to an
array of some other type.
array.index(x)~
Return the smallest {i} such that {i} is the index of the first occurrence of
{x} in the array.
array.insert(i, x)~
Insert a new item with value {x} in the array before position {i}. Negative
values are treated as being relative to the end of the array.
array.pop([i])~
Removes the item with the index {i} from the array and returns it. The optional
argument defaults to ``-1``, so that by default the last item is removed and
returned.
array.read(f, n)~
.. deprecated:: 1.5.1
Use the fromfile method.
Read {n} items (as machine values) from the file object {f} and append them to
the end of the array. If less than {n} items are available, EOFError is
raised, but the items that were available are still inserted into the array.
{f} must be a real built-in file object; something else with a read
method won't do.
array.remove(x)~
Remove the first occurrence of {x} from the array.
array.reverse()~
Reverse the order of the items in the array.
array.tofile(f)~
Write all items (as machine values) to the file object {f}.
array.tolist()~
Convert the array to an ordinary list with the same items.
array.tostring()~
Convert the array to an array of machine values and return the string
representation (the same sequence of bytes that would be written to a file by
the tofile method.)
array.tounicode()~
Convert the array to a unicode string. The array must be a type ``'u'`` array;
otherwise a ValueError is raised. Use ``array.tostring().decode(enc)`` to
obtain a unicode string from an array of some other type.
array.write(f)~
.. deprecated:: 1.5.1
Use the tofile method.
Write all items (as machine values) to the file object {f}.
When an array object is printed or converted to a string, it is represented as
``array(typecode, initializer)``. The {initializer} is omitted if the array is
empty, otherwise it is a string if the {typecode} is ``'c'``, otherwise it is a
list of numbers. The string is guaranteed to be able to be converted back to an
array with the same type and value using eval, so long as the
array (|py2stdlib-array|) function has been imported using ``from array import array``.
Examples:: >
array('l')
array('c', 'hello world')
array('u', u'hello \u2641')
array('l', [1, 2, 3, 4, 5])
array('d', [1.0, 2.0, 3.14])
<
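The eval round trip described above can be sketched like this:

```python
from array import array

a = array('l', [1, 2, 3, 4, 5])
text = repr(a)
# eval can rebuild the array because the name 'array' is in scope here
clone = eval(text)
```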
.. seealso::
Module struct (|py2stdlib-struct|)
Packing and unpacking of heterogeneous binary data.
Module xdrlib (|py2stdlib-xdrlib|)
Packing and unpacking of External Data Representation (XDR) data as used in some
remote procedure call systems.
`The Numerical Python Manual <http://numpy.sourceforge.net/numdoc/HTML/numdoc.htm>`_
The Numeric Python extension (NumPy) defines another array type; see
http://numpy.sourceforge.net/ for further information about Numerical Python.
(A PDF version of the NumPy manual is available at
http://numpy.sourceforge.net/numdoc/numdoc.pdf).
==============================================================================
*py2stdlib-ast*
ast~
:synopsis: Abstract Syntax Tree classes and manipulation.
.. versionadded:: 2.5
The low-level ``_ast`` module containing only the node classes.
.. versionadded:: 2.6
The high-level ``ast`` module containing all helpers.
The ast (|py2stdlib-ast|) module helps Python applications to process trees of the Python
abstract syntax grammar. The abstract syntax itself might change with each
Python release; this module helps to find out programmatically what the current
grammar looks like.
An abstract syntax tree can be generated by passing ast.PyCF_ONLY_AST as
a flag to the compile built-in function, or using the parse
helper provided in this module. The result will be a tree of objects whose
classes all inherit from ast.AST. An abstract syntax tree can be
compiled into a Python code object using the built-in compile function.
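Both routes to a tree, and the way back to executable code, fit in a few lines. This is a sketch only; the source string and the ``'<unknown>'`` file name are arbitrary.

```python
import ast

source = 'result = 6 * 7'
# Two routes to the same kind of tree
tree = ast.parse(source)
tree_via_compile = compile(source, '<unknown>', 'exec', ast.PyCF_ONLY_AST)

# A tree can be compiled to a code object and executed like ordinary source
namespace = {}
exec(compile(tree, '<unknown>', 'exec'), namespace)
```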
Node classes
------------
AST~
This is the base of all AST node classes. The actual node classes are
derived from the Parser/Python.asdl file (see the Abstract Grammar section
below). They are defined in the _ast C module and re-exported in
ast (|py2stdlib-ast|).
There is one class defined for each left-hand side symbol in the abstract
grammar (for example, ast.stmt or ast.expr). In addition,
there is one class defined for each constructor on the right-hand side; these
classes inherit from the classes for the left-hand side trees. For example,
ast.BinOp inherits from ast.expr. For production rules
with alternatives (aka "sums"), the left-hand side class is abstract: only
instances of specific constructor nodes are ever created.
_fields~
Each concrete class has an attribute _fields which gives the names
of all child nodes.
Each instance of a concrete class has one attribute for each child node,
of the type as defined in the grammar. For example, ast.BinOp
instances have an attribute left of type ast.expr.
If these attributes are marked as optional in the grammar (using a
question mark), the value might be ``None``. If the attributes can have
zero-or-more values (marked with an asterisk), the values are represented
as Python lists. All possible attributes must be present and have valid
values when compiling an AST with compile.
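A short sketch of how these attributes look in practice. The sample
expressions and variable names are made up for illustration; only _fields and
the node attributes themselves come from the module.

```python
import ast

# _fields is a class attribute naming the child nodes, in grammar order.
binop_fields = ast.BinOp._fields          # ('left', 'op', 'right')

# mode="eval" yields an Expression whose body is the parsed expression.
binop = ast.parse("a + b", mode="eval").body
left_is_expr = isinstance(binop.left, ast.expr)

# Zero-or-more fields are lists; optional fields may be None.
module = ast.parse("def f():\n    return\n")
body_is_list = isinstance(module.body, list)
bare_return = module.body[0].body[0]      # the Return node
return_value = bare_return.value          # None: "return" has no expression
```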
lineno~
col_offset~
Instances of ast.expr and ast.stmt subclasses have
lineno and col_offset attributes. The lineno is
the line number of source text (1-indexed so the first line is line 1) and
the col_offset is the UTF-8 byte offset of the first token that
generated the node. The UTF-8 offset is recorded because the parser uses
UTF-8 internally.
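As a small illustration of these two attributes (the two-line source is an
arbitrary example, and the names second and addition are local to this
sketch):

```python
import ast

tree = ast.parse("x = 1\ny = x + 2\n")
second = tree.body[1]          # the statement "y = x + 2"
addition = second.value        # the BinOp node for "x + 2"

stmt_pos = (second.lineno, second.col_offset)      # line 2, column 0
expr_pos = (addition.lineno, addition.col_offset)  # line 2, column 4
```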
The constructor of a class ast.T parses its arguments as follows:
* If there are positional arguments, there must be as many as there are items
in T._fields; they will be assigned as attributes of these names.
* If there are keyword arguments, they will set the attributes of the same
names to the given values.
For example, to create and populate an ast.UnaryOp node, you could
use :: >
    node = ast.UnaryOp()
    node.op = ast.USub()
    node.operand = ast.Num()
    node.operand.n = 5
    node.operand.lineno = 0
    node.operand.col_offset = 0
    node.lineno = 0
    node.col_offset = 0
<
or the more compact :: >
    node = ast.UnaryOp(ast.USub(), ast.Num(5, lineno=0, col_offset=0),
                       lineno=0, col_offset=0)
<
.. versionadded:: 2.6
The constructor as explained above was added. In Python 2.5 nodes had
to be created by calling the class constructor without arguments and
setting the attributes afterwards.
Abstract Grammar
----------------
The module defines a string constant ``__version__`` which is the decimal
Subversion revision number of the file shown below.
The abstract grammar is currently defined as follows:
.. literalinclude:: ../../Parser/Python.asdl
ast (|py2stdlib-ast|) Helpers
------------------
.. versionadded:: 2.6
Apart from the node classes, the ast (|py2stdlib-ast|) module defines these utility functions
and classes for traversing abstract syntax trees:
parse(expr, filename='<unknown>', mode='exec')~
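Parse source code into an AST node. Despite the parameter name, {expr} may be
any source text accepted by the chosen {mode} (a whole module for the default
``'exec'``), not only a single expression. Equivalent to ``compile(expr,
filename, mode, ast.PyCF_ONLY_AST)``.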
literal_eval(node_or_string)~
Safely evaluate an expression node or a string containing a Python
expression. The string or node provided may only consist of the following
Python literal structures: strings, numbers, tuples, lists, dicts, booleans,
and ``None``.
This can be used for safely evaluating strings containing Python expressions
from untrusted sources without the need to parse the values oneself.
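A minimal sketch of both sides of this behaviour: a nested literal is
evaluated, while an expression that calls a function is refused. The input
strings are invented for the example.

```python
import ast

# Nested literal structures are evaluated without running any code.
config = ast.literal_eval("{'ports': [80, 443], 'debug': False, 'owner': None}")

# Anything beyond literals, such as a function call, is rejected.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
```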
get_docstring(node, clean=True)~
Return the docstring of the given {node} (which must be a
FunctionDef, ClassDef or Module node), or ``None``
if it has no docstring. If {clean} is true, clean up the docstring's
indentation with inspect.cleandoc.
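For illustration, the sketch below fetches a docstring with and without
cleaning. The function area and its docstring are hypothetical sample input.

```python
import ast

source = (
    "def area(w, h):\n"
    "    '''Return the area.\n"
    "\n"
    "        Indented detail line.\n"
    "    '''\n"
    "    return w * h\n"
)
func = ast.parse(source).body[0]

cleaned = ast.get_docstring(func)                 # indentation normalised
raw = ast.get_docstring(func, clean=False)        # no clean-up applied
missing = ast.get_docstring(ast.parse("x = 1"))   # None: no docstring
```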
fix_missing_locations(node)~
When you compile a node tree with compile, the compiler expects
lineno and col_offset attributes for every node that supports
them. This is rather tedious to fill in for generated nodes, so this helper
adds these attributes recursively where not already set, by setting them to
the values of the parent node. It works recursively starting at {node}.
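One way this helper might be used is sketched below: a parsed expression gets
one operand replaced by a freshly built node, and the missing position
information is filled in before compiling. The expression, the replacement
name y and the "<generated>" filename are assumptions of this sketch.

```python
import ast

tree = ast.parse("x + 1", mode="eval")

# Swap the left operand for a freshly built node; it has no position yet.
tree.body.left = ast.Name(id="y", ctx=ast.Load())

# Fill in lineno/col_offset on the new node from its parent.
ast.fix_missing_locations(tree)

result = eval(compile(tree, "<generated>", "eval"), {"y": 41})   # 41 + 1
```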
increment_lineno(node, n=1)~
Increment the line number of each node in the tree starting at {node} by {n}.
This is useful to "move code" to a different location in a file.
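A brief sketch, assuming a two-statement snippet that is to be treated as if
it started ten lines further down its file:

```python
import ast

tree = ast.parse("a = 1\nb = 2\n")
before = [stmt.lineno for stmt in tree.body]   # [1, 2]

# Shift every line number in the tree by 10.
ast.increment_lineno(tree, n=10)
after = [stmt.lineno for stmt in tree.body]    # [11, 12]
```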
copy_location(new_node, old_node)~
Copy source location (lineno and col_offset) from {old_node}
to {new_node} if possible, and return {new_node}.
iter_fields(node)~
Yield a tuple of ``(fieldname, value)`` for each field in ``node._fields``
that is present on {node}.
iter_child_nodes(node)~
Yield all direct child nodes of {node}, that is, all fields that are nodes
and all items of fields that are lists of nodes.
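The two iterators above can be compared on a single node. This is only a
sketch; the expression "a + b" is an arbitrary example.

```python
import ast

binop = ast.parse("a + b", mode="eval").body

# (fieldname, value) pairs, in the order given by BinOp._fields.
field_names = [name for name, value in ast.iter_fields(binop)]

# Direct children only: the two operands and the operator node.
child_types = [type(child).__name__ for child in ast.iter_child_nodes(binop)]
```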
walk(node)~
Recursively yield all child nodes of {node}, in no specified order. This is
useful if you only want to modify nodes in place and don't care about the
context.
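For example, every name used in a statement could be collected like this
(the statement is a made-up example):

```python
import ast

tree = ast.parse("total = price * qty + tax")

# walk() promises no particular order, so sort before relying on it.
names = sorted(node.id for node in ast.walk(tree) if isinstance(node, ast.Name))
```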
NodeVisitor()~
A node visitor base class that walks the abstract syntax tree and calls a
visitor function for every node found. This function may return a value
which is forwarded by the visit method.
This class is meant to be subclassed, with the subclass adding visitor
methods.
visit(node)~
Visit a node. The default implementation calls the method called
self.visit_{classname} where {classname} is the name of the node
class, or generic_visit if that method doesn't exist.
generic_visit(node)~
This visitor calls visit on all children of the node.
Note that child nodes of nodes that have a custom visitor method won't be
visited unless the visitor calls generic_visit or visits them
itself.
Don't use the NodeVisitor if you want to apply changes to nodes
during traversal. For this a special visitor exists
(NodeTransformer) that allows modifications.
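A minimal read-only visitor might look like the sketch below. The class name
FunctionLister and the sample source are hypothetical; the point is the call
to generic_visit, without which nested definitions would not be reached.

```python
import ast

class FunctionLister(ast.NodeVisitor):
    """Hypothetical visitor that records the name of every function."""

    def __init__(self):
        self.names = []

    def visit_FunctionDef(self, node):
        self.names.append(node.name)
        self.generic_visit(node)   # keep descending into nested definitions

source = (
    "def outer():\n"
    "    def inner():\n"
    "        pass\n"
    "\n"
    "def other():\n"
    "    pass\n"
)
lister = FunctionLister()
lister.visit(ast.parse(source))
```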
NodeTransformer()~
A NodeVisitor subclass that walks the abstract syntax tree and
allows modification of nodes.
The NodeTransformer will walk the AST and use the return value of
the visitor methods to replace or remove the old node. If the return value
of the visitor method is ``None``, the node will be removed from its
location, otherwise it is replaced with the return value. The return value
may be the original node in which case no replacement takes place.
Here is an example transformer that rewrites all occurrences of name lookups
(``foo``) to ``data['foo']``:: >
    from ast import *   # the example uses the ast names unqualified
    class RewriteName(NodeTransformer):
        def visit_Name(self, node):
            return copy_location(Subscript(
                value=Name(id='data', ctx=Load()),
                slice=Index(value=Str(s=node.id)),
                ctx=node.ctx
            ), node)
<
Keep in mind that if the node you're operating on has child nodes you must
either transform the child nodes yourself or call the generic_visit
method for the node first.
For nodes that were part of a collection of statements (that applies to all
statement nodes), the visitor may also return a list of nodes rather than
just a single node.
Usually you use the transformer like this:: >
node = YourTransformer().visit(node)
<
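Removal of nodes can be sketched in the same way. The transformer below is a
hypothetical example (StripAsserts is not part of the module) that returns
``None`` for every assert statement, so the statements disappear from the
tree before it is compiled.

```python
import ast

class StripAsserts(ast.NodeTransformer):
    """Hypothetical transformer: drop every assert statement."""

    def visit_Assert(self, node):
        return None   # returning None removes the node from its parent

source = "x = 1\nassert x == 2, 'never checked'\ny = x + 1\n"
tree = StripAsserts().visit(ast.parse(source))

namespace = {}
exec(compile(tree, "<stripped>", "exec"), namespace)
```

No new nodes are created here, so no location fix-up is needed before
compiling.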
dump(node, annotate_fields=True, include_attributes=False)~
Return a formatted dump of the tree in {node}. This is mainly useful for
debugging purposes. The returned string will show the names and the values
for fields. This makes the code impossible to evaluate, so if evaluation is
wanted {annotate_fields} must be set to False. Attributes such as line
numbers and column offsets are not dumped by default. If this is wanted,
{include_attributes} can be set to ``True``.
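The effect of the two flags can be seen by dumping the same tree three ways.
The exact text produced varies between Python versions, so this sketch does
not show the output.

```python
import ast

tree = ast.parse("x + 1", mode="eval")

annotated = ast.dump(tree)                                  # field names shown
positional = ast.dump(tree, annotate_fields=False)          # values only
with_positions = ast.dump(tree, include_attributes=True)    # adds lineno etc.
```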
==============================================================================
*py2stdlib-asynchat*
asynchat~
:synopsis: Support for asynchronous command/response protocols.
This module builds on the asyncore (|py2stdlib-asyncore|) infrastructure, simplifying
asynchronous clients and servers and making it easier to handle protocols
whose elements are terminated by arbitrary strings, or are of variable length.
asynchat (|py2stdlib-asynchat|) defines the abstract class async_chat that you
subclass, providing implementations of the collect_incoming_data and
found_terminator methods. It uses the same asynchronous loop as
asyncore (|py2stdlib-asyncore|), and the two types of channel,
asyncore.dispatcher and asynchat.async_chat, can freely be mixed in the
channel map.
Typically an asyncore.dispatcher server channel generates new
asynchat.async_chat channel objects as it receives incoming
connection requests.
async_chat()~
This class is an abstract subclass of asyncore.dispatcher. To make
practical use of the code you must subclass async_chat, providing
meaningful collect_incoming_data and found_terminator
methods.
The asyncore.dispatcher methods can be used, although not all make
sense in a message/response context.
Like asyncore.dispatcher, async_chat defines a set of
events that are generated by an analysis of socket conditions after a
select (|py2stdlib-select|) call. Once the polling loop has been started the
async_chat object's methods are called by the event-processing
framework with no action on the part of the programmer.
Two class attributes can be modified, to improve performance, or possibly
even to conserve memory.
ac_in_buffer_size~
The asynchronous input buffer size (default ``4096``).
ac_out_buffer_size~
The asynchronous output buffer size (default ``4096``).
Unlike asyncore.dispatcher, async_chat allows you to
define a first-in-first-out queue (fifo) of {producers}. A producer need
have only one method, more, which should return data to be
transmitted on the channel.
The producer indicates exhaustion ({i.e.} that it contains no more data) by
having its more method return the empty string. At this point the
async_chat object removes the producer from the fifo and starts
using the next producer, if any. When the producer fifo is empty the
handle_write method does nothing. You use the channel object's
set_terminator method to describe how to recognize the end of, or
an important breakpoint in, an incoming transmission from the remote
endpoint.
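The producer protocol described above needs nothing from the module itself,
so it can be sketched in plain Python. The class chunk_producer and its
chunk_size parameter are hypothetical names invented for this example; the
loop at the end only imitates how a channel would drain the producer.

```python
class chunk_producer(object):
    """Hypothetical producer handing out a string in fixed-size pieces."""

    def __init__(self, data, chunk_size=4):
        self.data = data
        self.chunk_size = chunk_size

    def more(self):
        # Return the next piece; once the data runs out this returns the
        # empty string, which is the exhaustion signal described above.
        chunk = self.data[:self.chunk_size]
        self.data = self.data[self.chunk_size:]
        return chunk

producer = chunk_producer("HTTP/1.0 200 OK")
pieces = []
while True:
    piece = producer.more()
    if not piece:
        break
    pieces.append(piece)
```

An object of this kind is what push_with_producer expects to receive.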
To build a functioning async_chat subclass your input methods
collect_incoming_data and found_terminator must handle the
data that the channel receives asynchronously. The methods are described
below.
async_chat.close_when_done()~
Pushes a ``None`` on to the producer fifo. When this producer is popped off
the fifo it causes the channel to be closed.
async_chat.collect_incoming_data(data)~
Called with {data} holding an arbitrary amount of received data. The
default method, which must be overridden, raises a
NotImplementedError exception.
async_chat.discard_buffers()~
In emergencies this method will discard any data held in the input and/or
output buffers and the producer fifo.
async_chat.found_terminator()~
Called when the incoming data stream matches the termination condition set
by set_terminator. The default method, which must be overridden,
raises a NotImplementedError exception. The buffered input data
should be available via an instance attribute.
async_chat.get_terminator()~
Returns the current terminator for the channel.
async_chat.push(data)~
Pushes data on to the channel's fifo to ensure its transmission.
This is all you need to do to have the channel write the data out to the
network, although it is possible to use your own producers in more complex
schemes to implement encryption and chunking, for example.
async_chat.push_with_producer(producer)~
Takes a producer object and adds it to the producer fifo associated with
the channel. When all currently-pushed producers have been exhausted the
channel will consume this producer's data by calling its more
method and send the data to the remote endpoint.
async_chat.set_terminator(term)~
Sets the terminating condition to be recognized on the channel. ``term``
may be any of three types of value, corresponding to three different ways
to handle incoming protocol data.
+-----------+---------------------------------------------+
| term | Description |
+===========+=============================================+
| {string} | Will call found_terminator when the |
| | string is found in the input stream |
+-----------+---------------------------------------------+
| {integer} | Will call found_terminator when the |
| | indicated number of characters have been |
| | received |
+-----------+---------------------------------------------+
| ``None`` | The channel continues to collect data |
| | forever |
+-----------+---------------------------------------------+
Note that any data following the terminator will be available for reading
by the channel after found_terminator is called.
asynchat - Auxiliary Classes
----------------------------
fifo([list=None])~
A fifo holding data which has been pushed by the application but
not yet popped for writing to the channel. A fifo is a list used
to hold data and/or producers until they are required. If the {list}
argument is provided then it should contain producers or data items to be
written to the channel.
is_empty()~
Returns ``True`` if and only if the fifo is empty.
first()~
Returns the least-recently pushed item from the fifo.
push(data)~
Adds the given data (which may be a string or a producer object) to the
producer fifo.
pop()~
If the fifo is not empty, returns ``True, first()``, deleting the popped
item. Returns ``False, None`` for an empty fifo.
asynchat Example
----------------
The following partial example shows how HTTP requests can be read with
async_chat. A web server might create an
http_request_handler object for each incoming client connection.
Notice that initially the channel terminator is set to match the blank line at
the end of the HTTP headers, and a flag indicates that the headers are being
read.
Once the headers have been read, if the request is of type POST (indicating
that further data are present in the input stream) then the
``Content-Length:`` header is used to set a numeric terminator to read the
right amount of data from the channel.
The handle_request method is called once all relevant input has been
marshalled, after setting the channel terminator to ``None`` to ensure that
any extraneous data sent by the web client are ignored. :: >
    class http_request_handler(asynchat.async_chat):
        def __init__(self, sock, addr, sessions, log):
            asynchat.async_chat.__init__(self, sock=sock)
            self.addr = addr
            self.sessions = sessions
            self.ibuffer = []
            self.obuffer = ""
            self.set_terminator("\r\n\r\n")
            self.reading_headers = True
            self.handling = False
            self.cgi_data = None
            self.log = log
        def collect_incoming_data(self, data):
            """Buffer the data"""
            self.ibuffer.append(data)
        def found_terminator(self):
            if self.reading_headers:
                self.reading_headers = False
                self.parse_headers("".join(self.ibuffer))
                self.ibuffer = []
                if self.op.upper() == "POST":
                    clen = self.headers.getheader("content-length")
                    self.set_terminator(int(clen))
                else:
                    self.handling = True
                    self.set_terminator(None)
                    self.handle_request()
            elif not self.handling:
                self.set_terminator(None) # browsers sometimes over-send
                self.cgi_data = parse(self.headers, "".join(self.ibuffer))
                self.handling = True
                self.ibuffer = []
                self.handle_request()
<
==============================================================================
*py2stdlib-asyncore*
asyncore~
:synopsis: A base class for developing asynchronous socket handling
services.
.. heavily adapted from original documentation by Sam Rushing
This module provides the basic infrastructure for writing asynchronous socket
service clients and servers.
There are only two ways to have a program on a single processor do "more than
one thing at a time." Multi-threaded programming is the simplest and most
popular way to do it, but there is another very different technique, that lets
you have nearly all the advantages of multi-threading, without actually using
multiple threads. It's really only practical if your program is largely I/O
bound. If your program is processor bound, then pre-emptive scheduled threads
are probably what you really need. Network servers are rarely processor
bound, however.
If your operating system supports the select (|py2stdlib-select|) system call in its I/O
library (and nearly all do), then you can use it to juggle multiple
communication channels at once; doing other work while your I/O is taking
place in the "background." Although this strategy can seem strange and
complex, especially at first, it is in many ways easier to understand and
control than multi-threaded programming. The asyncore (|py2stdlib-asyncore|) module solves
many of the difficult problems for you, making the task of building
sophisticated high-performance network servers and clients a snap. For
"conversational" applications and protocols the companion asynchat (|py2stdlib-asynchat|)
module is invaluable.
The basic idea behind both modules is to create one or more network
{channels}, instances of class asyncore.dispatcher and
asynchat.async_chat. Creating the channels adds them to a global
map, used by the loop function if you do not provide it with your own
{map}.
Once the initial channels have been created, calling the loop function
activates channel service, which continues until the last channel (including
any that have been added to the map during asynchronous service) is closed.
loop([timeout[, use_poll[, map[,count]]]])~
Enter a polling loop that terminates after {count} passes or once all open
channels have been closed. All arguments are optional. The {count}
parameter defaults to None, resulting in the loop terminating only when all
channels have been closed. The {timeout} argument sets the timeout
parameter for the appropriate select (|py2stdlib-select|) or poll call, measured
in seconds; the default is 30 seconds. The {use_poll} parameter, if true,
indicates that poll should be used in preference to select (|py2stdlib-select|)
(the default is ``False``).
The {map} parameter is a dictionary whose items are the channels to watch.
As channels are closed they are deleted from their map. If {map} is
omitted, a global map is used. Channels (instances of
asyncore.dispatcher, asynchat.async_chat and subclasses
thereof) can freely be mixed in the map.
dispatcher()~
The dispatcher class is a thin wrapper around a low-level socket
object. To make it more useful, it has a few methods for event-handling
which are called from the asynchronous loop. Otherwise, it can be treated
as a normal non-blocking socket object.
The firing of low-level events at certain times or in certain connection
states tells the asynchronous loop that certain higher-level events have
taken place. For example, if we have asked for a socket to connect to
another host, we know that the connection has been made when the socket
becomes writable for the first time (at this point you know that you may
write to it with the expectation of success). The implied higher-level
events are:
+----------------------+----------------------------------------+
| Event | Description |
+======================+========================================+
| ``handle_connect()`` | Implied by the first read or write |
| | event |
+----------------------+----------------------------------------+
| ``handle_close()`` | Implied by a read event with no data |
| | available |
+----------------------+----------------------------------------+
| ``handle_accept()`` | Implied by a read event on a listening |
| | socket |
+----------------------+----------------------------------------+
During asynchronous processing, each mapped channel's readable and
writable methods are used to determine whether the channel's socket
should be added to the list of channels passed to select (|py2stdlib-select|) or
poll for read and write events.
Thus, the set of channel events is larger than the basic socket events. The
full set of methods that can be overridden in your subclass follows:
handle_read()~
Called when the asynchronous loop detects that a read call on the
channel's socket will succeed.
handle_write()~
Called when the asynchronous loop detects that a writable socket can be
written. Often this method will implement the necessary buffering for
performance. For example:: >
    def handle_write(self):
        sent = self.send(self.buffer)
        self.buffer = self.buffer[sent:]
<
handle_expt()~
Called when there is out of band (OOB) data for a socket connection. This
will almost never happen, as OOB is tenuously supported and rarely used.
handle_connect()~
Called when the active opener's socket actually makes a connection. Might
send a "welcome" banner, or initiate a protocol negotiation with the
remote endpoint, for example.
handle_close()~
Called when the socket is closed.
handle_error()~
Called when an exception is raised and not otherwise handled. The default
version prints a condensed traceback.
handle_accept()~
Called on listening channels (passive openers) when a connection can be
established with a new remote endpoint that has issued a connect
call for the local endpoint.
readable()~
Called each time around the asynchronous loop to determine whether a
channel's socket should be added to the list on which read events can
occur. The default method simply returns ``True``, indicating that by
default, all channels will be interested in read events.
writable()~
Called each time around the asynchronous loop to determine whether a
channel's socket should be added to the list on which write events can
occur. The default method simply returns ``True``, indicating that by
default, all channels will be interested in write events.
In addition, each channel delegates or extends many of the socket methods.
Most of these are nearly identical to their socket partners.
create_socket(family, type)~
This is identical to the creation of a normal socket, and will use the
same options for creation. Refer to the socket (|py2stdlib-socket|) documentation for
information on creating sockets.
connect(address)~
As with the normal socket object, {address} is a tuple with the first
element the host to connect to, and the second the port number.
send(data)~
Send {data} to the remote end-point of the socket.
recv(buffer_size)~
Read at most {buffer_size} bytes from the socket's remote end-point. An
empty string implies that the channel has been closed from the other end.
listen(backlog)~
Listen for connections made to the socket. The {backlog} argument
specifies the maximum number of queued connections and should be at least
1; the maximum value is system-dependent (usually 5).
bind(address)~
Bind the socket to {address}. The socket must not already be bound. (The
format of {address} depends on the address family --- refer to the
socket (|py2stdlib-socket|) documentation for more information.) To mark
the socket as re-usable (setting the SO_REUSEADDR option), call
the dispatcher object's set_reuse_addr method.
accept()~
Accept a connection. The socket must be bound to an address and listening
for connections. The return value is a pair ``(conn, address)`` where
{conn} is a {new} socket object usable to send and receive data on the
connection, and {address} is the address bound to the socket on the other
end of the connection.
close()~
Close the socket. All future operations on the socket object will fail.
The remote end-point will receive no more data (after queued data is
flushed). Sockets are automatically closed when they are
garbage-collected.
file_dispatcher()~
A file_dispatcher takes a file descriptor or file object along with an
optional map argument and wraps it for use with the poll or
loop functions. If provided a file object or anything with a
fileno method, that method will be called and passed to the
file_wrapper constructor. Availability: UNIX.
file_wrapper()~
A file_wrapper takes an integer file descriptor and calls os.dup to
duplicate the handle so that the original handle may be closed independently
of the file_wrapper. This class implements sufficient methods to emulate a
socket for use by the file_dispatcher class. Availability: UNIX.
asyncore Example basic HTTP client
----------------------------------
Here is a very basic HTTP client that uses the dispatcher class to
implement its socket handling:: >
    import asyncore, socket
    class http_client(asyncore.dispatcher):
        def __init__(self, host, path):
            asyncore.dispatcher.__init__(self)
            self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
            self.connect( (host, 80) )
            self.buffer = 'GET %s HTTP/1.0\r\n\r\n' % path
        def handle_connect(self):
            pass
        def handle_close(self):
            self.close()
        def handle_read(self):
            print self.recv(8192)
        def writable(self):
            return (len(self.buffer) > 0)
        def handle_write(self):
            sent = self.send(self.buffer)
            self.buffer = self.buffer[sent:]
    c = http_client('www.python.org', '/')
    asyncore.loop()
<
==============================================================================
*py2stdlib-atexit*
atexit~
:synopsis: Register and execute cleanup functions.
.. versionadded:: 2.0
The atexit (|py2stdlib-atexit|) module defines a single function to register cleanup
functions. Functions thus registered are automatically executed upon normal
interpreter termination.
Note: the functions registered via this module are not called when the program
is killed by a signal, when a Python fatal internal error is detected, or when
os._exit is called.
.. index:: single: exitfunc (in sys)
This is an alternate interface to the functionality provided by the
``sys.exitfunc`` variable.
Note: This module is unlikely to work correctly when used with other code that
sets ``sys.exitfunc``. In particular, other core Python modules are free to use
atexit (|py2stdlib-atexit|) without the programmer's knowledge. Authors who use
``sys.exitfunc`` should convert their code to use atexit (|py2stdlib-atexit|) instead. The
simplest way to convert code that sets ``sys.exitfunc`` is to import
atexit (|py2stdlib-atexit|) and register the function that had been bound to ``sys.exitfunc``.
register(func[, *args[, **kargs]])~
Register {func} as a function to be executed at termination. Any optional
arguments that are to be passed to {func} must be passed as arguments to
register.
At normal program termination (for instance, if sys.exit is called or
the main module's execution completes), all functions registered are called in
last in, first out order. The assumption is that lower level modules will
normally be imported before higher level modules and thus must be cleaned up
later.
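Because exit handlers only run when an interpreter terminates, the sketch
below observes the last in, first out order by starting a child interpreter
and capturing what it prints. The helper note, the labels and the use of
subprocess are choices made for this illustration only.

```python
import subprocess
import sys

# Source for a child interpreter; its exit handlers fire when it terminates.
child_source = (
    "import atexit, sys\n"
    "def note(label):\n"
    "    sys.stdout.write(label + '\\n')\n"
    "atexit.register(note, 'registered first')\n"
    "atexit.register(note, 'registered second')\n"
)

output = subprocess.check_output([sys.executable, "-c", child_source])
lines = output.decode("ascii").splitlines()
# The handler registered last is reported first.
```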
If an exception is raised during execution of the exit handlers, a traceback is
printed (unless SystemExit is raised) and the exception information is
saved. After all exit handlers have had a chance to run the last exception to
be raised is re-raised.
.. versionchanged:: 2.6
This function now returns {func} which makes it possible to use it as a
decorator without binding the original name to ``None``.
.. seealso::
Module readline (|py2stdlib-readline|)
Useful example of atexit (|py2stdlib-atexit|) to read and write readline (|py2stdlib-readline|) history files.
atexit (|py2stdlib-atexit|) Example
---------------------
The following simple example demonstrates how a module can initialize a counter
from a file when it is imported and save the counter's updated value
automatically when the program terminates without relying on the application
making an explicit call into this module at termination. :: >
    try:
        _count = int(open("/tmp/counter").read())
    except IOError:
        _count = 0
    def incrcounter(n):
        global _count
        _count = _count + n
    def savecounter():
        open("/tmp/counter", "w").write("%d" % _count)
    import atexit
    atexit.register(savecounter)
<
Positional and keyword arguments may also be passed to register to be
passed along to the registered function when it is called:: >
    def goodbye(name, adjective):
        print 'Goodbye, %s, it was %s to meet you.' % (name, adjective)
    import atexit
    atexit.register(goodbye, 'Donny', 'nice')
    # or:
    atexit.register(goodbye, adjective='nice', name='Donny')
<
Usage as a decorator:: >
    import atexit
    @atexit.register
    def goodbye():
        print "You are now leaving the Python sector."
<
This obviously only works with functions that don't take arguments.
==============================================================================
*py2stdlib-audioop*
audioop~
:synopsis: Manipulate raw audio data.
The audioop (|py2stdlib-audioop|) module contains some useful operations on sound fragments.
It operates on sound fragments consisting of signed integer samples 8, 16 or 32
bits wide, stored in Python strings. This is the same format as used by the
al (|py2stdlib-al|) and sunaudiodev (|py2stdlib-sunaudiodev|) modules. All scalar items are integers, unless
specified otherwise.
.. index::
single: Intel/DVI ADPCM
single: ADPCM, Intel/DVI
single: a-LAW
single: u-LAW
This module provides support for a-LAW, u-LAW and Intel/DVI ADPCM encodings.
.. This para is mostly here to provide an excuse for the index entries...
A few of the more complicated operations only take 16-bit samples; otherwise the
sample size (in bytes) is always a parameter of the operation.
The module defines the following variables and functions:
error~
This exception is raised on all errors, such as unknown number of bytes per
sample, etc.
add(fragment1, fragment2, width)~
Return a fragment which is the addition of the two samples passed as parameters.
{width} is the sample width in bytes, either ``1``, ``2`` or ``4``. Both
fragments should have the same length.
adpcm2lin(adpcmfragment, width, state)~
Decode an Intel/DVI ADPCM coded fragment to a linear fragment. See the
description of lin2adpcm for details on ADPCM coding. Return a tuple
``(sample, newstate)`` where the sample has the width specified in {width}.
alaw2lin(fragment, width)~
Convert sound fragments in a-LAW encoding to linearly encoded sound fragments.
a-LAW encoding always uses 8-bit samples, so {width} refers only to the sample
width of the output fragment here.
.. versionadded:: 2.5
avg(fragment, width)~
Return the average over all samples in the fragment.
avgpp(fragment, width)~
Return the average peak-peak value over all samples in the fragment. No
filtering is done, so the usefulness of this routine is questionable.
bias(fragment, width, bias)~
Return a fragment that is the original fragment with a bias added to each
sample.
cross(fragment, width)~
Return the number of zero crossings in the fragment passed as an argument.
findfactor(fragment, reference)~
Return a factor {F} such that ``rms(add(fragment, mul(reference, -F)))`` is
minimal, i.e., return the factor with which you should multiply {reference} to
make it match as well as possible to {fragment}. The fragments should both
contain 2-byte samples.
The time taken by this routine is proportional to ``len(fragment)``.
findfit(fragment, reference)~
Try to match {reference} as well as possible to a portion of {fragment} (which
should be the longer fragment). This is (conceptually) done by taking slices
out of {fragment}, using findfactor to compute the best match, and
minimizing the result. The fragments should both contain 2-byte samples.
Return a tuple ``(offset, factor)`` where {offset} is the (integer) offset into
{fragment} where the optimal match started and {factor} is the (floating-point)
factor as per findfactor.
findmax(fragment, length)~
Search {fragment} for a slice of length {length} samples (not bytes!) with
maximum energy, i.e., return {i} for which ``rms(fragment[i*2:(i+length)*2])``
is maximal. The fragment should contain 2-byte samples.
The routine takes time proportional to ``len(fragment)``.
getsample(fragment, width, index)~
Return the value of sample {index} from the fragment.
lin2adpcm(fragment, width, state)~
Convert samples to 4 bit Intel/DVI ADPCM encoding. ADPCM coding is an adaptive
coding scheme, whereby each 4 bit number is the difference between one sample
and the next, divided by a (varying) step. The Intel/DVI ADPCM algorithm has
been selected for use by the IMA, so it may well become a standard.
{state} is a tuple containing the state of the coder. The coder returns a tuple
``(adpcmfrag, newstate)``, and the {newstate} should be passed to the next call
of lin2adpcm. In the initial call, ``None`` can be passed as the state.
{adpcmfrag} is the ADPCM coded fragment packed 2 4-bit values per byte.
lin2alaw(fragment, width)~
Convert samples in the audio fragment to a-LAW encoding and return this as a
Python string. a-LAW is an audio encoding format whereby you get a dynamic
range of about 13 bits using only 8 bit samples. It is used by the Sun audio
hardware, among others.
.. versionadded:: 2.5
lin2lin(fragment, width, newwidth)~
Convert samples between 1-, 2- and 4-byte formats.
.. note:: >
In some audio formats, such as .WAV files, 16 and 32 bit samples are
signed, but 8 bit samples are unsigned. So when converting to 8 bit wide
samples for these formats, you need to also add 128 to the result::
new_frames = audioop.lin2lin(frames, old_width, 1)
new_frames = audioop.bias(new_frames, 1, 128)
The same, in reverse, has to be applied when converting from 8 to 16 or 32
bit width samples.
<
lin2ulaw(fragment, width)~
Convert samples in the audio fragment to u-LAW encoding and return this as a
Python string. u-LAW is an audio encoding format whereby you get a dynamic
range of about 14 bits using only 8 bit samples. It is used by the Sun audio
hardware, among others.
minmax(fragment, width)~
Return a tuple consisting of the minimum and maximum values of all samples in
the sound fragment.
max(fragment, width)~
Return the maximum of the {absolute value} of all samples in a fragment.
maxpp(fragment, width)~
Return the maximum peak-peak value in the sound fragment.
mul(fragment, width, factor)~
Return a fragment that has all samples in the original fragment multiplied by
the floating-point value {factor}. Overflow is silently ignored.
ratecv(fragment, width, nchannels, inrate, outrate, state[, weightA[, weightB]])~
Convert the frame rate of the input fragment.
{state} is a tuple containing the state of the converter. The converter returns
a tuple ``(newfragment, newstate)``, and {newstate} should be passed to the next
call of ratecv. The initial call should pass ``None`` as the state.
The {weightA} and {weightB} arguments are parameters for a simple digital filter
and default to ``1`` and ``0`` respectively.
reverse(fragment, width)~
Reverse the samples in a fragment and return the modified fragment.
rms(fragment, width)~
Return the root-mean-square of the fragment, i.e. ``sqrt(sum(S_i^2)/n)``.
This is a measure of the power in an audio signal.
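To make the formula concrete, here is a minimal pure-Python sketch of the same
computation for 2-byte samples. It only illustrates the arithmetic and is not
the audioop implementation; the helper name rms16 and the little-endian sample
layout are assumptions made for this example.

```python
import math
import struct


def rms16(fragment):
    """Root-mean-square of signed 16-bit little-endian samples (illustration only)."""
    count = len(fragment) // 2
    if count == 0:
        return 0.0
    # Unpack the raw bytes into a tuple of signed 16-bit integers.
    samples = struct.unpack('<%dh' % count, fragment[:count * 2])
    # sqrt(sum(S_i^2) / n), exactly as in the formula above.
    return math.sqrt(sum(s * s for s in samples) / float(count))


fragment = struct.pack('<4h', 3, -3, 3, -3)
print(rms16(fragment))
```

Because every sample in the test fragment has the same magnitude, the result
equals that magnitude, which makes the sketch easy to check by hand.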
tomono(fragment, width, lfactor, rfactor)~
Convert a stereo fragment to a mono fragment. The left channel is multiplied by
{lfactor} and the right channel by {rfactor} before adding the two channels to
give a mono signal.
tostereo(fragment, width, lfactor, rfactor)~
Generate a stereo fragment from a mono fragment. Each pair of samples in the
stereo fragment is computed from the mono sample, whereby left channel samples
are multiplied by {lfactor} and right channel samples by {rfactor}.
ulaw2lin(fragment, width)~
Convert sound fragments in u-LAW encoding to linearly encoded sound fragments.
u-LAW encoding always uses 8-bit samples, so {width} refers only to the sample
width of the output fragment here.
Note that operations such as mul or max make no distinction
between mono and stereo fragments, i.e. all samples are treated equally. If this
is a problem the stereo fragment should be split into two mono fragments first
and recombined later. Here is an example of how to do that:: >
def mul_stereo(sample, width, lfactor, rfactor):
    # Split the stereo fragment into its left and right channels.
    lsample = audioop.tomono(sample, width, 1, 0)
    rsample = audioop.tomono(sample, width, 0, 1)
    # Scale each mono channel by its own factor.
    lsample = audioop.mul(lsample, width, lfactor)
    rsample = audioop.mul(rsample, width, rfactor)
    # Rebuild two stereo fragments and add them back together.
    lsample = audioop.tostereo(lsample, width, 1, 0)
    rsample = audioop.tostereo(rsample, width, 0, 1)
    return audioop.add(lsample, rsample, width)
<
If you use the ADPCM coder to build network packets and you want your protocol
to be stateless (i.e. to be able to tolerate packet loss) you should not only
transmit the data but also the state. Note that you should send the {initial}
state (the one you passed to lin2adpcm) along to the decoder, not the
final state (as returned by the coder). If you want to use
struct.Struct to store the state in binary you can code the first
element (the predicted value) in 16 bits and the second (the delta index) in 8.
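A minimal sketch of that state encoding follows. The format string, the
network byte order and the helper names pack_state and unpack_state are
assumptions made for this example, and the state values are made up; the
sketch does not handle the ``None`` state of an initial call.

```python
import struct

# Assumed layout: signed 16-bit predicted value followed by an unsigned
# 8-bit delta index, in network byte order (no padding).
STATE_FORMAT = '!hB'


def pack_state(state):
    """Pack an ADPCM coder state tuple into 3 bytes."""
    value, index = state
    return struct.pack(STATE_FORMAT, value, index)


def unpack_state(data):
    """Inverse of pack_state; returns a (value, index) tuple."""
    return struct.unpack(STATE_FORMAT, data)


initial_state = (-1200, 37)          # hypothetical state values
header = pack_state(initial_state)   # would be sent ahead of the ADPCM data
print(len(header))
```

The receiver would unpack the header and hand the resulting tuple to the
decoder as its state argument.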
The ADPCM coders have never been tried against other ADPCM coders, only against
themselves. It could well be that I misinterpreted the standards in which case
they will not be interoperable with the respective standards.
The find* routines might look a bit funny at first sight. They are
primarily meant to do echo cancellation. A reasonably fast way to do this is to
pick the most energetic piece of the output sample, locate that in the input
sample and subtract the whole output sample from the input sample:: >
def echocancel(outputdata, inputdata):
    pos = audioop.findmax(outputdata, 800)    # one tenth second
    out_test = outputdata[pos*2:]
    in_test = inputdata[pos*2:]
    ipos, factor = audioop.findfit(in_test, out_test)
    # Optional (for better cancellation):
    # factor = audioop.findfactor(in_test[ipos*2:ipos*2+len(out_test)],
    #              out_test)
    prefill = '\0'*(pos+ipos)*2
    postfill = '\0'*(len(inputdata)-len(prefill)-len(outputdata))
    outputdata = prefill + audioop.mul(outputdata,2,-factor) + postfill
    return audioop.add(inputdata, outputdata, 2)
==============================================================================
*py2stdlib-autogil*
autoGIL~
:platform: Mac
:synopsis: Global Interpreter Lock handling in event loops.
:deprecated:
The autoGIL (|py2stdlib-autogil|) module provides a function installAutoGIL that
automatically locks and unlocks Python's Global Interpreter Lock when
running an event loop.
.. note::
This module has been removed in Python 3.x.
AutoGILError~
Raised if the observer callback cannot be installed, for example because the
current thread does not have a run loop.
installAutoGIL()~
Install an observer callback in the event loop (CFRunLoop) for the current
thread, that will lock and unlock the Global Interpreter Lock (GIL) at
appropriate times, allowing other Python threads to run while the event loop is
idle.
Availability: OSX 10.1 or later.
==============================================================================
*py2stdlib-applesingle*
applesingle~
:platform: Mac
:synopsis: Rudimentary decoder for AppleSingle format files.
:deprecated:
.. deprecated:: 2.6
buildtools (|py2stdlib-buildtools|) --- Helper module for BuildApplet and Friends
---------------------------------------------------------------
==============================================================================
*py2stdlib-base64*
base64~
:synopsis: RFC 3548: Base16, Base32, Base64 Data Encodings
.. index::
pair: base64; encoding
single: MIME; base64 encoding
This module provides data encoding and decoding as specified in RFC 3548.
This standard defines the Base16, Base32, and Base64 algorithms for encoding and
decoding arbitrary binary strings into text strings that can be safely sent by
email, used as parts of URLs, or included as part of an HTTP POST request. The
encoding algorithm is not the same as the uuencode program.
There are two interfaces provided by this module. The modern interface supports
encoding and decoding string objects using all three alphabets. The legacy
interface provides for encoding and decoding to and from file-like objects as
well as strings, but only using the Base64 standard alphabet.
The modern interface, which was introduced in Python 2.4, provides:
b64encode(s[, altchars])~
Encode a string using Base64.
{s} is the string to encode. Optional {altchars} must be a string of at least
length 2 (additional characters are ignored) which specifies an alternative
alphabet for the ``+`` and ``/`` characters. This allows an application to e.g.
generate URL or filesystem safe Base64 strings. The default is ``None``, for
which the standard Base64 alphabet is used.
The encoded string is returned.
b64decode(s[, altchars])~
Decode a Base64 encoded string.
{s} is the string to decode. Optional {altchars} must be a string of at least
length 2 (additional characters are ignored) which specifies the alternative
alphabet used instead of the ``+`` and ``/`` characters.
The decoded string is returned. A TypeError is raised if {s} is
incorrectly padded or if there are non-alphabet characters present in the
string.
standard_b64encode(s)~
Encode string {s} using the standard Base64 alphabet.
standard_b64decode(s)~
Decode string {s} using the standard Base64 alphabet.
urlsafe_b64encode(s)~
Encode string {s} using a URL-safe alphabet, which substitutes ``-`` for
``+`` and ``_`` for ``/`` in the standard Base64 alphabet. The result
can still contain ``=``.
urlsafe_b64decode(s)~
Decode string {s} using a URL-safe alphabet, which substitutes ``-`` for
``+`` and ``_`` for ``/`` in the standard Base64 alphabet.
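The effect of {altchars} and of the URL-safe variants can be seen with a short
sketch. The two input bytes are chosen only because their standard encoding
happens to use both ``+`` and ``/``; the variable names are illustrative.

```python
import base64

raw = b'\xfb\xff'   # encodes to a string containing both '+' and '/'

standard = base64.b64encode(raw)
custom = base64.b64encode(raw, b'-_')      # altchars replaces '+' and '/'
urlsafe = base64.urlsafe_b64encode(raw)    # same substitution, fixed alphabet

print(standard)
print(custom)
print(urlsafe)

# Decoding needs the same alphabet that was used for encoding.
roundtrip = base64.b64decode(custom, b'-_')
```

Note that the padding character ``=`` is unchanged in all three results.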
b32encode(s)~
Encode a string using Base32. {s} is the string to encode. The encoded string
is returned.
b32decode(s[, casefold[, map01]])~
Decode a Base32 encoded string.
{s} is the string to decode. Optional {casefold} is a flag specifying whether a
lowercase alphabet is acceptable as input. For security purposes, the default
is ``False``.
RFC 3548 allows for optional mapping of the digit 0 (zero) to the letter O
(oh), and for optional mapping of the digit 1 (one) to either the letter I (eye)
or letter L (el). The optional argument {map01}, when not ``None``, specifies
which letter the digit 1 should be mapped to (when {map01} is not ``None``, the
digit 0 is always mapped to the letter O). For security purposes the default is
``None``, so that 0 and 1 are not allowed in the input.
The decoded string is returned. A TypeError is raised if {s} is
incorrectly padded or if there are non-alphabet characters present in the
string.
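A small sketch of {casefold} and {map01} in use. The "mistyped" input is
constructed for the example by replacing the letters O and I with the digits
0 and 1, which is the situation {map01} is meant to tolerate.

```python
import base64

encoded = base64.b32encode(b'foobar')

# casefold=True accepts a lowercase alphabet.
from_lower = base64.b32decode(encoded.lower(), casefold=True)

# Simulate a transcription that used the digits 0 and 1 for the letters O and I.
mistyped = encoded.replace(b'O', b'0').replace(b'I', b'1')

# map01 names the letter that the digit 1 stands for; 0 is then mapped to O.
repaired = base64.b32decode(mistyped, map01=b'I')

print(encoded)
print(mistyped)
```

With the defaults, neither the lowercase input nor the input containing digits
0 and 1 would be accepted.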
b16encode(s)~
Encode a string using Base16.
{s} is the string to encode. The encoded string is returned.
b16decode(s[, casefold])~
Decode a Base16 encoded string.
{s} is the string to decode. Optional {casefold} is a flag specifying whether a
lowercase alphabet is acceptable as input. For security purposes, the default
is ``False``.
The decoded string is returned. A TypeError is raised if {s} is
incorrectly padded or if there are non-alphabet characters present in the
string.
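As a brief illustration, Base16 is plain uppercase hexadecimal, and {casefold}
is what allows lowercase hex digits on input. The sample bytes are arbitrary.

```python
import base64

encoded = base64.b16encode(b'\xca\xfe')

# Lowercase hex digits are only accepted when casefold is true.
decoded = base64.b16decode(b'cafe', casefold=True)

print(encoded)
```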
The legacy interface:
decode(input, output)~
Decode the contents of the {input} file and write the resulting binary data to
the {output} file. {input} and {output} must either be file objects or objects
that mimic the file object interface. {input} will be read until
``input.read()`` returns an empty string.
decodestring(s)~
Decode the string {s}, which must contain one or more lines of base64 encoded
data, and return a string containing the resulting binary data.
encode(input, output)~
Encode the contents of the {input} file and write the resulting base64 encoded
data to the {output} file. {input} and {output} must either be file objects or
objects that mimic the file object interface. {input} will be read until
``input.read()`` returns an empty string. encode writes the encoded
data to {output} as lines that each end with a newline character (``'\n'``);
it does not return the data.
encodestring(s)~
Encode the string {s}, which can contain arbitrary binary data, and return a
string containing one or more lines of base64-encoded data, always including
an extra trailing newline (``'\n'``).
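The file-based half of the legacy interface can be tried without touching the
disk. This sketch uses io.BytesIO objects as stand-ins for real files, which is
an assumption of the example rather than a requirement of the module.

```python
import base64
import io

source = io.BytesIO(b'data to be encoded')
encoded = io.BytesIO()
base64.encode(source, encoded)      # reads source, writes base64 lines

encoded.seek(0)                     # rewind before reading it back
decoded = io.BytesIO()
base64.decode(encoded, decoded)     # reads base64 lines, writes binary data

print(encoded.getvalue())
print(decoded.getvalue())
```

The encoded buffer ends with a newline, which the decoder ignores.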
An example usage of the module:
>>> import base64
>>> encoded = base64.b64encode('data to be encoded')
>>> encoded
'ZGF0YSB0byBiZSBlbmNvZGVk'
>>> data = base64.b64decode(encoded)
>>> data
'data to be encoded'
.. seealso::
Module binascii (|py2stdlib-binascii|)
Support module containing ASCII-to-binary and binary-to-ASCII conversions.
RFC 1521 - MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies
Section 5.2, "Base64 Content-Transfer-Encoding," provides the definition of the
base64 encoding.
==============================================================================
*py2stdlib-basehttpserver*
BaseHTTPServer~
:synopsis: Basic HTTP server (base class for SimpleHTTPServer and CGIHTTPServer).
.. note::
The BaseHTTPServer (|py2stdlib-basehttpserver|) module has been merged into http.server in
Python 3.0. The 2to3 tool will automatically adapt imports when
converting your sources to 3.0.
.. index::
pair: WWW; server
pair: HTTP; protocol
single: URL
single: httpd
module: SimpleHTTPServer
module: CGIHTTPServer
This module defines two classes for implementing HTTP servers (Web servers).
Usually, this module isn't used directly, but is used as a basis for building
functioning Web servers. See the SimpleHTTPServer (|py2stdlib-simplehttpserver|) and
CGIHTTPServer (|py2stdlib-cgihttpserver|) modules.
The first class, HTTPServer, is a SocketServer.TCPServer
subclass, and therefore implements the SocketServer.BaseServer
interface. It creates and listens at the HTTP socket, dispatching the requests
to a handler. Code to create and run the server looks like this:: >
def run(server_class=BaseHTTPServer.HTTPServer,
        handler_class=BaseHTTPServer.BaseHTTPRequestHandler):
    server_address = ('', 8000)
    httpd = server_class(server_address, handler_class)
    httpd.serve_forever()
<
HTTPServer(server_address, RequestHandlerClass)~
This class builds on the TCPServer class by storing the server
address as instance variables named server_name and
server_port. The server is accessible by the handler, typically
through the handler's server instance variable.
BaseHTTPRequestHandler(request, client_address, server)~
This class is used to handle the HTTP requests that arrive at the server. By
itself, it cannot respond to any actual HTTP requests; it must be subclassed
to handle each request method (e.g. GET or
POST). BaseHTTPRequestHandler provides a number of class and
instance variables, and methods for use by subclasses.
The handler will parse the request and the headers, then call a method
specific to the request type. The method name is constructed from the
request. For example, for the request method ``SPAM``, the do_SPAM
method will be called with no arguments. All of the relevant information is
stored in instance variables of the handler. Subclasses should not need to
override or extend the __init__ method.
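The dispatch on the request method can be sketched with a minimal subclass.
HelloHandler and make_server are hypothetical names invented for this example,
and the import fallback to http.server is only there so the sketch also runs
on Python 3, where the module was merged.

```python
try:
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer  # Python 2
except ImportError:
    from http.server import BaseHTTPRequestHandler, HTTPServer     # Python 3


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a fixed plain-text body."""

    def do_GET(self):
        # Called with no arguments; request details are on self
        # (self.path, self.headers, ...).
        body = b'hello\n'
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8000):
    """Build (but do not start) a server that uses HelloHandler."""
    return HTTPServer(('', port), HelloHandler)
```

Calling ``make_server().serve_forever()`` would start answering requests. A
request with a method for which no do_* method is defined, such as POST here,
is not handled by this class.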
BaseHTTPRequestHandler has the following instance variables:
client_address~
Contains a tuple of the form ``(host, port)`` referring to the client's
address.
server~
Contains the server instance.
command~
Contains the command (request type). For example, ``'GET'``.
path~
Contains the request path.
request_version~
Contains the version string from the request. For example, ``'HTTP/1.0'``.
headers~
Holds an instance of the class specified by the MessageClass class
variable. This instance parses and manages the headers in the HTTP
request.
rfile~
Contains an input stream, positioned at the start of the optional input
data.
wfile~
Contains the output stream for writing a response back to the
client. Proper adherence to the HTTP protocol must be used when writing to
this stream.
BaseHTTPRequestHandler has the following class variables:
server_version~
Specifies the server software version. You may want to override this. The
format is multiple whitespace-separated strings, where each string is of
the form name[/version]. For example, ``'BaseHTTP/0.2'``.
sys_version~
Contains the Python system version, in a form usable by the
version_string method and the server_version class
variable. For example, ``'Python/1.4'``.
error_message_format~
Specifies a format string for building an error response to the client. It
uses parenthesized, keyed format specifiers, so the format operand must be
a dictionary. The {code} key should be an integer, specifying the numeric
HTTP error code value. {message} should be a string containing a
(detailed) error message of what occurred, and {explain} should be an
explanation of the error code number. Default {message} and {explain}
values can be found in the {responses} class variable.
error_content_type~
Specifies the Content-Type HTTP header of error responses sent to the
client. The default value is ``'text/html'``.
.. versionadded:: 2.6
Previously, the content type was always ``'text/html'``.
protocol_version~
This specifies the HTTP protocol version used in responses. If set to
``'HTTP/1.1'``, the server will permit HTTP persistent connections;
however, your server {must} then include an accurate ``Content-Length``
header (using send_header) in all of its responses to clients.
For backwards compatibility, the setting defaults to ``'HTTP/1.0'``.
MessageClass~
.. index:: single: Message (in module mimetools)
Specifies an rfc822.Message-like class to parse HTTP headers.
Typically, this is not overridden, and it defaults to
mimetools.Message.
responses~
This variable contains a mapping of error code integers to two-element tuples
containing a short and long message. For example, ``{code: (shortmessage,
longmessage)}``. The {shortmessage} is usually used as the {message} key in an
error response, and {longmessage} as the {explain} key (see the
error_message_format class variable).
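How {responses} and error_message_format fit together can be sketched by
filling in the format by hand. This only mimics the substitution described
above and is not a copy of what send_error does internally; the import
fallback is there so the sketch also runs on Python 3.

```python
try:
    from BaseHTTPServer import BaseHTTPRequestHandler  # Python 2
except ImportError:
    from http.server import BaseHTTPRequestHandler     # Python 3

code = 404
shortmessage, longmessage = BaseHTTPRequestHandler.responses[code]

# The format uses keyed specifiers, so the operand is a dictionary.
page = BaseHTTPRequestHandler.error_message_format % {
    'code': code,
    'message': shortmessage,
    'explain': longmessage,
}

print(shortmessage)
print(page)
```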
A BaseHTTPRequestHandler instance has the following methods:
handle()~
Calls handle_one_request once (or, if persistent connections are
enabled, multiple times) to handle incoming HTTP requests. You should
never need to override it; instead, implement appropriate do_*
methods.
handle_one_request()~
This method will parse and dispatch the request to the appropriate
do_* method. You should never need to override it.
send_error(code[, message])~
Sends and logs a complete error reply to the client. The numeric {code}
specifies the HTTP error code, with {message} as optional, more specific text. A
complete set of headers is sent, followed by text composed using the
error_message_format class variable.
send_response(code[, message])~
Sends a response header and logs the accepted request. The HTTP response
line is sent, followed by {Server} and {Date} headers. The values for
these two headers are picked up from the version_string and
date_time_string methods, respectively.
send_header(keyword, value)~
Writes a specific HTTP header to the output stream. {keyword} should
specify the header keyword, with {value} specifying its value.
end_headers()~
Sends a blank line, indicating the end of the HTTP headers in the
response.
log_request([code[, size]])~
Logs an accepted (successful) request. {code} should specify the numeric
HTTP code associated with the response. If a size of the response is
available, then it should be passed as the {size} parameter.
log_error(...)~
Logs an error when a request cannot be fulfilled. By default, it passes
the message to log_message, so it takes the same arguments
({format} and additional values).
log_message(format, ...)~
Logs an arbitrary message to ``sys.stderr``. This is typically overridden
to create custom error logging mechanisms. The {format} argument is a
standard printf-style format string, where the additional arguments to
log_message are applied as inputs to the formatting. The client
address and current date and time are prefixed to every message logged.
version_string()~
Returns the server software's version string. This is a combination of the
server_version and sys_version class variables.
date_time_string([timestamp])~
Returns the date and time given by {timestamp} (which must be in the
format returned by time.time), formatted for a message header. If
{timestamp} is omitted, it uses the current date and time.
The result looks like ``'Sun, 06 Nov 1994 08:49:37 GMT'``.
.. versionadded:: 2.5
The {timestamp} parameter.
log_date_time_string()~
Returns the current date and time, formatted for logging.
address_string()~
Returns the client address, formatted for logging. A name lookup is
performed on the client's IP address.
More examples
-------------
To create a server that doesn't run forever, but until some condition is
fulfilled:: >
def run_while_true(server_class=BaseHTTPServer.HTTPServer,
                   handler_class=BaseHTTPServer.BaseHTTPRequestHandler):
    """
    This assumes that keep_running() is a function of no arguments which
    is tested initially and after each request.  If its return value
    is true, the server continues.
    """
    server_address = ('', 8000)
    httpd = server_class(server_address, handler_class)
    while keep_running():
        httpd.handle_request()
<
.. seealso::
Module CGIHTTPServer (|py2stdlib-cgihttpserver|)
Extended request handler that supports CGI scripts.
Module SimpleHTTPServer (|py2stdlib-simplehttpserver|)
Basic request handler that limits response to files actually under the
document root.
==============================================================================
*py2stdlib-bastion*
Bastion~
:synopsis: Providing restricted access to objects.
:deprecated:
.. deprecated:: 2.6
The Bastion (|py2stdlib-bastion|) module has been removed in Python 3.0.
.. versionchanged:: 2.3
Disabled module.
.. note::
The documentation has been left in place to help in reading old code that uses
the module.
According to the dictionary, a bastion is "a fortified area or position", or
"something that is considered a stronghold." It's a suitable name for this
module, which provides a way to forbid access to certain attributes of an
object. It must always be used with the rexec (|py2stdlib-rexec|) module, in order to allow
restricted-mode programs access to certain safe attributes of an object, while
denying access to other, unsafe attributes.
Bastion(object[, filter[, name[, class]]])~
Protect the object {object}, returning a bastion for the object. Any attempt to
access one of the object's attributes will have to be approved by the {filter}
function; if the access is denied an AttributeError exception will be
raised.
If present, {filter} must be a function that accepts a string containing an
attribute name, and returns true if access to that attribute will be permitted;
if {filter} returns false, the access is denied. The default filter denies
access to any function beginning with an underscore (``'_'``). The bastion's
string representation will be ``<Bastion for name>`` if a value for {name} is
provided; otherwise, ``repr(object)`` will be used.
{class}, if present, should be a subclass of BastionClass; see the
code in bastion.py for the details. Overriding the default
BastionClass will rarely be required.
BastionClass(getfunc, name)~
Class which actually implements bastion objects. This is the default class used
by Bastion (|py2stdlib-bastion|). The {getfunc} parameter is a function which returns the
value of an attribute which should be exposed to the restricted execution
environment when called with the name of the attribute as the only parameter.
{name} is used to construct the repr (|py2stdlib-repr|) of the BastionClass
instance.
==============================================================================
*py2stdlib-bdb*
bdb~
:synopsis: Debugger framework.
The bdb (|py2stdlib-bdb|) module handles basic debugger functions, like setting breakpoints
or managing execution via the debugger.
The following exception is defined:
BdbQuit~
Exception raised by the Bdb class for quitting the debugger.
The bdb (|py2stdlib-bdb|) module also defines two classes:
Breakpoint(self, file, line[, temporary=0[, cond=None [, funcname=None]]])~
This class implements temporary breakpoints, ignore counts, disabling and
(re-)enabling, and conditionals.
Breakpoints are indexed by number through a list called bpbynumber
and by ``(file, line)`` pairs through bplist. The former points to a
single instance of class Breakpoint. The latter points to a list of
such instances since there may be more than one breakpoint per line.
When creating a breakpoint, its associated filename should be in canonical
form. If a {funcname} is defined, a breakpoint hit will be counted when the
first line of that function is executed. A conditional breakpoint always
counts a hit.
Breakpoint instances have the following methods:
deleteMe()~
Delete the breakpoint from the list associated to a file/line. If it is
the last breakpoint in that position, it also deletes the entry for the
file/line.
enable()~
Mark the breakpoint as enabled.
disable()~
Mark the breakpoint as disabled.
pprint([out])~
Print all the information about the breakpoint:
* The breakpoint number.
* If it is temporary or not.
* Its file,line position.
* The condition that causes a break.
* If it must be ignored the next N times.
* The breakpoint hit count.
Bdb(skip=None)~
The Bdb class acts as a generic Python debugger base class.
This class takes care of the details of the trace facility; a derived class
should implement user interaction. The standard debugger class
(pdb.Pdb) is an example.
The {skip} argument, if given, must be an iterable of glob-style
module name patterns. The debugger will not step into frames that
originate in a module that matches one of these patterns. Whether a
frame is considered to originate in a certain module is determined
by the ``__name__`` in the frame globals.
.. versionadded:: 2.7
The {skip} argument.
The following methods of Bdb normally don't need to be overridden.
canonic(filename)~
Auxiliary method for getting a filename in a canonical form, that is, as a
case-normalized (on case-insensitive filesystems) absolute path, stripped
of surrounding angle brackets.
reset()~
Set the botframe, stopframe, returnframe and
quitting attributes with values ready to start debugging.
trace_dispatch(frame, event, arg)~
This function is installed as the trace function of debugged frames. Its
return value is the new trace function (in most cases, that is, itself).
The default implementation decides how to dispatch a frame, depending on
the type of event (passed as a string) that is about to be executed.
{event} can be one of the following:
* ``"line"``: A new line of code is going to be executed.
* ``"call"``: A function is about to be called, or another code block
entered.
* ``"return"``: A function or other code block is about to return.
* ``"exception"``: An exception has occurred.
* ``"c_call"``: A C function is about to be called.
* ``"c_return"``: A C function has returned.
* ``"c_exception"``: A C function has thrown an exception.
For the Python events, specialized functions (see below) are called. For
the C events, no action is taken.
The {arg} parameter depends on the previous event.
See the documentation for sys.settrace for more information on the
trace function. For more information on code and frame objects, refer to
types (|py2stdlib-types|).
dispatch_line(frame)~
If the debugger should stop on the current line, invoke the
user_line method (which should be overridden in subclasses).
Raise a BdbQuit exception if the Bdb.quitting flag is set
(which can be set from user_line). Return a reference to the
trace_dispatch method for further tracing in that scope.
dispatch_call(frame, arg)~
If the debugger should stop on this function call, invoke the
user_call method (which should be overridden in subclasses).
Raise a BdbQuit exception if the Bdb.quitting flag is set
(which can be set from user_call). Return a reference to the
trace_dispatch method for further tracing in that scope.
dispatch_return(frame, arg)~
If the debugger should stop on this function return, invoke the
user_return method (which should be overridden in subclasses).
Raise a BdbQuit exception if the Bdb.quitting flag is set
(which can be set from user_return). Return a reference to the
trace_dispatch method for further tracing in that scope.
dispatch_exception(frame, arg)~
If the debugger should stop at this exception, invoke the
user_exception method (which should be overridden in subclasses).
Raise a BdbQuit exception if the Bdb.quitting flag is set
(which can be set from user_exception). Return a reference to the
trace_dispatch method for further tracing in that scope.
Normally derived classes don't override the following methods, but they may
if they want to redefine the definition of stopping and breakpoints.
stop_here(frame)~
This method checks if the {frame} is somewhere below botframe in
the call stack. botframe is the frame in which debugging started.
break_here(frame)~
This method checks if there is a breakpoint in the filename and line
belonging to {frame} or, at least, in the current function. If the
breakpoint is a temporary one, this method deletes it.
break_anywhere(frame)~
This method checks if there is a breakpoint in the filename of the current
frame.
Derived classes should override these methods to gain control over debugger
operation.
user_call(frame, argument_list)~
This method is called from dispatch_call when there is the
possibility that a break might be necessary anywhere inside the called
function.
user_line(frame)~
This method is called from dispatch_line when either
stop_here or break_here yields True.
user_return(frame, return_value)~
This method is called from dispatch_return when stop_here
yields True.
user_exception(frame, exc_info)~
This method is called from dispatch_exception when
stop_here yields True.
do_clear(arg)~
Handle how a breakpoint must be removed when it is a temporary one.
This method must be implemented by derived classes.
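A minimal sketch of a derived class: it overrides only user_line and records
where the debugger stops instead of interacting with a user. The class name
LineRecorder and the function sample are invented for this example.

```python
import bdb


class LineRecorder(bdb.Bdb):
    """Hypothetical Bdb subclass that records the lines it stops at."""

    def __init__(self):
        bdb.Bdb.__init__(self)
        self.lines = []

    def user_line(self, frame):
        # Called from dispatch_line whenever the debugger decides to stop.
        self.lines.append(frame.f_lineno)


def sample(x):
    y = x + 1
    return y * 2


recorder = LineRecorder()
result = recorder.runcall(sample, 3)   # trace a single function call

print(result)
print(recorder.lines)
```

Since no stepping state is changed and no breakpoints are set, the debugger
stops at every line executed inside the traced call.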
Derived classes and clients can call the following methods to affect the
stepping state.
set_step()~
Stop after one line of code.
set_next(frame)~
Stop on the next line in or below the given frame.
set_return(frame)~
Stop when returning from the given frame.
set_until(frame)~
Stop when a line with a line number greater than the current one is
reached, or when returning from the current frame.
set_trace([frame])~
Start debugging from {frame}. If {frame} is not specified, debugging
starts from caller's frame.
set_continue()~
Stop only at breakpoints or when finished. If there are no breakpoints,
set the system trace function to None.
set_quit()~
Set the quitting attribute to True. This raises BdbQuit in
the next call to one of the dispatch_* methods.
Derived classes and clients can call the following methods to manipulate
breakpoints. These methods return a string containing an error message if
something went wrong, or ``None`` if all is well.
set_break(filename, lineno[, temporary=0[, cond[, funcname]]])~
Set a new breakpoint. If the {lineno} line doesn't exist for the
{filename} passed as argument, return an error message. The {filename}
should be in canonical form, as described in the canonic method.
clear_break(filename, lineno)~
Delete the breakpoints in {filename} and {lineno}. If none were set, an
error message is returned.
clear_bpbynumber(arg)~
Delete the breakpoint which has the index {arg} in the
Breakpoint.bpbynumber. If {arg} is not numeric or out of range,
return an error message.
clear_all_file_breaks(filename)~
Delete all breakpoints in {filename}. If none were set, an error message
is returned.
clear_all_breaks()~
Delete all existing breakpoints.
get_break(filename, lineno)~
Check if there is a breakpoint for {lineno} of {filename}.
get_breaks(filename, lineno)~
Return all breakpoints for {lineno} in {filename}, or an empty list if
none are set.
get_file_breaks(filename)~
Return all breakpoints in {filename}, or an empty list if none are set.
get_all_breaks()~
Return all breakpoints that are set.
Derived classes and clients can call the following methods to get a data
structure representing a stack trace.
get_stack(f, t)~
Get a list of records for a frame and all higher (calling) and lower
frames, and the size of the higher part.
format_stack_entry(frame_lineno, [lprefix=': '])~
Return a string with information about a stack entry, identified by a
``(frame, lineno)`` tuple:
* The canonical form of the filename which contains the frame.
* The function name, or ``"<lambda>"``.
* The input arguments.
* The return value.
* The line of code (if it exists).
The following methods can be called by clients to use a debugger to debug
a statement or expression given as a string, or a function call.
run(cmd, [globals, [locals]])~
Debug a statement executed via the exec statement. {globals}
defaults to __main__.__dict__, {locals} defaults to {globals}.
runeval(expr, [globals, [locals]])~
Debug an expression executed via the eval function. {globals} and
{locals} have the same meaning as in run.
runctx(cmd, globals, locals)~
For backwards compatibility. Calls the run method.
runcall(func, *args, **kwds)~
Debug a single function call, and return its result.
Finally, the module defines the following functions:
checkfuncname(b, frame)~
Check whether we should break here, depending on the way the breakpoint {b}
was set.
If it was set via line number, it checks if ``b.line`` is the same as the one
in the frame also passed as argument. If the breakpoint was set via function
name, we have to check we are in the right frame (the right function) and if
we are in its first executable line.
effective(file, line, frame)~
Determine if there is an effective (active) breakpoint at this line of code.
Called only if we know there is a breakpoint at this location. Returns the
breakpoint that was triggered and a flag that indicates if it is ok to delete
a temporary breakpoint, or ``(None, None)`` if no breakpoint is effective.
set_trace()~
Starts debugging with a Bdb instance from caller's frame.
==============================================================================
*py2stdlib-binascii*
binascii~
:synopsis: Tools for converting between binary and various ASCII-encoded binary
representations.
The binascii (|py2stdlib-binascii|) module contains a number of methods to convert between
binary and various ASCII-encoded binary representations. Normally, you will not
use these functions directly but use wrapper modules like uu (|py2stdlib-uu|),
base64 (|py2stdlib-base64|), or binhex (|py2stdlib-binhex|) instead. The binascii (|py2stdlib-binascii|) module contains
low-level functions written in C for greater speed that are used by the
higher-level modules.
The binascii (|py2stdlib-binascii|) module defines the following functions:
a2b_uu(string)~
Convert a single line of uuencoded data back to binary and return the binary
data. Lines normally contain 45 (binary) bytes, except for the last line. Line
data may be followed by whitespace.
b2a_uu(data)~
Convert binary data to a line of ASCII characters. The return value is the
converted line, including a newline char. The length of {data} should be at most
45.
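As a rough illustration of the 45-byte limit, the sketch below splits a payload
into chunks, encodes each one with b2a_uu and decodes the lines again with
a2b_uu. The payload and variable names are invented for the example.

```python
import binascii

payload = b"binascii works on small chunks of binary data. " * 3

# b2a_uu accepts at most 45 bytes per call, so encode chunk by chunk.
chunks = [payload[i:i + 45] for i in range(0, len(payload), 45)]
encoded_lines = [binascii.b2a_uu(chunk) for chunk in chunks]

# Each encoded line ends with a newline; a2b_uu reverses the encoding.
decoded = b"".join(binascii.a2b_uu(line) for line in encoded_lines)
print(decoded == payload)
```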
a2b_base64(string)~
Convert a block of base64 data back to binary and return the binary data. More
than one line may be passed at a time.
b2a_base64(data)~
Convert binary data to a line of ASCII characters in base64 coding. The return
value is the converted line, including a newline char. The length of {data}
should be at most 57 to adhere to the base64 standard.
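A minimal round trip through the two base64 functions might look like this.
The sample data is arbitrary.

```python
import binascii

data = b"hello world"
line = binascii.b2a_base64(data)       # one base64 line, newline included
restored = binascii.a2b_base64(line)   # decode back to the original bytes

print(repr(line))
print(restored == data)
```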
a2b_qp(string[, header])~
Convert a block of quoted-printable data back to binary and return the binary
data. More than one line may be passed at a time. If the optional argument
{header} is present and true, underscores will be decoded as spaces.
b2a_qp(data[, quotetabs, istext, header])~
Convert binary data to a line(s) of ASCII characters in quoted-printable
encoding. The return value is the converted line(s). If the optional argument
{quotetabs} is present and true, all tabs and spaces will be encoded. If the
optional argument {istext} is present and true, newlines are not encoded but
trailing whitespace will be encoded. If the optional argument {header} is
present and true, spaces will be encoded as underscores per RFC1522. If the
optional argument {header} is present and false, newline characters will be
encoded as well; otherwise linefeed conversion might corrupt the binary data
stream.
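The effect of the {header} argument can be sketched as follows. The sample
bytes are made up, and the example only checks that decoding with the matching
flag restores the input.

```python
import binascii

text = b"caf\xe9 au lait"    # contains one non-ASCII byte (0xE9)

body = binascii.b2a_qp(text)                       # body-style encoding
header_form = binascii.b2a_qp(text, header=True)   # spaces become underscores

# Decoding with the matching header flag restores the original bytes.
print(binascii.a2b_qp(body) == text)
print(binascii.a2b_qp(header_form, header=True) == text)
```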
a2b_hqx(string)~
Convert binhex4 formatted ASCII data to binary, without doing RLE-decompression.
The string should contain a complete number of binary bytes, or (in case of the
last portion of the binhex4 data) have the remaining bits zero.
rledecode_hqx(data)~
Perform RLE-decompression on the data, as per the binhex4 standard. The
algorithm uses ``0x90`` after a byte as a repeat indicator, followed by a count.
A count of ``0`` specifies a byte value of ``0x90``. The routine returns the
decompressed data, unless the input data ends in an orphaned repeat indicator,
in which case the Incomplete exception is raised.
rlecode_hqx(data)~
Perform binhex4 style RLE-compression on {data} and return the result.
b2a_hqx(data)~
Perform binhex4 binary-to-ASCII translation and return the resulting string. The
argument should already be RLE-coded, and have a length divisible by 3 (except
possibly the last fragment).
crc_hqx(data, crc)~
Compute the binhex4 crc value of {data}, starting with an initial {crc} and
returning the result.
crc32(data[, crc])~
Compute CRC-32, the 32-bit checksum of data, starting with an initial crc. This
is consistent with the ZIP file checksum. Since the algorithm is designed for
use as a checksum algorithm, it is not suitable for use as a general hash
algorithm. Use as follows:: >
print binascii.crc32("hello world")
# Or, in two pieces:
crc = binascii.crc32("hello")
crc = binascii.crc32(" world", crc) & 0xffffffff
print 'crc32 = 0x%08x' % crc
<
.. note::
To generate the same numeric value across all Python versions and
platforms use crc32(data) & 0xffffffff. If you are only using
the checksum in packed binary format this is not necessary as the
return value is the correct 32-bit binary representation
regardless of sign.
.. versionchanged:: 2.6
The return value is in the range [-2**31, 2**31-1]
regardless of platform. In the past the value would be signed on
some platforms and unsigned on others. Use & 0xffffffff on the
value if you want it to match 3.0 behavior.
.. versionchanged:: 3.0
The return value is unsigned and in the range [0, 2**32-1]
regardless of platform.
b2a_hex(data)~
hexlify(data)
Return the hexadecimal representation of the binary {data}. Every byte of
{data} is converted into the corresponding 2-digit hex representation. The
resulting string is therefore twice as long as the length of {data}.
a2b_hex(hexstr)~
unhexlify(hexstr)
Return the binary data represented by the hexadecimal string {hexstr}. This
function is the inverse of b2a_hex. {hexstr} must contain an even number
of hexadecimal digits (which can be upper or lower case), otherwise a
TypeError is raised.
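The following sketch shows the round trip between binary data and its
hexadecimal form, and what happens with an odd number of digits. It catches
both TypeError (the Python 2 behaviour described above) and binascii.Error
(raised by later Python versions) so that it runs on either.

```python
import binascii

raw = b"\x00\xffAB"
hexed = binascii.hexlify(raw)        # two hex digits per input byte
print(hexed)
print(binascii.unhexlify(hexed) == raw)

# An odd number of hex digits is rejected.
try:
    binascii.unhexlify(b"abc")
    rejected = False
except (TypeError, binascii.Error):
    rejected = True
print(rejected)
```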
Error~
Exception raised on errors. These are usually programming errors.
Incomplete~
Exception raised on incomplete data. These are usually not programming errors,
but may be handled by reading a little more data and trying again.
.. seealso::
Module base64 (|py2stdlib-base64|)
Support for base64 encoding used in MIME email messages.
Module binhex (|py2stdlib-binhex|)
Support for the binhex format used on the Macintosh.
Module uu (|py2stdlib-uu|)
Support for UU encoding used on Unix.
Module quopri (|py2stdlib-quopri|)
Support for quoted-printable encoding used in MIME email messages.
==============================================================================
*py2stdlib-binhex*
binhex~
:synopsis: Encode and decode files in binhex4 format.
This module encodes and decodes files in binhex4 format, a format allowing
representation of Macintosh files in ASCII. On the Macintosh, both forks of a
file and the finder information are encoded (or decoded), on other platforms
only the data fork is handled.
.. note::
In Python 3.x, special Macintosh support has been removed.
The binhex (|py2stdlib-binhex|) module defines the following functions:
binhex(input, output)~
Convert a binary file with filename {input} to binhex file {output}. The
{output} parameter can either be a filename or a file-like object (any object
supporting a write and close method).
hexbin(input[, output])~
Decode a binhex file {input}. {input} may be a filename or a file-like object
supporting read and close methods. The resulting file is written
to a file named {output}, unless the argument is omitted in which case the
output filename is read from the binhex file.
The following exception is also defined:
Error~
Exception raised when something can't be encoded using the binhex format (for
example, a filename is too long to fit in the filename field), or when input is
not properly encoded binhex data.
.. seealso::
Module binascii (|py2stdlib-binascii|)
Support module containing ASCII-to-binary and binary-to-ASCII conversions.
Notes
-----
There is an alternative, more powerful interface to the coder and decoder, see
the source for details.
If you code or decode textfiles on non-Macintosh platforms they will still use
the old Macintosh newline convention (carriage-return as end of line).
As of this writing, hexbin appears to not work in all cases.
==============================================================================
*py2stdlib-bisect*
bisect~
:synopsis: Array bisection algorithms for binary searching.
.. example based on the PyModules FAQ entry by Aaron Watters <arw@pythonpros.com>
This module provides support for maintaining a list in sorted order without
having to sort the list after each insertion. For long lists of items with
expensive comparison operations, this can be an improvement over the more common
approach. The module is called bisect (|py2stdlib-bisect|) because it uses a basic bisection
algorithm to do its work. The source code may be most useful as a working
example of the algorithm (the boundary conditions are already right!).
The following functions are provided:
bisect_left(list, item[, lo[, hi]])~
Locate the proper insertion point for {item} in {list} to maintain sorted order.
The parameters {lo} and {hi} may be used to specify a subset of the list which
should be considered; by default the entire list is used. If {item} is already
present in {list}, the insertion point will be before (to the left of) any
existing entries. The return value is suitable for use as the first parameter
to ``list.insert()``. This assumes that {list} is already sorted.
.. versionadded:: 2.1
bisect_right(list, item[, lo[, hi]])~
Similar to bisect_left, but returns an insertion point which comes after
(to the right of) any existing entries of {item} in {list}.
.. versionadded:: 2.1
bisect(...)~
Alias for bisect_right.
insort_left(list, item[, lo[, hi]])~
Insert {item} in {list} in sorted order. This is equivalent to
``list.insert(bisect.bisect_left(list, item, lo, hi), item)``. This assumes
that {list} is already sorted.
.. versionadded:: 2.1
insort_right(list, item[, lo[, hi]])~
Similar to insort_left, but inserting {item} in {list} after any
existing entries of {item}.
.. versionadded:: 2.1
insort(...)~
Alias for insort_right.
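The difference between the left and right variants only shows up when {item}
is already present. A small sketch with made-up scores:

```python
import bisect

scores = [10, 20, 20, 30]

left = bisect.bisect_left(scores, 20)     # before the existing 20s
right = bisect.bisect_right(scores, 20)   # after the existing 20s
print(left)
print(right)

# insort keeps the list sorted without a separate sort() call.
bisect.insort(scores, 25)
print(scores)
```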
Examples
--------
The bisect (|py2stdlib-bisect|) function is generally useful for categorizing numeric data.
This example uses bisect (|py2stdlib-bisect|) to look up a letter grade for an exam total
(say) based on a set of ordered numeric breakpoints: 85 and up is an 'A', 75..84
is a 'B', etc.
>>> grades = "FEDCBA"
>>> breakpoints = [30, 44, 66, 75, 85]
>>> from bisect import bisect
>>> def grade(total):
... return grades[bisect(breakpoints, total)]
...
>>> grade(66)
'C'
>>> map(grade, [33, 99, 77, 44, 12, 88])
['E', 'A', 'B', 'D', 'F', 'A']
Unlike the sorted function, it does not make sense for the bisect (|py2stdlib-bisect|)
functions to have {key} or {reversed} arguments because that would lead to an
inefficient design (successive calls to bisect functions would not "remember"
all of the previous key lookups).
Instead, it is better to search a list of precomputed keys to find the index
of the record in question:: >
>>> data = [('red', 5), ('blue', 1), ('yellow', 8), ('black', 0)]
>>> data.sort(key=lambda r: r[1])
>>> keys = [r[1] for r in data] # precomputed list of keys
>>> data[bisect_left(keys, 0)]
('black', 0)
>>> data[bisect_left(keys, 1)]
('blue', 1)
>>> data[bisect_left(keys, 5)]
('red', 5)
>>> data[bisect_left(keys, 8)]
('yellow', 8)
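The lookups above assume the key is present. One possible way to wrap them in
an exact-match search is sketched below; find_record is a hypothetical helper
name, not part of the module.

```python
from bisect import bisect_left

data = [('red', 5), ('blue', 1), ('yellow', 8), ('black', 0)]
data.sort(key=lambda r: r[1])
keys = [r[1] for r in data]          # precomputed list of keys

def find_record(key):
    """Return the record whose key equals `key`, or raise KeyError."""
    i = bisect_left(keys, key)
    if i != len(keys) and keys[i] == key:
        return data[i]
    raise KeyError(key)

print(find_record(5))
```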
==============================================================================
*py2stdlib-bsddb*
bsddb~
:synopsis: Interface to Berkeley DB database library
.. deprecated:: 2.6
The bsddb (|py2stdlib-bsddb|) module has been deprecated for removal in Python 3.0.
The bsddb (|py2stdlib-bsddb|) module provides an interface to the Berkeley DB library. Users
can create hash, btree or record based library files using the appropriate open
call. Bsddb objects behave generally like dictionaries. Keys and values must be
strings, however, so to use other objects as keys or to store other kinds of
objects the user must serialize them somehow, typically using
marshal.dumps or pickle.dumps.
The bsddb (|py2stdlib-bsddb|) module requires a Berkeley DB library version from 4.0 through
4.7.
.. seealso::
http://www.jcea.es/programacion/pybsddb.htm
The website with documentation for the bsddb.db Python Berkeley DB
interface that closely mirrors the object oriented interface provided in
Berkeley DB 4.x itself.
http://www.oracle.com/database/berkeley-db/
The Berkeley DB library.
A more modern DB, DBEnv and DBSequence object interface is available in the
bsddb.db module which closely matches the Berkeley DB C API documented at
the above URLs. Additional features provided by the bsddb.db API include
fine tuning, transactions, logging, and multiprocess concurrent database access.
The following is a description of the legacy bsddb (|py2stdlib-bsddb|) interface compatible
with the old Python bsddb module. Starting in Python 2.5 this interface should
be safe for multithreaded access. The bsddb.db API is recommended for
threading users as it provides better control.
The bsddb (|py2stdlib-bsddb|) module defines the following functions that create objects that
access the appropriate type of Berkeley DB file. The first two arguments of
each function are the same. For ease of portability, only the first two
arguments should be used in most instances.
hashopen(filename[, flag[, mode[, pgsize[, ffactor[, nelem[, cachesize[, lorder[, hflags]]]]]]]])~
Open the hash format file named {filename}. Files never intended to be
preserved on disk may be created by passing ``None`` as the {filename}. The
optional {flag} identifies the mode used to open the file. It may be ``'r'``
(read only), ``'w'`` (read-write), ``'c'`` (read-write - create if necessary;
the default) or ``'n'`` (read-write - truncate to zero length). The other
arguments are rarely used and are just passed to the low-level dbopen
function. Consult the Berkeley DB documentation for their use and
interpretation.
btopen(filename[, flag[, mode[, btflags[, cachesize[, maxkeypage[, minkeypage[, pgsize[, lorder]]]]]]]])~
Open the btree format file named {filename}. Files never intended to be
preserved on disk may be created by passing ``None`` as the {filename}. The
optional {flag} identifies the mode used to open the file. It may be ``'r'``
(read only), ``'w'`` (read-write), ``'c'`` (read-write - create if necessary;
the default) or ``'n'`` (read-write - truncate to zero length). The other
arguments are rarely used and are just passed to the low-level dbopen function.
Consult the Berkeley DB documentation for their use and interpretation.
rnopen(filename[, flag[, mode[, rnflags[, cachesize[, pgsize[, lorder[, rlen[, delim[, source[, pad]]]]]]]]]])~
Open a DB record format file named {filename}. Files never intended to be
preserved on disk may be created by passing ``None`` as the {filename}. The
optional {flag} identifies the mode used to open the file. It may be ``'r'``
(read only), ``'w'`` (read-write), ``'c'`` (read-write - create if necessary;
the default) or ``'n'`` (read-write - truncate to zero length). The other
arguments are rarely used and are just passed to the low-level dbopen function.
Consult the Berkeley DB documentation for their use and interpretation.
.. note::
Beginning in 2.3 some Unix versions of Python may have a bsddb185 module.
This is present {only} to allow backwards compatibility with systems which ship
with the old Berkeley DB 1.85 database library. The bsddb185 module
should never be used directly in new code. The module has been removed in
Python 3.0. If you find you still need it, look in PyPI.
.. seealso::
Module dbhash (|py2stdlib-dbhash|)
DBM-style interface to the bsddb (|py2stdlib-bsddb|) module.
Hash, BTree and Record Objects
------------------------------
Once instantiated, hash, btree and record objects support the same methods as
dictionaries. In addition, they support the methods listed below.
.. versionchanged:: 2.3.1
Added dictionary methods.
bsddbobject.close()~
Close the underlying file. The object can no longer be accessed. Since there
is no open method for these objects, to open the file again a new
bsddb (|py2stdlib-bsddb|) module open function must be called.
bsddbobject.keys()~
Return the list of keys contained in the DB file. The order of the list is
unspecified and should not be relied on. In particular, the order of the list
returned is different for different file formats.
bsddbobject.has_key(key)~
Return ``1`` if the DB file contains the argument as a key.
bsddbobject.set_location(key)~
Set the cursor to the item indicated by {key} and return a tuple containing the
key and its value. For binary tree databases (opened using btopen), if
{key} does not actually exist in the database, the cursor will point to the next
item in sorted order and return that key and value. For other databases,
KeyError will be raised if {key} is not found in the database.
bsddbobject.first()~
Set the cursor to the first item in the DB file and return it. The order of
keys in the file is unspecified, except in the case of B-Tree databases. This
method raises bsddb.error if the database is empty.
bsddbobject.next()~
Set the cursor to the next item in the DB file and return it. The order of
keys in the file is unspecified, except in the case of B-Tree databases.
bsddbobject.previous()~
Set the cursor to the previous item in the DB file and return it. The order of
keys in the file is unspecified, except in the case of B-Tree databases. This
is not supported on hashtable databases (those opened with hashopen).
bsddbobject.last()~
Set the cursor to the last item in the DB file and return it. The order of keys
in the file is unspecified. This is not supported on hashtable databases (those
opened with hashopen). This method raises bsddb.error if the
database is empty.
bsddbobject.sync()~
Synchronize the database on disk.
Example:: >
>>> import bsddb
>>> db = bsddb.btopen('/tmp/spam.db', 'c')
>>> for i in range(10): db['%d'%i] = '%d'% (i*i)
...
>>> db['3']
'9'
>>> db.keys()
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
>>> db.first()
('0', '0')
>>> db.next()
('1', '1')
>>> db.last()
('9', '81')
>>> db.set_location('2')
('2', '4')
>>> db.previous()
('1', '1')
>>> for k, v in db.iteritems():
... print k, v
0 0
1 1
2 4
3 9
4 16
5 25
6 36
7 49
8 64
9 81
>>> '8' in db
True
>>> db.sync()
0
==============================================================================
*py2stdlib-bz2*
bz2~
:synopsis: Interface to compression and decompression routines compatible with bzip2.
.. versionadded:: 2.3
This module provides a comprehensive interface for the bz2 compression library.
It implements a complete file interface, one-shot (de)compression functions, and
types for sequential (de)compression.
For other archive formats, see the gzip (|py2stdlib-gzip|), zipfile (|py2stdlib-zipfile|), and
tarfile (|py2stdlib-tarfile|) modules.
Here is a summary of the features offered by the bz2 module:
* BZ2File class implements a complete file interface, including
BZ2File.readline, BZ2File.readlines,
BZ2File.writelines, BZ2File.seek, etc;
* BZ2File class implements emulated BZ2File.seek support;
* BZ2File class implements universal newline support;
* BZ2File class offers an optimized line iteration using the readahead
algorithm borrowed from file objects;
* Sequential (de)compression supported by BZ2Compressor and
BZ2Decompressor classes;
* One-shot (de)compression supported by compress and decompress
functions;
* Thread safety is provided by an individual locking mechanism.
(De)compression of files
------------------------
Handling of compressed files is offered by the BZ2File class.
BZ2File(filename[, mode[, buffering[, compresslevel]]])~
Open a bz2 file. Mode can be either ``'r'`` or ``'w'``, for reading (default)
or writing. When opened for writing, the file will be created if it doesn't
exist, and truncated otherwise. If {buffering} is given, ``0`` means
unbuffered, and larger numbers specify the buffer size; the default is
``0``. If {compresslevel} is given, it must be a number between ``1`` and
``9``; the default is ``9``. Add a ``'U'`` to mode to open the file for input
with universal newline support. Any line ending in the input file will be
seen as a ``'\n'`` in Python. Also, a file so opened gains the attribute
newlines; the value for this attribute is one of ``None`` (no newline
read yet), ``'\r'``, ``'\n'``, ``'\r\n'`` or a tuple containing all the
newline types seen. Universal newlines are available only when
reading. Instances support iteration in the same way as normal file
instances.
BZ2File supports the with statement.
.. versionchanged:: 2.7
Support for the with statement was added.
close()~
Close the file. Sets data attribute closed to true. A closed file
cannot be used for further I/O operations. close may be called
more than once without error.
read([size])~
Read at most {size} uncompressed bytes, returned as a string. If the
{size} argument is negative or omitted, read until EOF is reached.
readline([size])~
Return the next line from the file, as a string, retaining newline. A
non-negative {size} argument limits the maximum number of bytes to return
(an incomplete line may be returned then). Return an empty string at EOF.
readlines([size])~
Return a list of lines read. The optional {size} argument, if given, is an
approximate bound on the total number of bytes in the lines returned.
xreadlines()~
For backward compatibility. BZ2File objects now include the
performance optimizations previously implemented in the xreadlines
module.
.. deprecated:: 2.3
This exists only for compatibility with the method by this name on
file objects, which is deprecated. Use ``for line in file``
instead.
seek(offset[, whence])~
Move to new file position. Argument {offset} is a byte count. Optional
argument {whence} defaults to ``os.SEEK_SET`` or ``0`` (offset from start
of file; offset should be ``>= 0``); other values are ``os.SEEK_CUR`` or
``1`` (move relative to current position; offset can be positive or
negative), and ``os.SEEK_END`` or ``2`` (move relative to end of file;
offset is usually negative, although many platforms allow seeking beyond
the end of a file).
Note that seeking of bz2 files is emulated, and depending on the
parameters the operation may be extremely slow.
tell()~
Return the current file position, an integer (may be a long integer).
write(data)~
Write string {data} to file. Note that due to buffering, close may
be needed before the file on disk reflects the data written.
writelines(sequence_of_strings)~
Write the sequence of strings to the file. Note that newlines are not
added. The sequence can be any iterable object producing strings. This is
equivalent to calling write() for each string.
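A short sketch of writing and re-reading a compressed file with BZ2File. The
temporary directory and the file name are assumptions made for the example;
only the filename and mode arguments are used.

```python
import bz2
import os
import tempfile

# Hypothetical scratch location; any writable path would do.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "example.bz2")

# Mode 'w' creates (or truncates) the file; data is compressed on write.
with bz2.BZ2File(path, "w") as f:
    f.write(b"first line\n")
    f.writelines([b"second line\n", b"third line\n"])   # no newlines added

# Mode 'r' is the default; iteration yields one decompressed line at a time.
with bz2.BZ2File(path) as f:
    lines = [line for line in f]

print(lines)

# Clean up the scratch file and directory.
os.remove(path)
os.rmdir(workdir)
```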
Sequential (de)compression
--------------------------
Sequential compression and decompression is done using the classes
BZ2Compressor and BZ2Decompressor.
BZ2Compressor([compresslevel])~
Create a new compressor object. This object may be used to compress data
sequentially. If you want to compress data in one shot, use the
compress function instead. The {compresslevel} parameter, if given,
must be a number between ``1`` and ``9``; the default is ``9``.
compress(data)~
Provide more data to the compressor object. It will return chunks of
compressed data whenever possible. When you've finished providing data to
compress, call the flush method to finish the compression process,
and return what is left in internal buffers.
flush()~
Finish the compression process and return what is left in internal
buffers. You must not use the compressor object after calling this method.
BZ2Decompressor()~
Create a new decompressor object. This object may be used to decompress data
sequentially. If you want to decompress data in one shot, use the
decompress function instead.
decompress(data)~
Provide more data to the decompressor object. It will return chunks of
decompressed data whenever possible. If you try to decompress data after
the end of stream is found, EOFError will be raised. If any data
was found after the end of stream, it'll be ignored and saved in the
unused_data attribute.
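The sequential classes could be used along these lines. The chunks and the
trailing bytes are invented sample data; the whole compressed stream is handed
to the decompressor in one call to keep the sketch short.

```python
import bz2

chunks = [b"spam " * 100, b"eggs " * 100, b"ham " * 100]

compressor = bz2.BZ2Compressor()
# compress() may return empty strings while data is still buffered.
pieces = [compressor.compress(chunk) for chunk in chunks]
pieces.append(compressor.flush())    # the compressor is finished after this
compressed = b"".join(pieces)

decompressor = bz2.BZ2Decompressor()
# Bytes after the end of the stream are not decompressed; they are kept
# in the unused_data attribute.
restored = decompressor.decompress(compressed + b"trailing")

print(restored == b"".join(chunks))
print(decompressor.unused_data)
```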
One-shot (de)compression
------------------------
One-shot compression and decompression is provided through the compress
and decompress functions.
compress(data[, compresslevel])~
Compress {data} in one shot. If you want to compress data sequentially, use
an instance of BZ2Compressor instead. The {compresslevel} parameter,
if given, must be a number between ``1`` and ``9``; the default is ``9``.
decompress(data)~
Decompress {data} in one shot. If you want to decompress data sequentially,
use an instance of BZ2Decompressor instead.
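For data that fits in memory, the one-shot functions are enough. A minimal
sketch with arbitrary, highly repetitive sample data:

```python
import bz2

original = b"a" * 1000 + b"b" * 1000

fast = bz2.compress(original, 1)     # compresslevel 1
best = bz2.compress(original)        # default compresslevel 9

print(len(best) < len(original))
print(bz2.decompress(fast) == original)
print(bz2.decompress(best) == original)
```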
==============================================================================
*py2stdlib-buildtools*
buildtools~
:platform: Mac
:synopsis: Helper module for BuildApplet, BuildApplication and macfreeze.
:deprecated:
.. deprecated:: 2.4
cfmfile (|py2stdlib-cfmfile|) --- Code Fragment Resource module
------------------------------------------------
==============================================================================
*py2stdlib-calendar*
calendar~
:synopsis: Functions for working with calendars, including some emulation of the Unix cal
program.
This module allows you to output calendars like the Unix cal program,
and provides additional useful functions related to the calendar. By default,
these calendars have Monday as the first day of the week, and Sunday as the last
(the European convention). Use setfirstweekday to set the first day of
the week to Sunday (6) or to any other weekday. Parameters that specify dates
are given as integers. For related
functionality, see also the datetime (|py2stdlib-datetime|) and time (|py2stdlib-time|) modules.
Most of these functions and classes rely on the datetime (|py2stdlib-datetime|) module which
uses an idealized calendar, the current Gregorian calendar indefinitely extended
in both directions. This matches the definition of the "proleptic Gregorian"
calendar in Dershowitz and Reingold's book "Calendrical Calculations", where
it's the base calendar for all computations.
Calendar([firstweekday])~
Creates a Calendar object. {firstweekday} is an integer specifying the
first day of the week. ``0`` is Monday (the default), ``6`` is Sunday.
A Calendar object provides several methods that can be used for
preparing the calendar data for formatting. This class doesn't do any formatting
itself. This is the job of subclasses.
.. versionadded:: 2.5
Calendar instances have the following methods:
iterweekdays()~
Return an iterator for the week day numbers that will be used for one
week. The first value from the iterator will be the same as the value of
the firstweekday property.
itermonthdates(year, month)~
Return an iterator for the month {month} (1-12) in the year {year}. This
iterator will return all days (as datetime.date objects) for the
month and all days before the start of the month or after the end of the
month that are required to get a complete week.
itermonthdays2(year, month)~
Return an iterator for the month {month} in the year {year} similar to
itermonthdates. Days returned will be tuples consisting of a day
number and a week day number.
itermonthdays(year, month)~
Return an iterator for the month {month} in the year {year} similar to
itermonthdates. Days returned will simply be day numbers.
monthdatescalendar(year, month)~
Return a list of the weeks in the month {month} of the {year} as full
weeks. Weeks are lists of seven datetime.date objects.
monthdays2calendar(year, month)~
Return a list of the weeks in the month {month} of the {year} as full
weeks. Weeks are lists of seven tuples of day numbers and weekday
numbers.
monthdayscalendar(year, month)~
Return a list of the weeks in the month {month} of the {year} as full
weeks. Weeks are lists of seven day numbers.
yeardatescalendar(year[, width])~
Return the data for the specified year ready for formatting. The return
value is a list of month rows. Each month row contains up to {width}
months (defaulting to 3). Each month contains between 4 and 6 weeks and
each week contains 1--7 days. Days are datetime.date objects.
yeardays2calendar(year[, width])~
Return the data for the specified year ready for formatting (similar to
yeardatescalendar). Entries in the week lists are tuples of day
numbers and weekday numbers. Day numbers outside this month are zero.
yeardayscalendar(year[, width])~
Return the data for the specified year ready for formatting (similar to
yeardatescalendar). Entries in the week lists are day numbers. Day
numbers outside this month are zero.
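To give a feel for the data these methods return, the sketch below inspects
November 2010, a month that starts on a Monday. The choice of month is
arbitrary.

```python
import calendar

cal = calendar.Calendar()                 # weeks start on Monday (0)
weeks = cal.monthdayscalendar(2010, 11)   # November 2010
for week in weeks:
    print(week)                           # days outside the month are 0

# itermonthdates pads the month out to complete weeks.
dates = list(cal.itermonthdates(2010, 11))
print(len(dates))

sunday_first = calendar.Calendar(6)       # weeks start on Sunday
print(list(sunday_first.iterweekdays()))
```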
TextCalendar([firstweekday])~
This class can be used to generate plain text calendars.
.. versionadded:: 2.5
TextCalendar instances have the following methods:
formatmonth(theyear, themonth[, w[, l]])~
Return a month's calendar in a multi-line string. If {w} is provided, it
specifies the width of the date columns, which are centered. If {l} is
given, it specifies the number of lines that each week will use. Depends
on the first weekday as specified in the constructor or set by the
setfirstweekday method.
prmonth(theyear, themonth[, w[, l]])~
Print a month's calendar as returned by formatmonth.
formatyear(theyear[, w[, l[, c[, m]]]])~
Return a {m}-column calendar for an entire year as a multi-line string.
Optional parameters {w}, {l}, and {c} are for date column width, lines per
week, and number of spaces between month columns, respectively. Depends on
the first weekday as specified in the constructor or set by the
setfirstweekday method. The earliest year for which a calendar
can be generated is platform-dependent.
pryear(theyear[, w[, l[, c[, m]]]])~
Print the calendar for an entire year as returned by formatyear.
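A small sketch of formatmonth with and without the optional {w} argument. The
month is arbitrary, and the month and weekday names in the output depend on
the current locale.

```python
import calendar

text_cal = calendar.TextCalendar(calendar.SUNDAY)   # weeks start on Sunday
month_text = text_cal.formatmonth(2010, 11)         # default column width
wide_text = text_cal.formatmonth(2010, 11, 3)       # w=3: wider date columns

print(month_text)
print(wide_text)
```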
HTMLCalendar([firstweekday])~
This class can be used to generate HTML calendars.
.. versionadded:: 2.5
HTMLCalendar instances have the following methods:
formatmonth(theyear, themonth[, withyear])~
Return a month's calendar as an HTML table. If {withyear} is true the year
will be included in the header, otherwise just the month name will be
used.
formatyear(theyear[, width])~
Return a year's calendar as an HTML table. {width} (defaulting to 3)
specifies the number of months per row.
formatyearpage(theyear[, width[, css[, encoding]]])~
Return a year's calendar as a complete HTML page. {width} (defaulting to
3) specifies the number of months per row. {css} is the name for the
cascading style sheet to be used. None can be passed if no style
sheet should be used. {encoding} specifies the encoding to be used for the
output (defaulting to the system default encoding).
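The {withyear} flag of formatmonth can be illustrated as follows, again using
an arbitrary month:

```python
import calendar

html_cal = calendar.HTMLCalendar()
with_year = html_cal.formatmonth(2010, 11)                      # header shows the year
without_year = html_cal.formatmonth(2010, 11, withyear=False)   # month name only

print(with_year)
```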
LocaleTextCalendar([firstweekday[, locale]])~
This subclass of TextCalendar can be passed a locale name in the
constructor and will return month and weekday names in the specified
locale. If this locale includes an encoding all strings containing month and
weekday names will be returned as unicode.
.. versionadded:: 2.5
LocaleHTMLCalendar([firstweekday[, locale]])~
This subclass of HTMLCalendar can be passed a locale name in the
constructor and will return month and weekday names in the specified
locale. If this locale includes an encoding all strings containing month and
weekday names will be returned as unicode.
.. versionadded:: 2.5
For simple text calendars this module provides the following functions.
setfirstweekday(weekday)~
Sets the weekday (``0`` is Monday, ``6`` is Sunday) to start each week. The
values MONDAY, TUESDAY, WEDNESDAY, THURSDAY,
FRIDAY, SATURDAY, and SUNDAY are provided for
convenience. For example, to set the first weekday to Sunday:: >
import calendar
calendar.setfirstweekday(calendar.SUNDAY)
<
.. versionadded:: 2.0
firstweekday()~
Returns the current setting for the weekday to start each week.
.. versionadded:: 2.0
isleap(year)~
Returns True if {year} is a leap year, otherwise False.
leapdays(y1, y2)~
Returns the number of leap years in the range from {y1} to {y2} (exclusive),
where {y1} and {y2} are years.
.. versionchanged:: 2.0
This function didn't work for ranges spanning a century change in Python
1.5.2.
weekday(year, month, day)~
Returns the day of the week (``0`` is Monday) for {year} (``1970``--...),
{month} (``1``--``12``), {day} (``1``--``31``).
weekheader(n)~
Return a header containing abbreviated weekday names. {n} specifies the width in
characters for one weekday.
monthrange(year, month)~
Returns weekday of first day of the month and number of days in month, for the
specified {year} and {month}.
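A few sample calls to the date helpers described above; the years and dates
are chosen only for illustration.

```python
import calendar

print(calendar.isleap(2000))            # divisible by 400: a leap year
print(calendar.isleap(1900))            # divisible by 100 but not by 400
print(calendar.leapdays(2000, 2011))    # counts 2000, 2004 and 2008

# February 2012 starts on a Wednesday and has 29 days.
first_weekday, days = calendar.monthrange(2012, 2)
print(first_weekday == calendar.WEDNESDAY)
print(days)

print(calendar.weekday(2010, 11, 15) == calendar.MONDAY)
```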
monthcalendar(year, month)~
Returns a matrix representing a month's calendar. Each row represents a week;
days outside of the month are represented by zeros. Each week begins with Monday
unless set by setfirstweekday.
prmonth(theyear, themonth[, w[, l]])~
Prints a month's calendar as returned by month.
month(theyear, themonth[, w[, l]])~
Returns a month's calendar in a multi-line string using the formatmonth
of the TextCalendar class.
.. versionadded:: 2.0
prcal(year[, w[, l[, c]]])~
Prints the calendar for an entire year as returned by calendar (|py2stdlib-calendar|).
calendar(year[, w[, l[, c]]])~
Returns a 3-column calendar for an entire year as a multi-line string using the
formatyear of the TextCalendar class.
.. versionadded:: 2.0
timegm(tuple)~
An unrelated but handy function that takes a time tuple such as returned by the
gmtime function in the time (|py2stdlib-time|) module, and returns the corresponding
Unix timestamp value, assuming an epoch of 1970, and the POSIX encoding. In
fact, time.gmtime and timegm are each other's inverse.
.. versionadded:: 2.0
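The inverse relationship between time.gmtime and timegm can be checked with a
round trip. This is a minimal sketch; the timestamp and the variable names are
chosen only for illustration.

```python
import calendar
import time

stamp = 1283299200   # 2010-09-01 00:00:00 UTC

# gmtime turns the timestamp into a UTC time tuple ...
utc_tuple = time.gmtime(stamp)

# ... and timegm turns that tuple back into the same timestamp.
round_trip = calendar.timegm(utc_tuple)

# A plain (year, month, day, hour, minute, second) tuple works as well.
explicit = calendar.timegm((2010, 9, 1, 0, 0, 0))

print(round_trip == stamp, explicit == stamp)
```

Unlike time.mktime, which interprets its argument as local time, timegm always
treats the tuple as UTC.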
The calendar (|py2stdlib-calendar|) module exports the following data attributes:
day_name~
An array that represents the days of the week in the current locale.
day_abbr~
An array that represents the abbreviated days of the week in the current locale.
month_name~
An array that represents the months of the year in the current locale. This
follows normal convention of January being month number 1, so it has a length of
13 and ``month_name[0]`` is the empty string.
month_abbr~
An array that represents the abbreviated months of the year in the current
locale. This follows normal convention of January being month number 1, so it
has a length of 13 and ``month_abbr[0]`` is the empty string.
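The empty first entry means that a month number can be used directly as an
index. A small sketch follows; the names real_months and september are
illustrative, and the printed names depend on the current locale.

```python
import calendar

# Index 0 is a placeholder, so month numbers 1..12 index the names directly.
placeholder = calendar.month_name[0]
september = calendar.month_name[9]

# Slice off the placeholder to get only the twelve real month names.
real_months = calendar.month_name[1:]

# The weekday arrays have no placeholder; index 0 is Monday.
monday = calendar.day_name[calendar.MONDAY]

print(september, monday)
```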
.. seealso::
Module datetime (|py2stdlib-datetime|)
Object-oriented interface to dates and times with similar functionality to the
time (|py2stdlib-time|) module.
Module time (|py2stdlib-time|)
Low-level time related functions.
==============================================================================
*py2stdlib-carbon.ae*
Carbon.AE~
:platform: Mac
:synopsis: Interface to the Apple Events toolbox.
:deprecated:
Carbon.AH (|py2stdlib-carbon.ah|) --- Apple Help
===============================
==============================================================================
*py2stdlib-carbon.ah*
Carbon.AH~
:platform: Mac
:synopsis: Interface to the Apple Help manager.
:deprecated:
Carbon.App (|py2stdlib-carbon.app|) --- Appearance Manager
========================================
==============================================================================
*py2stdlib-carbon.app*
Carbon.App~
:platform: Mac
:synopsis: Interface to the Appearance Manager.
:deprecated:
Carbon.Appearance (|py2stdlib-carbon.appearance|) --- Appearance Manager constants
=========================================================
==============================================================================
*py2stdlib-carbon.appearance*
Carbon.Appearance~
:platform: Mac
:synopsis: Constant definitions for the interface to the Appearance Manager.
:deprecated:
Carbon.CF (|py2stdlib-carbon.cf|) --- Core Foundation
====================================
==============================================================================
*py2stdlib-carbon.cf*
Carbon.CF~
:platform: Mac
:synopsis: Interface to the Core Foundation.
:deprecated:
The ``CFBase``, ``CFArray``, ``CFData``, ``CFDictionary``, ``CFString`` and
``CFURL`` objects are supported, some only partially.
Carbon.CG (|py2stdlib-carbon.cg|) --- Core Graphics
==================================
==============================================================================
*py2stdlib-carbon.cg*
Carbon.CG~
:platform: Mac
:synopsis: Interface to Core Graphics.
:deprecated:
Carbon.CarbonEvt (|py2stdlib-carbon.carbonevt|) --- Carbon Event Manager
================================================
==============================================================================
*py2stdlib-carbon.carbonevt*
Carbon.CarbonEvt~
:platform: Mac
:synopsis: Interface to the Carbon Event Manager.
:deprecated:
Carbon.CarbonEvents (|py2stdlib-carbon.carbonevents|) --- Carbon Event Manager constants
=============================================================
==============================================================================
*py2stdlib-carbon.carbonevents*
Carbon.CarbonEvents~
:platform: Mac
:synopsis: Constants for the interface to the Carbon Event Manager.
:deprecated:
Carbon.Cm (|py2stdlib-carbon.cm|) --- Component Manager
======================================
==============================================================================
*py2stdlib-carbon.cm*
Carbon.Cm~
:platform: Mac
:synopsis: Interface to the Component Manager.
:deprecated:
Carbon.Components (|py2stdlib-carbon.components|) --- Component Manager constants
========================================================
==============================================================================
*py2stdlib-carbon.components*
Carbon.Components~
:platform: Mac
:synopsis: Constants for the interface to the Component Manager.
:deprecated:
Carbon.ControlAccessor (|py2stdlib-carbon.controlaccessor|) --- Control Manager accessors
==========================================================
==============================================================================
*py2stdlib-carbon.controlaccessor*
Carbon.ControlAccessor~
:platform: Mac
:synopsis: Accessor functions for the interface to the Control Manager.
:deprecated:
Carbon.Controls (|py2stdlib-carbon.controls|) --- Control Manager constants
====================================================
==============================================================================
*py2stdlib-carbon.controls*
Carbon.Controls~
:platform: Mac
:synopsis: Constants for the interface to the Control Manager.
:deprecated:
Carbon.CoreFounation (|py2stdlib-carbon.corefounation|) --- CoreFounation constants
=======================================================
==============================================================================
*py2stdlib-carbon.corefounation*
Carbon.CoreFounation~
:platform: Mac
:synopsis: Constants for the interface to CoreFoundation.
:deprecated:
Carbon.CoreGraphics (|py2stdlib-carbon.coregraphics|) --- CoreGraphics constants
=====================================================
==============================================================================
*py2stdlib-carbon.coregraphics*
Carbon.CoreGraphics~
:platform: Mac
:synopsis: Constants for the interface to CoreGraphics.
:deprecated:
Carbon.Ctl (|py2stdlib-carbon.ctl|) --- Control Manager
=====================================
==============================================================================
*py2stdlib-carbon.ctl*
Carbon.Ctl~
:platform: Mac
:synopsis: Interface to the Control Manager.
:deprecated:
Carbon.Dialogs (|py2stdlib-carbon.dialogs|) --- Dialog Manager constants
==================================================
==============================================================================
*py2stdlib-carbon.dialogs*
Carbon.Dialogs~
:platform: Mac
:synopsis: Constants for the interface to the Dialog Manager.
:deprecated:
Carbon.Dlg (|py2stdlib-carbon.dlg|) --- Dialog Manager
====================================
==============================================================================
*py2stdlib-carbon.dlg*
Carbon.Dlg~
:platform: Mac
:synopsis: Interface to the Dialog Manager.
:deprecated:
Carbon.Drag (|py2stdlib-carbon.drag|) --- Drag and Drop Manager
============================================
==============================================================================
*py2stdlib-carbon.drag*
Carbon.Drag~
:platform: Mac
:synopsis: Interface to the Drag and Drop Manager.
:deprecated:
Carbon.Dragconst (|py2stdlib-carbon.dragconst|) --- Drag and Drop Manager constants
===========================================================
==============================================================================
*py2stdlib-carbon.dragconst*
Carbon.Dragconst~
:platform: Mac
:synopsis: Constants for the interface to the Drag and Drop Manager.
:deprecated:
Carbon.Events (|py2stdlib-carbon.events|) --- Event Manager constants
================================================
==============================================================================
*py2stdlib-carbon.events*
Carbon.Events~
:platform: Mac
:synopsis: Constants for the interface to the classic Event Manager.
:deprecated:
Carbon.Evt (|py2stdlib-carbon.evt|) --- Event Manager
===================================
==============================================================================
*py2stdlib-carbon.evt*
Carbon.Evt~
:platform: Mac
:synopsis: Interface to the classic Event Manager.
:deprecated:
Carbon.File (|py2stdlib-carbon.file|) --- File Manager
===================================
==============================================================================
*py2stdlib-carbon.file*
Carbon.File~
:platform: Mac
:synopsis: Interface to the File Manager.
:deprecated:
Carbon.Files (|py2stdlib-carbon.files|) --- File Manager constants
==============================================
==============================================================================
*py2stdlib-carbon.files*
Carbon.Files~
:platform: Mac
:synopsis: Constants for the interface to the File Manager.
:deprecated:
Carbon.Fm (|py2stdlib-carbon.fm|) --- Font Manager
=================================
==============================================================================
*py2stdlib-carbon.fm*
Carbon.Fm~
:platform: Mac
:synopsis: Interface to the Font Manager.
:deprecated:
Carbon.Folder (|py2stdlib-carbon.folder|) --- Folder Manager
=======================================
==============================================================================
*py2stdlib-carbon.folder*
Carbon.Folder~
:platform: Mac
:synopsis: Interface to the Folder Manager.
:deprecated:
Carbon.Folders (|py2stdlib-carbon.folders|) --- Folder Manager constants
==================================================
==============================================================================
*py2stdlib-carbon.folders*
Carbon.Folders~
:platform: Mac
:synopsis: Constants for the interface to the Folder Manager.
:deprecated:
Carbon.Fonts (|py2stdlib-carbon.fonts|) --- Font Manager constants
==============================================
==============================================================================
*py2stdlib-carbon.fonts*
Carbon.Fonts~
:platform: Mac
:synopsis: Constants for the interface to the Font Manager.
:deprecated:
Carbon.Help (|py2stdlib-carbon.help|) --- Help Manager
===================================
==============================================================================
*py2stdlib-carbon.help*
Carbon.Help~
:platform: Mac
:synopsis: Interface to the Carbon Help Manager.
:deprecated:
Carbon.IBCarbon (|py2stdlib-carbon.ibcarbon|) --- Carbon InterfaceBuilder
==================================================
==============================================================================
*py2stdlib-carbon.ibcarbon*
Carbon.IBCarbon~
:platform: Mac
:synopsis: Interface to the Carbon InterfaceBuilder support libraries.
:deprecated:
Carbon.IBCarbonRuntime (|py2stdlib-carbon.ibcarbonruntime|) --- Carbon InterfaceBuilder constants
===================================================================
==============================================================================
*py2stdlib-carbon.ibcarbonruntime*
Carbon.IBCarbonRuntime~
:platform: Mac
:synopsis: Constants for the interface to the Carbon InterfaceBuilder support libraries.
:deprecated:
Carbon.Icn --- Carbon Icon Manager
=========================================
==============================================================================
*py2stdlib-carbon.icns*
Carbon.Icns~
:platform: Mac
:synopsis: Interface to the Carbon Icon Manager
:deprecated:
Carbon.Icons (|py2stdlib-carbon.icons|) --- Carbon Icon Manager constants
=====================================================
==============================================================================
*py2stdlib-carbon.icons*
Carbon.Icons~
:platform: Mac
:synopsis: Constants for the interface to the Carbon Icon Manager
:deprecated:
Carbon.Launch (|py2stdlib-carbon.launch|) --- Carbon Launch Services
===============================================
==============================================================================
*py2stdlib-carbon.launch*
Carbon.Launch~
:platform: Mac
:synopsis: Interface to the Carbon Launch Services.
:deprecated:
Carbon.LaunchServices (|py2stdlib-carbon.launchservices|) --- Carbon Launch Services constants
=================================================================
==============================================================================
*py2stdlib-carbon.launchservices*
Carbon.LaunchServices~
:platform: Mac
:synopsis: Constants for the interface to the Carbon Launch Services.
:deprecated:
Carbon.List (|py2stdlib-carbon.list|) --- List Manager
===================================
==============================================================================
*py2stdlib-carbon.list*
Carbon.List~
:platform: Mac
:synopsis: Interface to the List Manager.
:deprecated:
Carbon.Lists (|py2stdlib-carbon.lists|) --- List Manager constants
==============================================
==============================================================================
*py2stdlib-carbon.lists*
Carbon.Lists~
:platform: Mac
:synopsis: Constants for the interface to the List Manager.
:deprecated:
Carbon.MacHelp (|py2stdlib-carbon.machelp|) --- Help Manager constants
================================================
==============================================================================
*py2stdlib-carbon.machelp*
Carbon.MacHelp~
:platform: Mac
:synopsis: Constants for the interface to the Carbon Help Manager.
:deprecated:
Carbon.MediaDescr (|py2stdlib-carbon.mediadescr|) --- Parsers and generators for Quicktime Media descriptors
===================================================================================
==============================================================================
*py2stdlib-carbon.mediadescr*
Carbon.MediaDescr~
:platform: Mac
:synopsis: Parsers and generators for Quicktime Media descriptors
:deprecated:
Carbon.Menu (|py2stdlib-carbon.menu|) --- Menu Manager
===================================
==============================================================================
*py2stdlib-carbon.menu*
Carbon.Menu~
:platform: Mac
:synopsis: Interface to the Menu Manager.
:deprecated:
Carbon.Menus (|py2stdlib-carbon.menus|) --- Menu Manager constants
==============================================
==============================================================================
*py2stdlib-carbon.menus*
Carbon.Menus~
:platform: Mac
:synopsis: Constants for the interface to the Menu Manager.
:deprecated:
Carbon.Mlte (|py2stdlib-carbon.mlte|) --- MultiLingual Text Editor
===============================================
==============================================================================
*py2stdlib-carbon.mlte*
Carbon.Mlte~
:platform: Mac
:synopsis: Interface to the MultiLingual Text Editor.
:deprecated:
Carbon.OSA (|py2stdlib-carbon.osa|) --- Carbon OSA Interface
==========================================
==============================================================================
*py2stdlib-carbon.osa*
Carbon.OSA~
:platform: Mac
:synopsis: Interface to the Carbon OSA Library.
:deprecated:
Carbon.OSAconst (|py2stdlib-carbon.osaconst|) --- Carbon OSA Interface constants
=========================================================
==============================================================================
*py2stdlib-carbon.osaconst*
Carbon.OSAconst~
:platform: Mac
:synopsis: Constants for the interface to the Carbon OSA Library.
:deprecated:
Carbon.QDOffscreen (|py2stdlib-carbon.qdoffscreen|) --- QuickDraw Offscreen constants
===========================================================
==============================================================================
*py2stdlib-carbon.qdoffscreen*
Carbon.QDOffscreen~
:platform: Mac
:synopsis: Constants for the interface to the QuickDraw Offscreen APIs.
:deprecated:
Carbon.Qd (|py2stdlib-carbon.qd|) --- QuickDraw
==============================
==============================================================================
*py2stdlib-carbon.qd*
Carbon.Qd~
:platform: Mac
:synopsis: Interface to the QuickDraw toolbox.
:deprecated:
Carbon.Qdoffs (|py2stdlib-carbon.qdoffs|) --- QuickDraw Offscreen
============================================
==============================================================================
*py2stdlib-carbon.qdoffs*
Carbon.Qdoffs~
:platform: Mac
:synopsis: Interface to the QuickDraw Offscreen APIs.
:deprecated:
Carbon.Qt (|py2stdlib-carbon.qt|) --- QuickTime
==============================
==============================================================================
*py2stdlib-carbon.qt*
Carbon.Qt~
:platform: Mac
:synopsis: Interface to the QuickTime toolbox.
:deprecated:
Carbon.QuickDraw (|py2stdlib-carbon.quickdraw|) --- QuickDraw constants
===============================================
==============================================================================
*py2stdlib-carbon.quickdraw*
Carbon.QuickDraw~
:platform: Mac
:synopsis: Constants for the interface to the QuickDraw toolbox.
:deprecated:
Carbon.QuickTime (|py2stdlib-carbon.quicktime|) --- QuickTime constants
===============================================
==============================================================================
*py2stdlib-carbon.quicktime*
Carbon.QuickTime~
:platform: Mac
:synopsis: Constants for the interface to the QuickTime toolbox.
:deprecated:
Carbon.Res (|py2stdlib-carbon.res|) --- Resource Manager and Handles
==================================================
==============================================================================
*py2stdlib-carbon.res*
Carbon.Res~
:platform: Mac
:synopsis: Interface to the Resource Manager and Handles.
:deprecated:
Carbon.Resources (|py2stdlib-carbon.resources|) --- Resource Manager and Handles constants
==================================================================
==============================================================================
*py2stdlib-carbon.resources*
Carbon.Resources~
:platform: Mac
:synopsis: Constants for the interface to the Resource Manager and Handles.
:deprecated:
Carbon.Scrap (|py2stdlib-carbon.scrap|) --- Scrap Manager
=====================================
==============================================================================
*py2stdlib-carbon.scrap*
Carbon.Scrap~
:platform: Mac
:synopsis: The Scrap Manager provides basic services for implementing cut & paste and
clipboard operations.
:deprecated:
This module is only fully available on Mac OS 9 and earlier under classic PPC
MacPython. Very limited functionality is available under Carbon MacPython.
.. index:: single: Scrap Manager
The Scrap Manager supports the simplest form of cut & paste operations on the
Macintosh. It can be used for both inter- and intra-application clipboard
operations.
The Scrap module provides low-level access to the functions of the Scrap
Manager. It contains the following functions:
InfoScrap()~
Return current information about the scrap. The information is encoded as a
tuple containing the fields ``(size, handle, count, state, path)``.
+----------+---------------------------------------------+
| Field    | Meaning                                     |
+==========+=============================================+
| {size}   | Size of the scrap in bytes.                 |
+----------+---------------------------------------------+
| {handle} | Resource object representing the scrap.     |
+----------+---------------------------------------------+
| {count}  | Serial number of the scrap contents.        |
+----------+---------------------------------------------+
| {state}  | Integer; positive if in memory, ``0`` if on |
|          | disk, negative if uninitialized.            |
+----------+---------------------------------------------+
| {path}   | Filename of the scrap when stored on disk.  |
+----------+---------------------------------------------+
.. seealso::
`Scrap Manager <http://developer.apple.com/documentation/mac/MoreToolbox/MoreToolbox-109.html>`_
Apple's documentation for the Scrap Manager gives a lot of useful information
about using the Scrap Manager in applications.
Carbon.Snd (|py2stdlib-carbon.snd|) --- Sound Manager
===================================
==============================================================================
*py2stdlib-carbon.snd*
Carbon.Snd~
:platform: Mac
:synopsis: Interface to the Sound Manager.
:deprecated:
Carbon.Sound (|py2stdlib-carbon.sound|) --- Sound Manager constants
===============================================
==============================================================================
*py2stdlib-carbon.sound*
Carbon.Sound~
:platform: Mac
:synopsis: Constants for the interface to the Sound Manager.
:deprecated:
Carbon.TE (|py2stdlib-carbon.te|) --- TextEdit
=============================
==============================================================================
*py2stdlib-carbon.te*
Carbon.TE~
:platform: Mac
:synopsis: Interface to TextEdit.
:deprecated:
Carbon.TextEdit (|py2stdlib-carbon.textedit|) --- TextEdit constants
=============================================
==============================================================================
*py2stdlib-carbon.textedit*
Carbon.TextEdit~
:platform: Mac
:synopsis: Constants for the interface to TextEdit.
:deprecated:
Carbon.Win (|py2stdlib-carbon.win|) --- Window Manager
====================================
==============================================================================
*py2stdlib-carbon.win*
Carbon.Win~
:platform: Mac
:synopsis: Interface to the Window Manager.
:deprecated:
Carbon.Windows (|py2stdlib-carbon.windows|) --- Window Manager constants
==================================================
==============================================================================
*py2stdlib-carbon.windows*
Carbon.Windows~
:platform: Mac
:synopsis: Constants for the interface to the Window Manager.
:deprecated:
==============================================================================
*py2stdlib-cd*
cd~
:platform: IRIX
:synopsis: Interface to the CD-ROM on Silicon Graphics systems.
:deprecated:
2.6~
The cd (|py2stdlib-cd|) module has been deprecated for removal in Python 3.0.
This module provides an interface to the Silicon Graphics CD library. It is
available only on Silicon Graphics systems.
The way the library works is as follows. A program opens the CD-ROM device with
open and creates a parser to parse the data from the CD with
createparser. The object returned by open can be used to read
data from the CD, but also to get status information for the CD-ROM device, and
to get information about the CD, such as the table of contents. Data from the
CD is passed to the parser, which parses the frames, and calls any callback
functions that have previously been added.
An audio CD is divided into tracks or programs (the terms are used
interchangeably). Tracks can be subdivided into indices. An audio CD
contains a table of contents which gives the starts of the tracks on the
CD. Index 0 is usually the pause before the start of a track. The start of the
track as given by the table of contents is normally the start of index 1.
Positions on a CD can be represented in two ways: as a frame number, or as a
tuple of three values (minutes, seconds and frames). Most functions use the
latter representation. Positions can be given relative to the beginning of the
CD or relative to the beginning of the track.
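The relationship between the two representations can be sketched in plain
Python. This is not the cd module's own implementation: the helper names
msf_to_frames and frames_to_msf are hypothetical, and the sketch assumes the
standard audio CD rate of 75 frames per second with no drive-specific offset.

```python
FRAMES_PER_SECOND = 75   # assumption: standard audio CD frame rate


def msf_to_frames(minutes, seconds, frames):
    """Collapse a (minutes, seconds, frames) triple into one frame count."""
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def frames_to_msf(total):
    """Split a frame count back into a (minutes, seconds, frames) triple."""
    seconds, frames = divmod(total, FRAMES_PER_SECOND)
    minutes, seconds = divmod(seconds, 60)
    return (minutes, seconds, frames)


position = (3, 25, 10)                # 3 minutes, 25 seconds, 10 frames
total = msf_to_frames(*position)      # (3 * 60 + 25) * 75 + 10
print(total, frames_to_msf(total))
```

On a real Silicon Graphics system the module's msftoframe function should be
preferred, since it reflects the conventions of the underlying CD library.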
Module cd (|py2stdlib-cd|) defines the following functions and constants:
createparser()~
Create and return an opaque parser object. The methods of the parser object are
described below.
msftoframe(minutes, seconds, frames)~
Converts a ``(minutes, seconds, frames)`` triple representing time in absolute
time code into the corresponding CD frame number.
open([device[, mode]])~
Open the CD-ROM device. The return value is an opaque player object; methods of
the player object are described below. The device is the name of the SCSI
device file, e.g. ``'/dev/scsi/sc0d4l0'``, or ``None``. If omitted or ``None``,
the hardware inventory is consulted to locate a CD-ROM drive. The {mode}, if
not omitted, should be the string ``'r'``.
The module defines the following variables:
error~
Exception raised on various errors.
DATASIZE~
The size of one frame's worth of audio data. This is the size of the audio data
as passed to the callback of type ``audio``.
BLOCKSIZE~
The size of one uninterpreted frame of audio data.
The following variables are states as returned by getstatus:
READY~
The drive is ready for operation loaded with an audio CD.
NODISC~
The drive does not have a CD loaded.
CDROM~
The drive is loaded with a CD-ROM. Subsequent play or read operations will
return I/O errors.
ERROR~
An error occurred while trying to read the disc or its table of contents.
PLAYING~
The drive is in CD player mode playing an audio CD through its audio jacks.
PAUSED~
The drive is in CD player mode with play paused.
STILL~
The equivalent of PAUSED on older (non 3301) model Toshiba CD-ROM
drives. Such drives have never been shipped by SGI.
audio~
pnum
index
ptime
atime
catalog
ident
control
Integer constants describing the various types of parser callbacks that can be
set by the addcallback method of CD parser objects (see below).
Player Objects
--------------
Player objects (returned by open) have the following methods:
CD player.allowremoval()~
Unlocks the eject button on the CD-ROM drive permitting the user to eject the
caddy if desired.
CD player.bestreadsize()~
Returns the best value to use for the {num_frames} parameter of the
readda method. Best is defined as the value that permits a continuous
flow of data from the CD-ROM drive.
CD player.close()~
Frees the resources associated with the player object. After calling
close, the methods of the object should no longer be used.
CD player.eject()~
Ejects the caddy from the CD-ROM drive.
CD player.getstatus()~
Returns information pertaining to the current state of the CD-ROM drive. The
returned information is a tuple with the following values: {state}, {track},
{rtime}, {atime}, {ttime}, {first}, {last}, {scsi_audio}, {cur_block}. {rtime}
is the time relative to the start of the current track; {atime} is the time
relative to the beginning of the disc; {ttime} is the total time on the disc.
For more information on the meaning of the values, see the man page
CDgetstatus(3dm). The value of {state} is one of the following:
ERROR, NODISC, READY, PLAYING,
PAUSED, STILL, or CDROM.
CD player.gettrackinfo(track)~
Returns information about the specified track. The returned information is a
tuple consisting of two elements, the start time of the track and the duration
of the track.
CD player.msftoblock(min, sec, frame)~
Converts a minutes, seconds, frames triple representing a time in absolute time
code into the corresponding logical block number for the given CD-ROM drive.
You should use msftoframe rather than msftoblock for comparing
times. The logical block number differs from the frame number by an offset
required by certain CD-ROM drives.
CD player.play(start, play)~
Starts playback of an audio CD in the CD-ROM drive at the specified track. The
audio output appears on the CD-ROM drive's headphone and audio jacks (if
fitted). Play stops at the end of the disc. {start} is the number of the track
at which to start playing the CD; if {play} is 0, the CD will be set to an
initial paused state. The method togglepause can then be used to
commence play.
CD player.playabs(minutes, seconds, frames, play)~
Like play, except that the start is given in minutes, seconds, and
frames instead of a track number.
CD player.playtrack(start, play)~
Like play, except that playing stops at the end of the track.
CD player.playtrackabs(track, minutes, seconds, frames, play)~
Like play, except that playing begins at the specified absolute time and
ends at the end of the specified track.
CD player.preventremoval()~
Locks the eject button on the CD-ROM drive thus preventing the user from
arbitrarily ejecting the caddy.
CD player.readda(num_frames)~
Reads the specified number of frames from an audio CD mounted in the CD-ROM
drive. The return value is a string representing the audio frames. This string
can be passed unaltered to the parseframe method of the parser object.
CD player.seek(minutes, seconds, frames)~
Sets the pointer that indicates the starting point of the next read of digital
audio data from a CD-ROM. The pointer is set to an absolute time code location
specified in {minutes}, {seconds}, and {frames}. The return value is the
logical block number to which the pointer has been set.
CD player.seekblock(block)~
Sets the pointer that indicates the starting point of the next read of digital
audio data from a CD-ROM. The pointer is set to the specified logical block
number. The return value is the logical block number to which the pointer has
been set.
CD player.seektrack(track)~
Sets the pointer that indicates the starting point of the next read of digital
audio data from a CD-ROM. The pointer is set to the specified track. The
return value is the logical block number to which the pointer has been set.
CD player.stop()~
Stops the current playing operation.
CD player.togglepause()~
Pauses the CD if it is playing, and makes it play if it is paused.
Parser Objects
--------------
Parser objects (returned by createparser) have the following methods:
CD parser.addcallback(type, func, arg)~
Adds a callback for the parser. The parser has callbacks for eight different
types of data in the digital audio data stream. Constants for these types are
defined at the cd (|py2stdlib-cd|) module level (see above). The callback is called as
follows: ``func(arg, type, data)``, where {arg} is the user supplied argument,
{type} is the particular type of callback, and {data} is the data returned for
this {type} of callback. The type of the data depends on the {type} of callback
as follows:
+-------------+---------------------------------------------+
| Type        | Value                                       |
+=============+=============================================+
| ``audio``   | String which can be passed unmodified to    |
|             | al.writesamps.                              |
+-------------+---------------------------------------------+
| ``pnum``    | Integer giving the program (track) number.  |
+-------------+---------------------------------------------+
| ``index``   | Integer giving the index number.            |
+-------------+---------------------------------------------+
| ``ptime``   | Tuple consisting of the program time in     |
|             | minutes, seconds, and frames.               |
+-------------+---------------------------------------------+
| ``atime``   | Tuple consisting of the absolute time in    |
|             | minutes, seconds, and frames.               |
+-------------+---------------------------------------------+
| ``catalog`` | String of 13 characters, giving the catalog |
|             | number of the CD.                           |
+-------------+---------------------------------------------+
| ``ident``   | String of 12 characters, giving the ISRC    |
|             | identification number of the recording.     |
|             | The string consists of two characters       |
|             | country code, three characters owner code,  |
|             | two characters giving the year, and five    |
|             | characters giving a serial number.          |
+-------------+---------------------------------------------+
| ``control`` | Integer giving the control bits from the CD |
|             | subcode data                                |
+-------------+---------------------------------------------+
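The layout of the ``ident`` string described in the table can be illustrated
with a small helper. This is only a sketch: split_isrc is a hypothetical
function that is not part of the cd module, and the sample string is made up
rather than a real ISRC.

```python
def split_isrc(ident):
    """Split a 12-character ident string into its four fields."""
    if len(ident) != 12:
        raise ValueError("expected 12 characters, got %d" % len(ident))
    return {
        "country": ident[0:2],   # two characters country code
        "owner": ident[2:5],     # three characters owner code
        "year": ident[5:7],      # two characters giving the year
        "serial": ident[7:12],   # five characters giving a serial number
    }


fields = split_isrc("XXABC1012345")   # made-up sample value
print(fields["country"], fields["owner"], fields["year"], fields["serial"])
```

A callback registered for the ``ident`` type could apply such a helper to the
{data} argument it receives.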
CD parser.deleteparser()~
Deletes the parser and frees the memory it was using. The object should not be
used after this call. This call is done automatically when the last reference
to the object is removed.
CD parser.parseframe(frame)~
Parses one or more frames of digital audio data from a CD such as returned by
readda. It determines which subcodes are present in the data. If these
subcodes have changed since the last frame, then parseframe executes a
callback of the appropriate type passing to it the subcode data found in the
frame. Unlike the C function, more than one frame of digital audio data can be
passed to this method.
CD parser.removecallback(type)~
Removes the callback for the given {type}.
CD parser.resetparser()~
Resets the fields of the parser used for tracking subcodes to an initial state.
resetparser should be called after the disc has been changed.
==============================================================================
*py2stdlib-cgi*
cgi~
:synopsis: Helpers for running Python scripts via the Common Gateway Interface.
Support module for Common Gateway Interface (CGI) scripts.
This module defines a number of utilities for use by CGI scripts written in
Python.
Introduction
------------
A CGI script is invoked by an HTTP server, usually to process user input
submitted through an HTML ``<FORM>`` or ``<ISINDEX>`` element.
Most often, CGI scripts live in the server's special cgi-bin directory.
The HTTP server places all sorts of information about the request (such as the
client's hostname, the requested URL, the query string, and lots of other
goodies) in the script's shell environment, executes the script, and sends the
script's output back to the client.
The script's input is connected to the client too, and sometimes the form data
is read this way; at other times the form data is passed via the "query string"
part of the URL. This module is intended to take care of the different cases
and provide a simpler interface to the Python script. It also provides a number
of utilities that help in debugging scripts, and the latest addition is support
for file uploads from a form (if your browser supports it).
The output of a CGI script should consist of two sections, separated by a blank
line. The first section contains a number of headers, telling the client what
kind of data is following. Python code to generate a minimal header section
looks like this:: >
print "Content-Type: text/html" # HTML is following
print # blank line, end of headers
<
The second section is usually HTML, which allows the client software to display
nicely formatted text with headers, in-line images, etc. Here's Python code that
prints a simple piece of HTML:: >
print "<TITLE>CGI script output</TITLE>"
print "<H1>This is my first CGI script</H1>"
print "Hello, world!"
<
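The two sections can also be assembled in one place. The following is only a
sketch: build_response is a hypothetical helper, not part of the cgi module,
and it returns the text instead of printing it so that it behaves the same
under Python 2 and Python 3.

```python
def build_response(body, content_type="text/html"):
    """Return a complete CGI response: headers, a blank line, then the body.

    build_response is a hypothetical helper, not part of the cgi module.
    """
    headers = ["Content-Type: %s" % content_type]
    # The empty string between the headers and the body produces the blank
    # line that ends the header section.
    return "\n".join(headers + ["", body])


page = build_response("<H1>This is my first CGI script</H1>\nHello, world!")
```

A script would then write ``page`` to standard output in a single step.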
Using the cgi module
--------------------
Begin by writing ``import cgi``. Do not use ``from cgi import *`` --- the
module defines all sorts of names for its own use or for backward compatibility
that you don't want in your namespace.
When you write a new script, consider adding these lines:: >
import cgitb
cgitb.enable()
<
This activates a special exception handler that will display detailed reports in
the Web browser if any errors occur. If you'd rather not show the guts of your
program to users of your script, you can have the reports saved to files
instead, with code like this:: >
import cgitb
cgitb.enable(display=0, logdir="/tmp")
<
It's very helpful to use this feature during script development. The reports
produced by cgitb (|py2stdlib-cgitb|) provide information that can save you a lot of time in
tracking down bugs. You can always remove the ``cgitb`` line later when you
have tested your script and are confident that it works correctly.
To get at submitted form data, it's best to use the FieldStorage class.
The other classes defined in this module are provided mostly for backward
compatibility. Instantiate FieldStorage exactly once, without arguments. This
reads the form contents from standard input or the environment (depending on
the value of various environment variables set according to the CGI standard).
Since it may consume standard input, it should be instantiated only once.
The FieldStorage instance can be indexed like a Python dictionary.
It allows membership testing with the in operator, and also supports
the standard dictionary method keys and the built-in function
len. Form fields containing empty strings are ignored and do not appear
in the dictionary; to keep such values, provide a true value for the optional
{keep_blank_values} keyword parameter when creating the FieldStorage
instance.
For instance, the following code (which assumes that the
Content-Type header and blank line have already been printed)
checks that the fields ``name`` and ``addr`` are both set to a non-empty
string:: >
form = cgi.FieldStorage()
if "name" not in form or "addr" not in form:
    print "<H1>Error</H1>"
    print "Please fill in the name and addr fields."
    return    # assumes this code runs inside a function
print "<p>name:", form["name"].value
print "<p>addr:", form["addr"].value
...further form processing here...
<
Here the fields, accessed through ``form[key]``, are themselves instances of
FieldStorage (or MiniFieldStorage, depending on the form
encoding). The value attribute of the instance yields the string value
of the field. The getvalue method returns this string value directly;
it also accepts an optional second argument as a default to return if the
requested key is not present.
If the submitted form data contains more than one field with the same name, the
object retrieved by ``form[key]`` is not a FieldStorage or
MiniFieldStorage instance but a list of such instances. Similarly, in
this situation, ``form.getvalue(key)`` would return a list of strings. If you
expect this possibility (when your HTML form contains multiple fields with the
same name), use the getlist function, which always returns a list of
values (so that you do not need to special-case the single item case). For
example, this code concatenates any number of username fields, separated by
commas:: >
value = form.getlist("username")
usernames = ",".join(value)
<
If a field represents an uploaded file, accessing the value via the
value attribute or the getvalue method reads the entire file into
memory as a string. This may not be what you want. You can test for an uploaded
file by testing either the filename attribute or the file
attribute. You can then read the data at leisure from the file
attribute:: >
fileitem = form["userfile"]
if fileitem.file:
    # It's an uploaded file; count lines
    linecount = 0
    while 1:
        line = fileitem.file.readline()
        if not line: break
        linecount = linecount + 1
<
If an error is encountered when obtaining the contents of an uploaded file
(for example, when the user interrupts the form submission by clicking on
a Back or Cancel button) the done attribute of the object for the
field will be set to the value -1.
The file upload draft standard entertains the possibility of uploading multiple
files from one field (using a recursive multipart/* encoding).
When this occurs, the item will be a dictionary-like FieldStorage item.
This can be determined by testing its type attribute, which should be
multipart/form-data (or perhaps another MIME type matching
multipart/*). In this case, it can be iterated over recursively
just like the top-level form object.
When a form is submitted in the "old" format (as the query string or as a single
data part of type application/x-www-form-urlencoded), the items will
actually be instances of the class MiniFieldStorage. In this case, the
list, file, and filename attributes are always ``None``.
A form submitted via POST that also has a query string will contain both
FieldStorage and MiniFieldStorage items.
Higher Level Interface
----------------------
.. versionadded:: 2.2
The previous section explains how to read CGI form data using the
FieldStorage class. This section describes a higher level interface
which was added to this class to allow one to do it in a more readable and
intuitive way. The interface doesn't make the techniques described in previous
sections obsolete --- they are still useful to process file uploads efficiently,
for example.
The interface consists of two simple methods. Using the methods you can process
form data in a generic way, without the need to worry whether only one or more
values were posted under one name.
In the previous section, you learned to write the following code anytime you
expected a user to post more than one value under one name:: >
item = form.getvalue("item")
if isinstance(item, list):
    # The user is requesting more than one item.
    pass
else:
    # The user is requesting only one item.
    pass
<
This situation is common for example when a form contains a group of multiple
checkboxes with the same name:: >
<input type="checkbox" name="item" value="1" />
<input type="checkbox" name="item" value="2" />
<
In most situations, however, there's only one form control with a particular
name in a form and then you expect and need only one value associated with this
name. So you write a script containing for example this code:: >
user = form.getvalue("user").upper()
<
The problem with the code is that you should never expect that a client will
provide valid input to your scripts. For example, if a curious user appends
another ``user=foo`` pair to the query string, then the script would crash,
because in this situation the ``getvalue("user")`` method call returns a list
instead of a string. Calling the str.upper method on a list is not valid
(since lists do not have a method of this name) and results in an
AttributeError exception.
Therefore, the appropriate way to read form data values was to always use the
code which checks whether the obtained value is a single value or a list of
values. That's annoying and leads to less readable scripts.
A more convenient approach is to use the methods getfirst and
getlist provided by this higher level interface.
FieldStorage.getfirst(name[, default])~
This method always returns only one value associated with form field {name}.
If more than one value was posted under that name, the method returns only the
first. Please note that the order in which the values are received
may vary from browser to browser and should not be counted on. [#]_ If no such
form field or value exists then the method returns the value specified by the
optional parameter {default}. This parameter defaults to ``None`` if not
specified.
FieldStorage.getlist(name)~
This method always returns a list of values associated with form field {name}.
The method returns an empty list if no such form field or value exists for
{name}. It returns a list consisting of one item if only one such value exists.
Using these methods you can write nice compact code:: >
import cgi
form = cgi.FieldStorage()
user = form.getfirst("user", "").upper() # This way it's safe.
for item in form.getlist("item"):
    do_something(item)
<
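The difference between the two methods can be tried without a running server.
The sketch below is not the FieldStorage implementation: getfirst and getlist
are re-created here as hypothetical stand-alone functions over the dictionary
of lists returned by parse_qs, purely to show the behaviour described above.
The import is written so that the example runs on Python 2 and Python 3.

```python
try:
    from urlparse import parse_qs        # Python 2
except ImportError:
    from urllib.parse import parse_qs    # Python 3


def getlist(fields, name):
    """Always return a list; it is empty when the field is missing."""
    return list(fields.get(name, []))


def getfirst(fields, name, default=None):
    """Return the first value posted under name, or default."""
    values = fields.get(name)
    return values[0] if values else default


# A client sent "user" twice; parse_qs maps each name to a list of values.
fields = parse_qs("user=alice&user=foo&item=1&item=2")

user = getfirst(fields, "user", "").upper()   # always a string, so no crash
items = getlist(fields, "item")
missing = getlist(fields, "nothing")
```

Because getfirst always yields a single value, the call to upper() cannot fail
even when the client repeats the field.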
Old classes
-----------
.. deprecated:: 2.6
These classes, present in earlier versions of the cgi (|py2stdlib-cgi|) module, are
still supported for backward compatibility. New applications should use the
FieldStorage class.
SvFormContentDict stores single value form content as dictionary; it
assumes each field name occurs in the form only once.
FormContentDict stores multiple value form content as a dictionary (the
form items are lists of values). Useful if your form contains multiple fields
with the same name.
Other classes (FormContent, InterpFormContentDict) are present
for backwards compatibility with really old applications only.
Functions
---------
These are useful if you want more control, or if you want to employ some of the
algorithms implemented in this module in other circumstances.
parse(fp[, keep_blank_values[, strict_parsing]])~
Parse a query in the environment or from a file (the file defaults to
``sys.stdin``). The {keep_blank_values} and {strict_parsing} parameters are
passed to urlparse.parse_qs unchanged.
parse_qs(qs[, keep_blank_values[, strict_parsing]])~
This function is deprecated in this module. Use urlparse.parse_qs
instead. It is maintained here only for backward compatibility.
parse_qsl(qs[, keep_blank_values[, strict_parsing]])~
This function is deprecated in this module. Use urlparse.parse_qsl
instead. It is maintained here only for backward compatibility.
parse_multipart(fp, pdict)~
Parse input of type multipart/form-data (for file uploads).
Arguments are {fp} for the input file and {pdict} for a dictionary containing
other parameters in the Content-Type header.
Returns a dictionary just like urlparse.parse_qs: keys are the field names, each
value is a list of values for that field. This is easy to use but not much good
if you are expecting megabytes to be uploaded --- in that case, use the
FieldStorage class instead, which is much more flexible.
Note that this does not parse nested multipart parts --- use
FieldStorage for that.
parse_header(string)~
Parse a MIME header (such as Content-Type) into a main value and a
dictionary of parameters.
test()~
Robust test CGI script, usable as main program. Writes minimal HTTP headers and
formats all information provided to the script in HTML form.
print_environ()~
Format the shell environment in HTML.
print_form(form)~
Format a form in HTML.
print_directory()~
Format the current directory in HTML.
print_environ_usage()~
Print a list of useful (used by CGI) environment variables in HTML.
escape(s[, quote])~
Convert the characters ``'&'``, ``'<'`` and ``'>'`` in string {s} to HTML-safe
sequences. Use this if you need to display text that might contain such
characters in HTML. If the optional flag {quote} is true, the quotation mark
character (``'"'``) is also translated; this helps for inclusion in an HTML
attribute value, as in ``<A HREF="...">``. If the value to be quoted might
include single- or double-quote characters, or both, consider using the
quoteattr function in the xml.sax.saxutils (|py2stdlib-xml.sax.saxutils|) module instead.
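The replacements described for escape can be written out by hand. This is a
sketch of the described behaviour only; escape_html is a hypothetical name and
the code is not taken from the module's source.

```python
def escape_html(s, quote=False):
    """Sketch of the replacements described for escape(); hypothetical name."""
    # "&" must be replaced first, otherwise the ampersands introduced by
    # the later replacements would be escaped a second time.
    s = s.replace("&", "&amp;")
    s = s.replace("<", "&lt;")
    s = s.replace(">", "&gt;")
    if quote:
        s = s.replace('"', "&quot;")
    return s


text = escape_html("<b>Tom & Jerry</b>")
attr = escape_html('say "hi"', quote=True)
```

Note that single quotes are left untouched, which is why the text above
points to quoteattr for attribute values that may contain them.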
Caring about security
---------------------
There's one important rule: if you invoke an external program (via the
os.system or os.popen functions, or others with similar
functionality), make very sure you don't pass arbitrary strings received from
the client to the shell. This is a well-known security hole whereby clever
hackers anywhere on the Web can exploit a gullible CGI script to invoke
arbitrary shell commands. Even parts of the URL or field names cannot be
trusted, since the request doesn't have to come from your form!
To be on the safe side, if you must pass a string gotten from a form to a shell
command, you should make sure the string contains only alphanumeric characters,
dashes, underscores, and periods.
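One way to apply this rule is a whitelist check before the value gets anywhere
near a shell. The following is a minimal sketch; is_shell_safe is a
hypothetical helper name.

```python
import re

# Only alphanumerics, dashes, underscores and periods, as recommended above.
# \A and \Z anchor the whole string, so a trailing newline is rejected too.
_SAFE = re.compile(r"\A[A-Za-z0-9._-]+\Z")


def is_shell_safe(value):
    """Return True if value contains only the characters listed above."""
    return bool(_SAFE.match(value))


ok = is_shell_safe("report-2010.txt")
bad = is_shell_safe("x; rm -rf /")
```

Even a value that passes this check can start with a dash and be read as an
option by the program that receives it, so the check is a minimum rather than
a complete defence.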
Installing your CGI script on a Unix system
-------------------------------------------
Read the documentation for your HTTP server and check with your local system
administrator to find the directory where CGI scripts should be installed;
usually this is in a directory cgi-bin in the server tree.
Make sure that your script is readable and executable by "others"; the Unix file
mode should be ``0755`` octal (use ``chmod 0755 filename``). Make sure that the
first line of the script contains ``#!`` starting in column 1 followed by the
pathname of the Python interpreter, for instance:: >
#!/usr/local/bin/python
<
Make sure the Python interpreter exists and is executable by "others".
Make sure that any files your script needs to read or write are readable or
writable, respectively, by "others" --- their mode should be ``0644`` for
readable and ``0666`` for writable. This is because, for security reasons, the
HTTP server executes your script as user "nobody", without any special
privileges. It can only read (write, execute) files that everybody can read
(write, execute). The current directory at execution time is also different (it
is usually the server's cgi-bin directory) and the set of environment variables
is also different from what you get when you log in. In particular, don't count
on the shell's search path for executables (PATH) or the Python module
search path (PYTHONPATH) to be set to anything interesting.
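The file modes mentioned above can also be set and checked from Python. The
sketch below assumes a Unix system and uses a temporary file as a stand-in for
a real script; the ``0o755`` spelling of the octal literal is accepted by
Python 2.6 and later.

```python
import os
import stat
import tempfile

# A throwaway file standing in for a CGI script (hypothetical path).
fd, script_path = tempfile.mkstemp(suffix=".py")
os.close(fd)

os.chmod(script_path, 0o755)          # same effect as "chmod 0755 filename"
mode = stat.S_IMODE(os.stat(script_path).st_mode)

# "Others" need read and execute permission for the server to run the script.
others_can_run = bool(mode & stat.S_IROTH) and bool(mode & stat.S_IXOTH)

os.remove(script_path)
```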
If you need to load modules from a directory which is not on Python's default
module search path, you can change the path in your script, before importing
other modules. For example:: >
import sys
sys.path.insert(0, "/usr/home/joe/lib/python")
sys.path.insert(0, "/usr/local/lib/python")
<
(This way, the directory inserted last will be searched first!)
Instructions for non-Unix systems will vary; check your HTTP server's
documentation (it will usually have a section on CGI scripts).
Testing your CGI script
-----------------------
Unfortunately, a CGI script will generally not run when you try it from the
command line, and a script that works perfectly from the command line may fail
mysteriously when run from the server. There's one reason why you should still
test your script from the command line: if it contains a syntax error, the
Python interpreter won't execute it at all, and the HTTP server will most likely
send a cryptic error to the client.
Assuming your script has no syntax errors, yet it does not work, you have no
choice but to read the next section.
Debugging CGI scripts
---------------------
First of all, check for trivial installation errors --- reading the section
above on installing your CGI script carefully can save you a lot of time. If
you wonder whether you have understood the installation procedure correctly, try
installing a copy of this module file (cgi.py) as a CGI script. When
invoked as a script, the file will dump its environment and the contents of the
form in HTML form. Give it the right mode etc, and send it a request. If it's
installed in the standard cgi-bin directory, it should be possible to
send it a request by entering a URL into your browser of the form:: >
http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home
<
If this gives an error of type 404, the server cannot find the script -- perhaps
you need to install it in a different directory. If it gives another error,
there's an installation problem that you should fix before trying to go any
further. If you get a nicely formatted listing of the environment and form
content (in this example, the fields should be listed as "addr" with value "At
Home" and "name" with value "Joe Blow"), the cgi.py script has been
installed correctly. If you follow the same procedure for your own script, you
should now be able to debug it.
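To see why the fields in this example come out as "Joe Blow" and "At Home",
the query string can be decoded by hand. This sketch uses parse_qs, the
function that this module's own parse_qs is deprecated in favour of; the import
is written to work on Python 2 and Python 3.

```python
try:
    from urlparse import parse_qs        # Python 2
except ImportError:
    from urllib.parse import parse_qs    # Python 3

query = "name=Joe+Blow&addr=At+Home"     # the part of the URL after "?"
fields = parse_qs(query)

# Each "+" is decoded to a space, and every field maps to a list of values.
name = fields["name"][0]
addr = fields["addr"][0]
```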
The next step could be to call the cgi (|py2stdlib-cgi|) module's test function
from your script: replace its main code with the single statement:: >
cgi.test()
<
This should produce the same results as those gotten from installing the
cgi.py file itself.
When an ordinary Python script raises an unhandled exception (for whatever
reason: a typo in a module name, a file that can't be opened, etc.), the
Python interpreter prints a nice traceback and exits. While the Python
interpreter will still do this when your CGI script raises an exception, most
likely the traceback will end up in one of the HTTP server's log files, or be
discarded altogether.
Fortunately, once you have managed to get your script to execute {some} code,
you can easily send tracebacks to the Web browser using the cgitb (|py2stdlib-cgitb|) module.
If you haven't done so already, just add the lines:: >
import cgitb
cgitb.enable()
<
to the top of your script. Then try running it again; when a problem occurs,
you should see a detailed report that will likely make apparent the cause of the
crash.
If you suspect that there may be a problem in importing the cgitb (|py2stdlib-cgitb|) module,
you can use an even more robust approach (which only uses built-in modules):: >
import sys
sys.stderr = sys.stdout
print "Content-Type: text/plain"
print
...your code here...
<
This relies on the Python interpreter to print the traceback. The content type
of the output is set to plain text, which disables all HTML processing. If your
script works, the raw HTML will be displayed by your client. If it raises an
exception, most likely after the first two lines have been printed, a traceback
will be displayed. Because no HTML interpretation is going on, the traceback
will be readable.
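A variation on the same idea is to catch the exception yourself and put the
traceback text into the plain-text response. This is a sketch only: run_plain
and broken are hypothetical names, and the response is returned as a string
rather than printed so the example runs unchanged on Python 2 and Python 3.

```python
import traceback


def run_plain(main):
    """Run main() and return a plain-text response, traceback included.

    run_plain and the callables passed to it are hypothetical names.
    """
    lines = ["Content-Type: text/plain", ""]
    try:
        lines.append(main())
    except Exception:
        # format_exc() returns the text the interpreter would have printed.
        lines.append(traceback.format_exc())
    return "\n".join(lines)


def broken():
    return 1 / 0


report = run_plain(broken)
```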
Common problems and solutions
-----------------------------
* Most HTTP servers buffer the output from CGI scripts until the script is
completed. This means that it is not possible to display a progress report on
the client's display while the script is running.
* Check the installation instructions above.
* Check the HTTP server's log files. (``tail -f logfile`` in a separate window
may be useful!)
* Always check a script for syntax errors first, by doing something like
``python script.py``.
* If your script does not have any syntax errors, try adding ``import cgitb;
cgitb.enable()`` to the top of the script.
* When invoking external programs, make sure they can be found. Usually, this
means using absolute path names --- PATH is usually not set to a very
useful value in a CGI script.
* When reading or writing external files, make sure they can be read or written
by the userid under which your CGI script will be running: this is typically the
userid under which the web server is running, or some explicitly specified
userid for a web server's ``suexec`` feature.
* Don't try to give a CGI script a set-uid mode. This doesn't work on most
systems, and is a security liability as well.
.. rubric:: Footnotes
.. [#] Note that some recent versions of the HTML specification do state what order the
field values should be supplied in, but knowing whether a request was
received from a conforming browser, or even from a browser at all, is tedious
and error-prone.
==============================================================================
*py2stdlib-cgihttpserver*
CGIHTTPServer~
:synopsis: This module provides a request handler for HTTP servers which can run CGI
scripts.
.. note::
The CGIHTTPServer (|py2stdlib-cgihttpserver|) module has been merged into http.server in
Python 3.0. The 2to3 tool will automatically adapt imports when
converting your sources to 3.0.
The CGIHTTPServer (|py2stdlib-cgihttpserver|) module defines a request-handler class, interface
compatible with BaseHTTPServer.BaseHTTPRequestHandler and inherits
behavior from SimpleHTTPServer.SimpleHTTPRequestHandler but can also
run CGI scripts.
.. note::
This module can run CGI scripts on Unix and Windows systems.
.. note::
CGI scripts run by the CGIHTTPRequestHandler class cannot execute
redirects (HTTP code 302), because code 200 (script output follows) is sent
prior to execution of the CGI script. This pre-empts the status code.
The CGIHTTPServer (|py2stdlib-cgihttpserver|) module defines the following class:
CGIHTTPRequestHandler(request, client_address, server)~
This class is used to serve either files or output of CGI scripts from the
current directory and below. Note that mapping HTTP hierarchic structure to
local directory structure is exactly as in
SimpleHTTPServer.SimpleHTTPRequestHandler.
The class will, however, run the CGI script, instead of serving it as a file, if
it guesses it to be a CGI script. Only directory-based CGI is used --- the
other common server configuration is to treat special extensions as denoting CGI
scripts.
The do_GET and do_HEAD functions are modified to run CGI scripts
and serve the output, instead of serving files, if the request leads to
somewhere below the ``cgi_directories`` path.
The CGIHTTPRequestHandler defines the following data member:
cgi_directories~
This defaults to ``['/cgi-bin', '/htbin']`` and describes directories to
treat as containing CGI scripts.
The CGIHTTPRequestHandler defines the following methods:
do_POST()~
This method serves the ``'POST'`` request type, only allowed for CGI
scripts. Error 501, "Can only POST to CGI scripts", is output when trying
to POST to a non-CGI url.
Note that CGI scripts will be run with UID of user nobody, for security reasons.
Problems with the CGI script will be translated to error 403.
For example usage, see the implementation of the test function.
.. seealso::
Module BaseHTTPServer (|py2stdlib-basehttpserver|)
Base class implementation for Web server and request handler.
==============================================================================
*py2stdlib-cgitb*
cgitb~
:synopsis: Configurable traceback handler for CGI scripts.
.. versionadded:: 2.2
The cgitb (|py2stdlib-cgitb|) module provides a special exception handler for Python scripts.
(Its name is a bit misleading. It was originally designed to display extensive
traceback information in HTML for CGI scripts. It was later generalized to also
display this information in plain text.) After this module is activated, if an
uncaught exception occurs, a detailed, formatted report will be displayed. The
report includes a traceback showing excerpts of the source code for each level,
as well as the values of the arguments and local variables to currently running
functions, to help you debug the problem. Optionally, you can save this
information to a file instead of sending it to the browser.
To enable this feature, simply add this to the top of your CGI script:: >
import cgitb
cgitb.enable()
<
The options to the enable function control whether the report is
displayed in the browser and whether the report is logged to a file for later
analysis.
enable([display[, logdir[, context[, format]]]])~
This function causes the cgitb (|py2stdlib-cgitb|) module to take over the interpreter's
default handling for exceptions by setting the value of sys.excepthook.
The optional argument {display} defaults to ``1`` and can be set to ``0`` to
suppress sending the traceback to the browser. If the argument {logdir} is
present, the traceback reports are written to files. The value of {logdir}
should be a directory where these files will be placed. The optional argument
{context} is the number of lines of context to display around the current line
of source code in the traceback; this defaults to ``5``. If the optional
argument {format} is ``"html"``, the output is formatted as HTML. Any other
value forces plain text output. The default value is ``"html"``.
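The mechanism behind enable, replacing sys.excepthook, can be illustrated
with a much smaller hook. The sketch below is not how cgitb itself is written:
make_hook is a hypothetical helper that writes a plain traceback to a file in
a log directory, loosely mirroring what the {logdir} argument is described as
doing.

```python
import os
import sys
import tempfile
import traceback


def make_hook(logdir):
    """Build an excepthook that writes each report to a file in logdir."""
    def hook(exc_type, exc_value, tb):
        text = "".join(traceback.format_exception(exc_type, exc_value, tb))
        fd, path = tempfile.mkstemp(suffix=".txt", dir=logdir)
        with os.fdopen(fd, "w") as report:
            report.write(text)
        return path
    return hook


logdir = tempfile.mkdtemp()
hook = make_hook(logdir)
# sys.excepthook = hook   # uncomment to install it for uncaught exceptions

try:
    {}["missing"]
except KeyError:
    saved = hook(*sys.exc_info())   # a hook can also be called directly

with open(saved) as report:
    logged = report.read()
```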
handler([info])~
This function handles an exception using the default settings (that is, show a
report in the browser, but don't log to a file). This can be used when you've
caught an exception and want to report it using cgitb (|py2stdlib-cgitb|). The optional
{info} argument should be a 3-tuple containing an exception type, exception
value, and traceback object, exactly like the tuple returned by
sys.exc_info. If the {info} argument is not supplied, the current
exception is obtained from sys.exc_info.
==============================================================================
*py2stdlib-chunk*
chunk~
:synopsis: Module to read IFF chunks.
This module provides an interface for reading files that use EA IFF 85 chunks.
[#]_ This format is used in at least the Audio Interchange File Format
(AIFF/AIFF-C) and the Real Media File Format (RMFF). The WAVE audio file format
is closely related and can also be read using this module.
A chunk has the following structure:
+---------+--------+-------------------------------+
| Offset  | Length | Contents                      |
+=========+========+===============================+
| 0       | 4      | Chunk ID                      |
+---------+--------+-------------------------------+
| 4       | 4      | Size of chunk in big-endian   |
|         |        | byte order, not including the |
|         |        | header                        |
+---------+--------+-------------------------------+
| 8       | {n}    | Data bytes, where {n} is the  |
|         |        | size given in the preceding   |
|         |        | field                         |
+---------+--------+-------------------------------+
| 8 + {n} | 0 or 1 | Pad byte needed if {n} is odd |
|         |        | and chunk alignment is used   |
+---------+--------+-------------------------------+
The ID is a 4-byte string which identifies the type of chunk.
The size field (a 32-bit value, encoded using big-endian byte order) gives the
size of the chunk data, not including the 8-byte header.
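The layout in the table can be decoded directly with the struct module. The
following sketch does not use the Chunk class: read_chunk is a hypothetical
helper, and the chunk IDs and data are made up for the example.

```python
import io
import struct


def read_chunk(stream, align=True):
    """Read one chunk laid out as in the table; return (chunk_id, data).

    Returns None at end of file. read_chunk is a hypothetical helper.
    """
    header = stream.read(8)
    if len(header) < 8:
        return None
    # ">4sL": a 4-byte ID followed by a big-endian unsigned 32-bit size.
    chunk_id, size = struct.unpack(">4sL", header)
    data = stream.read(size)
    if align and size % 2:
        stream.read(1)          # skip the pad byte after odd-sized data
    return chunk_id, data


# Two chunks: the first has 3 data bytes, so one pad byte follows it.
raw = (b"NAME" + struct.pack(">L", 3) + b"abc" + b"\x00" +
       b"BODY" + struct.pack(">L", 4) + b"wxyz")
stream = io.BytesIO(raw)
first = read_chunk(stream)
second = read_chunk(stream)
third = read_chunk(stream)
```

Without the pad-byte step the second header would be read one byte too early,
which is the situation the alignment rule exists to prevent.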
Usually an IFF-type file consists of one or more chunks. The proposed usage of
the Chunk class defined here is to instantiate an instance at the start
of each chunk and read from the instance until it reaches the end, after which a
new instance can be instantiated. At the end of the file, creating a new
instance will fail with an EOFError exception.
Chunk(file[, align, bigendian, inclheader])~
Class which represents a chunk. The {file} argument is expected to be a
file-like object. An instance of this class is specifically allowed. The
only method that is needed is read. If the methods seek and
tell are present and don't raise an exception, they are also used.
If these methods are present and raise an exception, they are expected to not
have altered the object. If the optional argument {align} is true, chunks
are assumed to be aligned on 2-byte boundaries. If {align} is false, no
alignment is assumed. The default value is true. If the optional argument
{bigendian} is false, the chunk size is assumed to be in little-endian order.
This is needed for WAVE audio files. The default value is true. If the
optional argument {inclheader} is true, the size given in the chunk header
includes the size of the header. The default value is false.
A Chunk object supports the following methods:
getname()~
Returns the name (ID) of the chunk. This is the first 4 bytes of the
chunk.
getsize()~
Returns the size of the chunk.
close()~
Close and skip to the end of the chunk. This does not close the
underlying file.
The remaining methods will raise IOError if called after the
close method has been called.
isatty()~
Returns ``False``.
seek(pos[, whence])~
Set the chunk's current position. The {whence} argument is optional and
defaults to ``0`` (absolute file positioning); other values are ``1``
(seek relative to the current position) and ``2`` (seek relative to the
file's end). There is no return value. If the underlying file does not
allow seek, only forward seeks are allowed.
tell()~
Return the current position into the chunk.
read([size])~
Read at most {size} bytes from the chunk (less if the read hits the end of
the chunk before obtaining {size} bytes). If the {size} argument is
negative or omitted, read all data until the end of the chunk. The bytes
are returned as a string object. An empty string is returned when the end
of the chunk is encountered immediately.
skip()~
Skip to the end of the chunk. All further calls to read for the
chunk will return ``''``. If you are not interested in the contents of
the chunk, this method should be called so that the file points to the
start of the next chunk.
.. rubric:: Footnotes
.. [#] "EA IFF 85" Standard for Interchange Format Files, Jerry Morrison, Electronic
Arts, January 1985.
==============================================================================
*py2stdlib-cmath*
cmath~
:synopsis: Mathematical functions for complex numbers.
This module is always available. It provides access to mathematical functions
for complex numbers. The functions in this module accept integers,
floating-point numbers or complex numbers as arguments. They will also accept
any Python object that has either a __complex__ or a __float__
method: these methods are used to convert the object to a complex or
floating-point number, respectively, and the function is then applied to the
result of the conversion.
.. note::
On platforms with hardware and system-level support for signed
zeros, functions involving branch cuts are continuous on {both}
sides of the branch cut: the sign of the zero distinguishes one
side of the branch cut from the other. On platforms that do not
support signed zeros the continuity is as specified below.
Conversions to and from polar coordinates
-----------------------------------------
A Python complex number ``z`` is stored internally using {rectangular}
or {Cartesian} coordinates. It is completely determined by its {real
part} ``z.real`` and its {imaginary part} ``z.imag``. In other
words:: >
z == z.real + z.imag*1j
<
{Polar coordinates} give an alternative way to represent a complex
number. In polar coordinates, a complex number {z} is defined by the
modulus {r} and the phase angle {phi}. The modulus {r} is the distance
from {z} to the origin, while the phase {phi} is the counterclockwise
angle, measured in radians, from the positive x-axis to the line
segment that joins the origin to {z}.
The following functions can be used to convert from the native
rectangular coordinates to polar coordinates and back.
phase(x)~
Return the phase of {x} (also known as the {argument} of {x}), as a
float. ``phase(x)`` is equivalent to ``math.atan2(x.imag,
x.real)``. The result lies in the range [-π, π], and the branch
cut for this operation lies along the negative real axis,
continuous from above. On systems with support for signed zeros
(which includes most systems in current use), this means that the
sign of the result is the same as the sign of ``x.imag``, even when
``x.imag`` is zero:: >
>>> phase(complex(-1.0, 0.0))
3.1415926535897931
>>> phase(complex(-1.0, -0.0))
-3.1415926535897931
<
.. versionadded:: 2.6
.. note::
The modulus (absolute value) of a complex number {x} can be
computed using the built-in abs function. There is no
separate cmath (|py2stdlib-cmath|) module function for this operation.
polar(x)~
Return the representation of {x} in polar coordinates. Returns a
pair ``(r, phi)`` where {r} is the modulus of {x} and {phi} is the
phase of {x}. ``polar(x)`` is equivalent to ``(abs(x),
phase(x))``.
.. versionadded:: 2.6
rect(r, phi)~
Return the complex number {x} with polar coordinates {r} and {phi}.
Equivalent to ``r * (math.cos(phi) + math.sin(phi)*1j)``.
.. versionadded:: 2.6
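A minimal sketch of the conversions described above. The sample value ``z``
and the variable names are chosen only for illustration; the comparison uses
a tolerance because the round trip through polar form is subject to
floating-point rounding.

```python
import cmath
import math

z = complex(-1.0, 1.0)

# polar() returns the pair (modulus, phase).
r, phi = cmath.polar(z)

# The same two values obtained separately, as the text describes.
modulus = abs(z)
angle = math.atan2(z.imag, z.real)

# rect() goes back to rectangular form; expect rounding error, not equality.
back = cmath.rect(r, phi)
round_trip_error = abs(back - z)
```

Here ``r`` is close to ``math.sqrt(2)`` and ``phi`` is close to ``3*pi/4``.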
Power and logarithmic functions
-------------------------------
exp(x)~
Return the exponential value ``e**x``.
log(x[, base])~
Returns the logarithm of {x} to the given {base}. If the {base} is not
specified, returns the natural logarithm of {x}. There is one branch cut, from 0
along the negative real axis to -∞, continuous from above.
.. versionchanged:: 2.4
{base} argument added.
log10(x)~
Return the base-10 logarithm of {x}. This has the same branch cut as
log.
sqrt(x)~
Return the square root of {x}. This has the same branch cut as log.
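The branch cut shared by log, log10 and sqrt can be observed by approaching
the negative real axis from either side. The sketch below assumes a platform
with signed-zero support (see the note at the top of this module); the
variable names are illustrative.

```python
import cmath

above = complex(-4.0, 0.0)    # on the cut, imaginary part +0.0
below = complex(-4.0, -0.0)   # on the cut, imaginary part -0.0

# The sign of the zero selects the side of the cut.
sqrt_above = cmath.sqrt(above)
sqrt_below = cmath.sqrt(below)
log_above = cmath.log(above)
log_below = cmath.log(below)

# The optional base argument of log().
log_base_2 = cmath.log(8, 2)
```

With signed zeros, ``sqrt_above`` and ``sqrt_below`` have imaginary parts of
opposite sign, and the imaginary parts of the two logarithms are close to
``pi`` and ``-pi`` respectively.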
Trigonometric functions
-----------------------
acos(x)~
Return the arc cosine of {x}. There are two branch cuts: One extends right from
1 along the real axis to ∞, continuous from below. The other extends left from
-1 along the real axis to -∞, continuous from above.
asin(x)~
Return the arc sine of {x}. This has the same branch cuts as acos.
atan(x)~
Return the arc tangent of {x}. There are two branch cuts: One extends from
``1j`` along the imaginary axis to ``∞j``, continuous from the right. The
other extends from ``-1j`` along the imaginary axis to ``-∞j``, continuous
from the left.
.. versionchanged:: 2.6
direction of continuity of upper cut reversed
cos(x)~
Return the cosine of {x}.
sin(x)~
Return the sine of {x}.
tan(x)~
Return the tangent of {x}.
Hyperbolic functions
--------------------
acosh(x)~
Return the hyperbolic arc cosine of {x}. There is one branch cut, extending left
from 1 along the real axis to -∞, continuous from above.
asinh(x)~
Return the hyperbolic arc sine of {x}. There are two branch cuts:
One extends from ``1j`` along the imaginary axis to ``∞j``,
continuous from the right. The other extends from ``-1j`` along
the imaginary axis to ``-∞j``, continuous from the left.
.. versionchanged:: 2.6
branch cuts moved to match those recommended by the C99 standard
atanh(x)~
Return the hyperbolic arc tangent of {x}. There are two branch cuts: One
extends from ``1`` along the real axis to ``∞``, continuous from below. The
other extends from ``-1`` along the real axis to ``-∞``, continuous from
above.
.. versionchanged:: 2.6
direction of continuity of right cut reversed
cosh(x)~
Return the hyperbolic cosine of {x}.
sinh(x)~
Return the hyperbolic sine of {x}.
tanh(x)~
Return the hyperbolic tangent of {x}.
Classification functions
------------------------
isinf(x)~
Return {True} if the real or the imaginary part of x is positive
or negative infinity.
.. versionadded:: 2.6
isnan(x)~
Return {True} if the real or imaginary part of x is not a number (NaN).
.. versionadded:: 2.6
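As a short illustration, both classification functions look at the real and
the imaginary part. The sample values are arbitrary.

```python
import cmath

inf = float("inf")
nan = float("nan")

samples = [
    complex(1.0, 2.0),    # finite
    complex(inf, 0.0),    # infinite real part
    complex(0.0, -inf),   # infinite imaginary part
    complex(nan, 1.0),    # NaN real part
    complex(inf, nan),    # both conditions at once
]

# One (isinf, isnan) pair per sample.
flags = [(cmath.isinf(z), cmath.isnan(z)) for z in samples]
```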
Constants
---------
pi~
The mathematical constant {π}, as a float.
e~
The mathematical constant {e}, as a float.
Note that the selection of functions is similar, but not identical, to that in
module math (|py2stdlib-math|). The reason for having two modules is that some users aren't
interested in complex numbers, and perhaps don't even know what they are. They
would rather have ``math.sqrt(-1)`` raise an exception than return a complex
number. Also note that the functions defined in cmath (|py2stdlib-cmath|) always return a
complex number, even if the answer can be expressed as a real number (in which
case the complex number has an imaginary part of zero).
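The difference between the two modules can be sketched as follows; the
variable names are illustrative.

```python
import cmath
import math

# math refuses a negative argument to sqrt ...
try:
    math.sqrt(-1)
    math_outcome = "no error"
except ValueError:
    math_outcome = "ValueError"

# ... while cmath returns a complex result.
complex_root = cmath.sqrt(-1)

# cmath returns a complex number even when the answer is real.
real_answer = cmath.sqrt(4)
```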
A note on branch cuts: They are curves along which the given function fails to
be continuous. They are a necessary feature of many complex functions. It is
assumed that if you need to compute with complex functions, you will understand
about branch cuts. Consult almost any (not too elementary) book on complex
variables for enlightenment. For information on the proper choice of branch
cuts for numerical purposes, a good reference is the following:
.. seealso::
Kahan, W: Branch cuts for complex elementary functions; or, Much ado about
nothing's sign bit. In Iserles, A., and Powell, M. (eds.), The state of the art
in numerical analysis. Clarendon Press (1987) pp165-211.
==============================================================================
*py2stdlib-cmd*
cmd~
:synopsis: Build line-oriented command interpreters.
The Cmd class provides a simple framework for writing line-oriented
command interpreters. These are often useful for test harnesses, administrative
tools, and prototypes that will later be wrapped in a more sophisticated
interface.
Cmd([completekey[, stdin[, stdout]]])~
A Cmd instance or subclass instance is a line-oriented interpreter
framework. There is no good reason to instantiate Cmd itself; rather,
it's useful as a superclass of an interpreter class you define yourself in order
to inherit Cmd's methods and encapsulate action methods.
The optional argument {completekey} is the readline (|py2stdlib-readline|) name of a completion
key; it defaults to Tab. If {completekey} is not None and
readline (|py2stdlib-readline|) is available, command completion is done automatically.
The optional arguments {stdin} and {stdout} specify the input and output file
objects that the Cmd instance or subclass instance will use for input and
output. If not specified, they will default to sys.stdin and
sys.stdout.
If you want a given {stdin} to be used, make sure to set the instance's
use_rawinput attribute to ``False``, otherwise {stdin} will be
ignored.
.. versionchanged:: 2.3
The {stdin} and {stdout} parameters were added.
Cmd Objects
-----------
A Cmd instance has the following methods:
Cmd.cmdloop([intro])~
Repeatedly issue a prompt, accept input, parse an initial prefix off the
received input, and dispatch to action methods, passing them the remainder of
the line as argument.
The optional argument is a banner or intro string to be issued before the first
prompt (this overrides the intro class member).
If the readline (|py2stdlib-readline|) module is loaded, input will automatically inherit
bash-like history-list editing (e.g. Control-P scrolls back
to the last command, Control-N forward to the next one, Control-F
moves the cursor to the right non-destructively, Control-B moves the
cursor to the left non-destructively, etc.).
An end-of-file on input is passed back as the string ``'EOF'``.
An interpreter instance will recognize a command name ``foo`` if and only if it
has a method do_foo. As a special case, a line beginning with the
character ``'?'`` is dispatched to the method do_help. As another
special case, a line beginning with the character ``'!'`` is dispatched to the
method do_shell (if such a method is defined).
This method will return when the postcmd method returns a true value.
The {stop} argument to postcmd is the return value from the command's
corresponding do_* method.
If completion is enabled, command names are completed automatically, and
command arguments are completed by calling complete_foo with
arguments {text}, {line}, {begidx}, and {endidx}. {text} is the string prefix
we are attempting to match: all returned matches must begin with it. {line} is
the current input line with leading whitespace removed; {begidx} and {endidx}
are the beginning and ending indexes of the prefix text, which could be used to
provide different completion depending upon which position the argument is in.
All subclasses of Cmd inherit a predefined do_help. This
method, called with an argument ``'bar'``, invokes the corresponding method
help_bar. With no argument, do_help lists all available help
topics (that is, all commands with corresponding help_* methods), and
also lists any undocumented commands.
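The dispatch rules above can be sketched with a small subclass. The class
name ``GreeterShell``, its commands and the scripted input are made up for
this example. Input is fed from a string buffer, so use_rawinput is set to
false as the constructor description requires.

```python
import cmd

try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO        # Python 3


class GreeterShell(cmd.Cmd):
    prompt = "(greeter) "
    use_rawinput = False  # needed so that the stdin argument is honoured

    def do_greet(self, arg):
        # Called for input lines of the form "greet ...".
        self.stdout.write("Hello, %s!\n" % (arg or "world"))

    def help_greet(self):
        # Called by the inherited do_help for "help greet" or "?greet".
        self.stdout.write("greet [NAME]: print a greeting\n")

    def complete_greet(self, text, line, begidx, endidx):
        # Every returned match must begin with text.
        names = ["alice", "bob", "carol"]
        return [n for n in names if n.startswith(text)]

    def do_EOF(self, arg):
        # End of input arrives as the command 'EOF'; a true value stops the loop.
        return True


script = StringIO("greet alice\nhelp greet\n")
output = StringIO()
shell = GreeterShell(stdin=script, stdout=output)
shell.cmdloop(intro="demo session")
transcript = output.getvalue()
```

After the run, ``transcript`` holds the intro, the prompts and the output of
both commands.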
Cmd.onecmd(str)~
Interpret the argument as though it had been typed in response to the prompt.
This may be overridden, but should not normally need to be; see the
precmd and postcmd methods for useful execution hooks. The
return value is a flag indicating whether interpretation of commands by the
interpreter should stop. If there is a do_* method for the command
{str}, the return value of that method is returned, otherwise the return value
from the default method is returned.
Cmd.emptyline()~
Method called when an empty line is entered in response to the prompt. If this
method is not overridden, it repeats the last nonempty command entered.
Cmd.default(line)~
Method called on an input line when the command prefix is not recognized. If
this method is not overridden, it prints an error message and returns.
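A hedged sketch of how onecmd, emptyline and default interact when they are
not overridden. The class ``EchoShell`` and its two commands are hypothetical.

```python
import cmd

try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO        # Python 3


class EchoShell(cmd.Cmd):
    def do_echo(self, arg):
        self.stdout.write(arg + "\n")   # returns None: keep going

    def do_stop(self, arg):
        return True                     # true value: the loop should stop


out = StringIO()
shell = EchoShell(stdout=out)

r_echo = shell.onecmd("echo hi")    # dispatched to do_echo
r_empty = shell.onecmd("")          # emptyline() repeats "echo hi"
r_unknown = shell.onecmd("bogus 1") # no do_bogus: default() reports an error
r_stop = shell.onecmd("stop")       # return value of do_stop

printed = out.getvalue()
```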
Cmd.completedefault(text, line, begidx, endidx)~
Method called to complete an input line when no command-specific
complete_* method is available. By default, it returns an empty list.
Cmd.precmd(line)~
Hook method executed just before the command line {line} is interpreted, but
after the input prompt is generated and issued. This method is a stub in
Cmd; it exists to be overridden by subclasses. The return value is
used as the command which will be executed by the onecmd method; the
precmd implementation may re-write the command or simply return {line}
unchanged.
Cmd.postcmd(stop, line)~
Hook method executed just after a command dispatch is finished. This method is
a stub in Cmd; it exists to be overridden by subclasses. {line} is the
command line which was executed, and {stop} is a flag which indicates whether
execution will be terminated after the call to postcmd; this will be the
return value of the onecmd method. The return value of this method will
be used as the new value for the internal flag which corresponds to {stop};
returning false will cause interpretation to continue.
Cmd.preloop()~
Hook method executed once when cmdloop is called. This method is a stub
in Cmd; it exists to be overridden by subclasses.
Cmd.postloop()~
Hook method executed once when cmdloop is about to return. This method
is a stub in Cmd; it exists to be overridden by subclasses.
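The four hooks can be combined as in the following sketch. The class
``CountingShell``, the ``executed`` attribute and the scripted input are
assumptions made for the example, not part of the module.

```python
import cmd

try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO        # Python 3


class CountingShell(cmd.Cmd):
    prompt = ""
    use_rawinput = False

    def preloop(self):
        self.executed = []              # runs once, before the first prompt

    def precmd(self, line):
        return line.strip().lower()     # rewrite the line before dispatch

    def do_ping(self, arg):
        self.stdout.write("pong\n")

    def do_quit(self, arg):
        return True

    def postcmd(self, stop, line):
        self.executed.append(line)      # line is the rewritten command
        return stop                     # pass the stop flag through unchanged

    def postloop(self):
        self.stdout.write("ran %d commands\n" % len(self.executed))


out = StringIO()
shell = CountingShell(stdin=StringIO("PING\nQuit\nping\n"), stdout=out)
shell.cmdloop()
printed = out.getvalue()
```

The third input line is never read, because postcmd returns a true value
after ``quit``.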
Instances of Cmd subclasses have some public instance variables:
Cmd.prompt~
The prompt issued to solicit input.
Cmd.identchars~
The string of characters accepted for the command prefix.
Cmd.lastcmd~
The last nonempty command prefix seen.
Cmd.intro~
A string to issue as an intro or banner. May be overridden by giving the
cmdloop method an argument.
Cmd.doc_header~
The header to issue if the help output has a section for documented commands.
Cmd.misc_header~
The header to issue if the help output has a section for miscellaneous help
topics (that is, there are help_* methods without corresponding
do_* methods).
Cmd.undoc_header~
The header to issue if the help output has a section for undocumented commands
(that is, there are do_* methods without corresponding help_*
methods).
Cmd.ruler~
The character used to draw separator lines under the help-message headers. If
empty, no ruler line is drawn. It defaults to ``'='``.
Cmd.use_rawinput~
A flag, defaulting to true. If true, cmdloop uses raw_input to
display a prompt and read the next command; if false, sys.stdout.write
and sys.stdin.readline are used. (This means that by importing
readline (|py2stdlib-readline|), on systems that support it, the interpreter will automatically
support Emacs-like line editing and command-history keystrokes.)
==============================================================================
*py2stdlib-code*
code~
:synopsis: Facilities to implement read-eval-print loops.
The ``code`` module provides facilities to implement read-eval-print loops in
Python. Two classes and convenience functions are included which can be used to
build applications which provide an interactive interpreter prompt.
InteractiveInterpreter([locals])~
This class deals with parsing and interpreter state (the user's namespace); it
does not deal with input buffering or prompting or input file naming (the
filename is always passed in explicitly). The optional {locals} argument
specifies the dictionary in which code will be executed; it defaults to a newly
created dictionary with key ``'__name__'`` set to ``'__console__'`` and key
``'__doc__'`` set to ``None``.
InteractiveConsole([locals[, filename]])~
Closely emulate the behavior of the interactive Python interpreter. This class
builds on InteractiveInterpreter and adds prompting using the familiar
``sys.ps1`` and ``sys.ps2``, and input buffering.
interact([banner[, readfunc[, local]]])~
Convenience function to run a read-eval-print loop. This creates a new instance
of InteractiveConsole and sets {readfunc} to be used as the
raw_input method, if provided. If {local} is provided, it is passed to
the InteractiveConsole constructor for use as the default namespace for
the interpreter loop. The interact method of the instance is then run
with {banner} passed as the banner to use, if provided. The console object is
discarded after use.
compile_command(source[, filename[, symbol]])~
This function is useful for programs that want to emulate Python's interpreter
main loop (a.k.a. the read-eval-print loop). The tricky part is to determine
when the user has entered an incomplete command that can be completed by
entering more text (as opposed to a complete command or a syntax error). This
function {almost} always makes the same decision as the real interpreter main
loop.
{source} is the source string; {filename} is the optional filename from which
source was read, defaulting to ``'<input>'``; and {symbol} is the optional
grammar start symbol, which should be either ``'single'`` (the default) or
``'eval'``.
Returns a code object (the same as ``compile(source, filename, symbol)``) if the
command is complete and valid; ``None`` if the command is incomplete; raises
SyntaxError if the command is complete and contains a syntax error, or
raises OverflowError or ValueError if the command contains an
invalid literal.
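The three outcomes described above can be sketched like this. The source
strings are arbitrary examples.

```python
import code

# Complete and valid: a code object is returned.
complete = code.compile_command("x = 1 + 1")

# Incomplete (the block body is still missing): None is returned.
incomplete = code.compile_command("if True:")

# Complete but invalid: SyntaxError is raised.
try:
    code.compile_command("x = = 1")
    outcome = "compiled"
except SyntaxError:
    outcome = "SyntaxError"

# The code object can be executed in a namespace of the caller's choosing.
namespace = {}
exec(complete, namespace)
```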
Interactive Interpreter Objects
-------------------------------
InteractiveInterpreter.runsource(source[, filename[, symbol]])~
Compile and run some source in the interpreter. Arguments are the same as for
compile_command; the default for {filename} is ``'<input>'``, and for
{symbol} is ``'single'``. One of several things can happen:
* The input is incorrect; compile_command raised an exception
(SyntaxError or OverflowError). A syntax traceback will be
printed by calling the showsyntaxerror method. runsource
returns ``False``.
* The input is incomplete, and more input is required; compile_command
returned ``None``. runsource returns ``True``.
* The input is complete; compile_command returned a code object. The
code is executed by calling the runcode method (which also handles run-time
exceptions, except for SystemExit). runsource returns ``False``.
The return value can be used to decide whether to use ``sys.ps1`` or ``sys.ps2``
to prompt the next line.
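A minimal sketch of the three cases. The subclass ``CapturingInterpreter``
and its ``captured`` list are hypothetical; they override write so that
error reports are collected instead of going to ``sys.stderr``.

```python
import code


class CapturingInterpreter(code.InteractiveInterpreter):
    def __init__(self):
        code.InteractiveInterpreter.__init__(self)
        self.captured = []

    def write(self, data):
        self.captured.append(data)


interp = CapturingInterpreter()

more_complete = interp.runsource("total = 2 + 3")   # executed
more_partial = interp.runsource("def f():")         # needs more input
more_invalid = interp.runsource("total = = 1")      # syntax error is reported
more_runtime = interp.runsource("1 / 0")            # run-time error is reported

total = interp.locals["total"]
```

Only the incomplete input yields a true return value, which is the case in
which a console would switch to the ``sys.ps2`` prompt.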
InteractiveInterpreter.runcode(code)~
Execute a code object. When an exception occurs, showtraceback is called
to display a traceback. All exceptions are caught except SystemExit,
which is allowed to propagate.
A note about KeyboardInterrupt: this exception may occur elsewhere in
this code, and may not always be caught. The caller should be prepared to deal
with it.
InteractiveInterpreter.showsyntaxerror([filename])~
Display the syntax error that just occurred. This does not display a stack
trace because there isn't one for syntax errors. If {filename} is given, it is
stuffed into the exception instead of the default filename provided by Python's
parser, because it always uses ``'<string>'`` when reading from a string. The
output is written by the write method.
InteractiveInterpreter.showtraceback()~
Display the exception that just occurred. We remove the first stack item
because it is within the interpreter object implementation. The output is
written by the write method.
InteractiveInterpreter.write(data)~
Write a string to the standard error stream (``sys.stderr``). Derived classes
should override this to provide the appropriate output handling as needed.
Interactive Console Objects
---------------------------
The InteractiveConsole class is a subclass of
InteractiveInterpreter, and so offers all the methods of the
interpreter objects as well as the following additions.
InteractiveConsole.interact([banner])~
Closely emulate the interactive Python console. The optional {banner} argument
specifies the banner to print before the first interaction; by default it prints a
banner similar to the one printed by the standard Python interpreter, followed
by the class name of the console object in parentheses (so as not to confuse
this with the real interpreter -- since it's so close!).
InteractiveConsole.push(line)~
Push a line of source text to the interpreter. The line should not have a
trailing newline; it may have internal newlines. The line is appended to a
buffer and the interpreter's runsource method is called with the
concatenated contents of the buffer as source. If this indicates that the
command was executed or invalid, the buffer is reset; otherwise, the command is
incomplete, and the buffer is left as it was after the line was appended. The
return value is ``True`` if more input is required, ``False`` if the line was
dealt with in some way (this is the same as runsource).
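The buffering behaviour of push can be sketched by feeding a function
definition one line at a time. The function name ``double`` is illustrative.

```python
import code

console = code.InteractiveConsole()

lines = [
    "def double(n):",       # incomplete: more input required
    "    return n * 2",     # still incomplete: the block may continue
    "",                     # an empty line terminates the block
]
results = [console.push(line) for line in lines]

# The definition now lives in the console's namespace.
answer = console.locals["double"](21)
```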
InteractiveConsole.resetbuffer()~
Remove any unhandled source text from the input buffer.
InteractiveConsole.raw_input([prompt])~
Write a prompt and read a line. The returned line does not include the trailing
newline. When the user enters the EOF key sequence, EOFError is raised.
The base implementation uses the built-in function raw_input; a subclass
may replace this with a different implementation.
==============================================================================
*py2stdlib-codecs*
codecs~
:synopsis: Encode and decode data and streams.
This module defines base classes for standard Python codecs (encoders and
decoders) and provides access to the internal Python codec registry which
manages the codec and error handling lookup process.
It defines the following functions:
register(search_function)~
Register a codec search function. Search functions are expected to take one
argument, the encoding name in all lower case letters, and return a
CodecInfo object having the following attributes:
* ``name`` The name of the encoding;
* ``encode`` The stateless encoding function;
* ``decode`` The stateless decoding function;
* ``incrementalencoder`` An incremental encoder class or factory function;
* ``incrementaldecoder`` An incremental decoder class or factory function;
* ``streamwriter`` A stream writer class or factory function;
* ``streamreader`` A stream reader class or factory function.
The various functions or classes take the following arguments:
{encode} and {decode}: These must be functions or methods which have the same
interface as the encode/decode methods of Codec instances (see
Codec Interface). The functions/methods are expected to work in a stateless
mode.
{incrementalencoder} and {incrementaldecoder}: These have to be factory
functions providing the following interface:
``factory(errors='strict')``
The factory functions must return objects providing the interfaces defined by
the base classes IncrementalEncoder and IncrementalDecoder,
respectively. Incremental codecs can maintain state.
{streamreader} and {streamwriter}: These have to be factory functions providing
the following interface:
``factory(stream, errors='strict')``
The factory functions must return objects providing the interfaces defined by
the base classes StreamWriter and StreamReader, respectively.
Stream codecs can maintain state.
Possible values for errors are
* ``'strict'``: raise an exception in case of an encoding error
* ``'replace'``: replace malformed data with a suitable replacement marker,
such as ``'?'`` or ``'\ufffd'``
* ``'ignore'``: ignore malformed data and continue without further notice
* ``'xmlcharrefreplace'``: replace with the appropriate XML character
reference (for encoding only)
* ``'backslashreplace'``: replace with backslashed escape sequences (for
encoding only)
as well as any other error handling name defined via register_error.
In case a search function cannot find a given encoding, it should return
``None``.
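A hedged sketch of a search function. The encoding name ``demoutf8`` is
invented for this example, and the CodecInfo object simply reuses the pieces
of the built-in UTF-8 codec rather than implementing a new codec.

```python
import codecs


def demo_search(name):
    # The registry passes the encoding name in lower case.
    if name != "demoutf8":
        return None          # not ours: let other search functions try
    utf8 = codecs.lookup("utf-8")
    return codecs.CodecInfo(
        name="demoutf8",
        encode=utf8.encode,
        decode=utf8.decode,
        incrementalencoder=utf8.incrementalencoder,
        incrementaldecoder=utf8.incrementaldecoder,
        streamreader=utf8.streamreader,
        streamwriter=utf8.streamwriter,
    )


codecs.register(demo_search)

encoded = u"caf\xe9".encode("demoutf8")
decoded = encoded.decode("demoutf8")
```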
lookup(encoding)~
Looks up the codec info in the Python codec registry and returns a
CodecInfo object as defined above.
Encodings are first looked up in the registry's cache. If not found, the list of
registered search functions is scanned. If no CodecInfo object is
found, a LookupError is raised. Otherwise, the CodecInfo object
is stored in the cache and returned to the caller.
To simplify access to the various codecs, the module provides these additional
functions which use lookup for the codec lookup:
getencoder(encoding)~
Look up the codec for the given encoding and return its encoder function.
Raises a LookupError in case the encoding cannot be found.
getdecoder(encoding)~
Look up the codec for the given encoding and return its decoder function.
Raises a LookupError in case the encoding cannot be found.
getincrementalencoder(encoding)~
Look up the codec for the given encoding and return its incremental encoder
class or factory function.
Raises a LookupError in case the encoding cannot be found or the codec
doesn't support an incremental encoder.
.. versionadded:: 2.5
getincrementaldecoder(encoding)~
Look up the codec for the given encoding and return its incremental decoder
class or factory function.
Raises a LookupError in case the encoding cannot be found or the codec
doesn't support an incremental decoder.
.. versionadded:: 2.5
getreader(encoding)~
Look up the codec for the given encoding and return its StreamReader class or
factory function.
Raises a LookupError in case the encoding cannot be found.
getwriter(encoding)~
Look up the codec for the given encoding and return its StreamWriter class or
factory function.
Raises a LookupError in case the encoding cannot be found.
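For illustration, the stateless encoder and decoder functions each return a
tuple of the output object and the length of input consumed. The encoding
name in the failing lookup is deliberately made up.

```python
import codecs

encode = codecs.getencoder("utf-8")
decode = codecs.getdecoder("utf-8")

data, chars_consumed = encode(u"caf\xe9")   # 4 characters in
text, bytes_consumed = decode(data)         # 5 bytes in

# An unknown encoding name raises LookupError.
try:
    codecs.lookup("no-such-encoding-demo")
    lookup_outcome = "found"
except LookupError:
    lookup_outcome = "LookupError"
```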
register_error(name, error_handler)~
Register the error handling function {error_handler} under the name {name}.
{error_handler} will be called during encoding and decoding in case of an error,
when {name} is specified as the errors parameter.
For encoding {error_handler} will be called with a UnicodeEncodeError
instance, which contains information about the location of the error. The error
handler must either raise this or a different exception or return a tuple with a
replacement for the unencodable part of the input and a position where encoding
should continue. The encoder will encode the replacement and continue encoding
the original input at the specified position. Negative position values will be
treated as being relative to the end of the input string. If the resulting
position is out of bound an IndexError will be raised.
Decoding and translating work similarly, except that UnicodeDecodeError or
UnicodeTranslateError will be passed to the handler and that the
replacement from the error handler will be put into the output directly.
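A minimal sketch of a custom handler for encoding errors. The handler name
``demo_underscore`` and the function ``underscore_errors`` are assumptions
made for this example.

```python
import codecs


def underscore_errors(exc):
    if isinstance(exc, UnicodeEncodeError):
        # Replace the unencodable slice and resume right after it.
        replacement = u"_" * (exc.end - exc.start)
        return (replacement, exc.end)
    # This sketch handles encoding only; re-raise anything else.
    raise exc


codecs.register_error("demo_underscore", underscore_errors)

result = u"na\xefve caf\xe9".encode("ascii", "demo_underscore")
handler = codecs.lookup_error("demo_underscore")
```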
lookup_error(name)~
Return the error handler previously registered under the name {name}.
Raises a LookupError in case the handler cannot be found.
strict_errors(exception)~
Implements the ``strict`` error handling: each encoding or decoding error
raises a UnicodeError.
replace_errors(exception)~
Implements the ``replace`` error handling: malformed data is replaced with a
suitable replacement character such as ``'?'`` in bytestrings and
``'\ufffd'`` in Unicode strings.
ignore_errors(exception)~
Implements the ``ignore`` error handling: malformed data is ignored and
encoding or decoding is continued without further notice.
xmlcharrefreplace_errors(exception)~
Implements the ``xmlcharrefreplace`` error handling (for encoding only): the
unencodable character is replaced by an appropriate XML character reference.
backslashreplace_errors(exception)~
Implements the ``backslashreplace`` error handling (for encoding only): the
unencodable character is replaced by a backslashed escape sequence.
To simplify working with encoded files or streams, the module also defines these
utility functions:
open(filename, mode[, encoding[, errors[, buffering]]])~
Open an encoded file using the given {mode} and return a wrapped version
providing transparent encoding/decoding. The default file mode is ``'r'``
meaning to open the file in read mode.
.. note::
The wrapped version will only accept the object format defined by the codecs,
i.e. Unicode objects for most built-in codecs. Output is also codec-dependent
and will usually be Unicode as well.
.. note::
Files are always opened in binary mode, even if no binary mode was
specified. This is done to avoid data loss due to encodings using 8-bit
values. This means that no automatic conversion of ``'\n'`` is done
on reading and writing.
{encoding} specifies the encoding which is to be used for the file.
{errors} may be given to define the error handling. It defaults to ``'strict'``
which causes a ValueError to be raised in case an encoding error occurs.
{buffering} has the same meaning as for the built-in open function. It
defaults to line buffered.
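A short sketch of a write and read round trip through an encoded file. The
temporary file is created only for the example; the raw bytes are read back
with the built-in open to show what was actually stored.

```python
import codecs
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    # Unicode objects go in; UTF-8 bytes are stored.
    with codecs.open(path, "w", encoding="utf-8") as f:
        f.write(u"caf\xe9\n")

    # UTF-8 bytes are read; Unicode objects come out.
    with codecs.open(path, "r", encoding="utf-8") as f:
        text = f.read()

    # The underlying file is binary, so '\n' is stored unchanged.
    with open(path, "rb") as f:
        raw = f.read()
finally:
    os.remove(path)
```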
EncodedFile(file, input[, output[, errors]])~
Return a wrapped version of file which provides transparent encoding
translation.
Strings written to the wrapped file are interpreted according to the given
{input} encoding and then written to the original file as strings using the
{output} encoding. The intermediate encoding will usually be Unicode but depends
on the specified codecs.
If {output} is not given, it defaults to {input}.
{errors} may be given to define the error handling. It defaults to ``'strict'``,
which causes ValueError to be raised in case an encoding error occurs.
iterencode(iterable, encoding[, errors])~
Uses an incremental encoder to iteratively encode the input provided by
{iterable}. This function is a generator. {errors} (as well as any
other keyword argument) is passed through to the incremental encoder.
.. versionadded:: 2.5
iterdecode(iterable, encoding[, errors])~
Uses an incremental decoder to iteratively decode the input provided by
{iterable}. This function is a generator. {errors} (as well as any
other keyword argument) is passed through to the incremental decoder.
.. versionadded:: 2.5
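As an illustration, both generators work chunk by chunk, and the decoder
copes with a multi-byte character that is split across two chunks. The chunk
contents are arbitrary.

```python
import codecs

# Encode an iterable of Unicode chunks.
text_chunks = [u"caf", u"\xe9 ", u"au lait"]
encoded = b"".join(codecs.iterencode(text_chunks, "utf-8"))

# Decode byte chunks; the two bytes of U+00E9 arrive separately.
byte_chunks = [b"caf\xc3", b"\xa9"]
decoded = u"".join(codecs.iterdecode(byte_chunks, "utf-8"))
```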
The module also provides the following constants which are useful for reading
and writing to platform dependent files:
BOM~
BOM_BE
BOM_LE
BOM_UTF8
BOM_UTF16
BOM_UTF16_BE
BOM_UTF16_LE
BOM_UTF32
BOM_UTF32_BE
BOM_UTF32_LE
These constants define various encodings of the Unicode byte order mark (BOM)
used in UTF-16 and UTF-32 data streams to indicate the byte order used in the
stream or file and in UTF-8 as a Unicode signature. BOM_UTF16 is either
BOM_UTF16_BE or BOM_UTF16_LE depending on the platform's
native byte order, BOM is an alias for BOM_UTF16,
BOM_LE for BOM_UTF16_LE and BOM_BE for
BOM_UTF16_BE. The others represent the BOM in UTF-8 and UTF-32
encodings.
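One possible use of these constants is to recognise a byte order mark at the
start of raw data. The helper ``split_bom`` below is hypothetical. It checks
the UTF-32 marks first because BOM_UTF32_LE begins with the same bytes as
BOM_UTF16_LE, and it cannot tell UTF-32-LE apart from UTF-16-LE data whose
first character is U+0000.

```python
import codecs

# Longest marks first, see the note above.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def split_bom(raw):
    """Return (encoding name or None, data without the mark)."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name, raw[len(bom):]
    return None, raw


name, payload = split_bom(codecs.BOM_UTF16_LE + u"hi".encode("utf-16-le"))
text = payload.decode(name)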
Codec Base Classes
------------------
The codecs (|py2stdlib-codecs|) module defines a set of base classes which define the
interface and can also be used to easily write your own codecs for use in
Python.
Each codec has to define four interfaces to make it usable as a codec in Python:
stateless encoder, stateless decoder, stream reader and stream writer. The
stream readers and writers typically reuse the stateless encoder/decoder to
implement the file protocols.
The Codec class defines the interface for stateless encoders/decoders.
To simplify and standardize error handling, the encode and
decode methods may implement different error handling schemes by
providing the {errors} string argument. The following string values are defined
and implemented by all standard Python codecs:
+-------------------------+-----------------------------------------------+
| Value | Meaning |
+=========================+===============================================+
| ``'strict'`` | Raise UnicodeError (or a subclass); |
| | this is the default. |
+-------------------------+-----------------------------------------------+
| ``'ignore'`` | Ignore the character and continue with the |
| | next. |
+-------------------------+-----------------------------------------------+
| ``'replace'`` | Replace with a suitable replacement |
| | character; Python will use the official |
| | U+FFFD REPLACEMENT CHARACTER for the built-in |
| | Unicode codecs on decoding and '?' on |
| | encoding. |
+-------------------------+-----------------------------------------------+
| ``'xmlcharrefreplace'`` | Replace with the appropriate XML character |
| | reference (only for encoding). |
+-------------------------+-----------------------------------------------+
| ``'backslashreplace'`` | Replace with backslashed escape sequences |
| | (only for encoding). |
+-------------------------+-----------------------------------------------+
The set of allowed values can be extended via register_error.
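The effect of each value in the table can be sketched with the built-in
ASCII codec. The sample text is arbitrary.

```python
text = u"caf\xe9"   # U+00E9 cannot be encoded as ASCII

# 'strict' raises UnicodeError (here the subclass UnicodeEncodeError).
try:
    text.encode("ascii", "strict")
    strict_outcome = "encoded"
except UnicodeError:
    strict_outcome = "UnicodeError"

encoded = {
    "ignore": text.encode("ascii", "ignore"),
    "replace": text.encode("ascii", "replace"),
    "xmlcharrefreplace": text.encode("ascii", "xmlcharrefreplace"),
    "backslashreplace": text.encode("ascii", "backslashreplace"),
}

# On decoding, 'replace' inserts U+FFFD for the malformed byte.
decoded = b"caf\xe9".decode("ascii", "replace")
```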
Codec Objects
^^^^^^^^^^^^^
The Codec class defines these methods which also define the function
interfaces of the stateless encoder and decoder:
Codec.encode(input[, errors])~
Encodes the object {input} and returns a tuple (output object, length consumed).
While codecs are not restricted to use with Unicode, in a Unicode context,
encoding converts a Unicode object to a plain string using a particular
character set encoding (e.g., ``cp1252`` or ``iso-8859-1``).
{errors} defines the error handling to apply. It defaults to ``'strict'``
handling.
The method may not store state in the Codec instance. Use
StreamWriter for codecs which have to keep state in order to make
encoding efficient.
The encoder must be able to handle zero length input and return an empty object
of the output object type in this situation.
Codec.decode(input[, errors])~
Decodes the object {input} and returns a tuple (output object, length consumed).
In a Unicode context, decoding converts a plain string encoded using a
particular character set encoding to a Unicode object.
{input} must be an object which provides the ``bf_getreadbuf`` buffer slot.
Python strings, buffer objects and memory mapped files are examples of objects
providing this slot.
{errors} defines the error handling to apply. It defaults to ``'strict'``
handling.
The method may not store state in the Codec instance. Use
StreamReader for codecs which have to keep state in order to make
decoding efficient.
The decoder must be able to handle zero length input and return an empty object
of the output object type in this situation.
The IncrementalEncoder and IncrementalDecoder classes provide
the basic interface for incremental encoding and decoding. Encoding/decoding the
input isn't done with one call to the stateless encoder/decoder function, but
with multiple calls to the encode/decode method of the
incremental encoder/decoder. The incremental encoder/decoder keeps track of the
encoding/decoding process during method calls.
The joined output of calls to the encode/decode method is the
same as if all the single inputs were joined into one, and this input was
encoded/decoded with the stateless encoder/decoder.
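The equivalence described above can be checked with a small experiment. This
sketch feeds a UTF-8 byte string to an incremental decoder one byte at a time,
so the three bytes of the euro sign arrive in separate calls; the sample data
is arbitrary.

```python
import codecs

raw = u'\u20ac 4,50'.encode('utf-8')        # the euro sign takes three bytes
decoder = codecs.getincrementaldecoder('utf-8')()

pieces = []
for i in range(len(raw)):
    # Slicing keeps each chunk a byte string of length one.
    pieces.append(decoder.decode(raw[i:i + 1]))
pieces.append(decoder.decode(b'', True))    # final call flushes the decoder

joined = u''.join(pieces)                   # same as raw.decode('utf-8')
```

The first two calls return an empty string because the decoder is still
waiting for the rest of the multi-byte sequence.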
IncrementalEncoder Objects
^^^^^^^^^^^^^^^^^^^^^^^^^^
.. versionadded:: 2.5
The IncrementalEncoder class is used for encoding an input in multiple
steps. It defines the following methods which every incremental encoder must
define in order to be compatible with the Python codec registry.
IncrementalEncoder([errors])~
Constructor for an IncrementalEncoder instance.
All incremental encoders must provide this constructor interface. They are free
to add additional keyword arguments, but only the ones defined here are used by
the Python codec registry.
The IncrementalEncoder may implement different error handling schemes
by providing the {errors} keyword argument. These parameters are predefined:
* ``'strict'`` Raise ValueError (or a subclass); this is the default.
* ``'ignore'`` Ignore the character and continue with the next.
* ``'replace'`` Replace with a suitable replacement character.
* ``'xmlcharrefreplace'`` Replace with the appropriate XML character reference.
* ``'backslashreplace'`` Replace with backslashed escape sequences.
The {errors} argument will be assigned to an attribute of the same name.
Assigning to this attribute makes it possible to switch between different error
handling strategies during the lifetime of the IncrementalEncoder
object.
The set of allowed values for the {errors} argument can be extended with
register_error.
encode(object[, final])~
Encodes {object} (taking the current state of the encoder into account)
and returns the resulting encoded object. If this is the last call to
encode, {final} must be true (the default is false).
reset()~
Reset the encoder to the initial state.
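A short sketch of the encoder interface, assuming the standard ``utf-16`` and
``ascii`` codecs: the UTF-16 encoder keeps state (it writes the BOM only once),
and the ASCII encoder shows the effect of assigning to the errors attribute.
Variable names are illustrative only.

```python
import codecs

encoder = codecs.getincrementalencoder('utf-16')()
first = encoder.encode(u'ab')            # starts with the BOM
second = encoder.encode(u'cd', True)     # final=True on the last call, no BOM
whole = first + second                   # equals u'abcd'.encode('utf-16')

encoder.reset()                          # back to the initial state
again = encoder.encode(u'ab', True)      # the BOM is written once more

# Switching the error handling strategy during the encoder's lifetime.
ascii_encoder = codecs.getincrementalencoder('ascii')('strict')
ascii_encoder.errors = 'replace'
replaced = ascii_encoder.encode(u'\xe9t\xe9')        # b'?t?'
ascii_encoder.errors = 'xmlcharrefreplace'
escaped = ascii_encoder.encode(u'\xe9', True)        # b'&#233;'
```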
IncrementalDecoder Objects
^^^^^^^^^^^^^^^^^^^^^^^^^^
The IncrementalDecoder class is used for decoding an input in multiple
steps. It defines the following methods which every incremental decoder must
define in order to be compatible with the Python codec registry.
IncrementalDecoder([errors])~
Constructor for an IncrementalDecoder instance.
All incremental decoders must provide this constructor interface. They are free
to add additional keyword arguments, but only the ones defined here are used by
the Python codec registry.
The IncrementalDecoder may implement different error handling schemes
by providing the {errors} keyword argument. These parameters are predefined:
* ``'strict'`` Raise ValueError (or a subclass); this is the default.
* ``'ignore'`` Ignore the character and continue with the next.
* ``'replace'`` Replace with a suitable replacement character.
The {errors} argument will be assigned to an attribute of the same name.
Assigning to this attribute makes it possible to switch between different error
handling strategies during the lifetime of the IncrementalDecoder
object.
The set of allowed values for the {errors} argument can be extended with
register_error.
decode(object[, final])~
Decodes {object} (taking the current state of the decoder into account)
and returns the resulting decoded object. If this is the last call to
decode, {final} must be true (the default is false). If {final} is
true the decoder must decode the input completely and must flush all
buffers. If this isn't possible (e.g. because of incomplete byte sequences
at the end of the input) it must initiate error handling just like in the
stateless case (which might raise an exception).
reset()~
Reset the decoder to the initial state.
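The role of {final} can be sketched with a truncated UTF-8 sequence. The
example assumes the standard ``utf-8`` incremental decoder; the flag variable
and the sample bytes are chosen for illustration.

```python
import codecs

decoder = codecs.getincrementaldecoder('utf-8')()

# Two of the three bytes of the euro sign: nothing can be returned yet.
partial = decoder.decode(b'\xe2\x82')        # u''

# With final=True the incomplete sequence triggers error handling.
try:
    decoder.decode(b'', True)
    raised = False
except UnicodeDecodeError:
    raised = True

# After a reset, the same input with 'replace' yields a replacement character.
decoder.reset()
decoder.errors = 'replace'
decoder.decode(b'\xe2\x82')
flushed = decoder.decode(b'', True)          # u'\ufffd'
```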
The StreamWriter and StreamReader classes provide generic
working interfaces which can be used to implement new encoding submodules very
easily. See encodings.utf_8 for an example of how this is done.
StreamWriter Objects
^^^^^^^^^^^^^^^^^^^^
The StreamWriter class is a subclass of Codec and defines the
following methods which every stream writer must define in order to be
compatible with the Python codec registry.
StreamWriter(stream[, errors])~
Constructor for a StreamWriter instance.
All stream writers must provide this constructor interface. They are free to add
additional keyword arguments, but only the ones defined here are used by the
Python codec registry.
{stream} must be a file-like object open for writing binary data.
The StreamWriter may implement different error handling schemes by
providing the {errors} keyword argument. These parameters are predefined:
* ``'strict'`` Raise ValueError (or a subclass); this is the default.
* ``'ignore'`` Ignore the character and continue with the next.
* ``'replace'`` Replace with a suitable replacement character.
* ``'xmlcharrefreplace'`` Replace with the appropriate XML character reference.
* ``'backslashreplace'`` Replace with backslashed escape sequences.
The {errors} argument will be assigned to an attribute of the same name.
Assigning to this attribute makes it possible to switch between different error
handling strategies during the lifetime of the StreamWriter object.
The set of allowed values for the {errors} argument can be extended with
register_error.
write(object)~
Writes the object's contents encoded to the stream.
writelines(list)~
Writes the concatenated list of strings to the stream (possibly by reusing
the write method).
reset()~
Flushes and resets the codec buffers used for keeping state.
Calling this method should ensure that the data on the output is put into
a clean state that allows appending of new fresh data without having to
rescan the whole stream to recover state.
In addition to the above methods, the StreamWriter must also inherit
all other methods and attributes from the underlying stream.
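A minimal sketch of a stream writer in use, assuming an in-memory io.BytesIO
object as the underlying stream and the factory returned by codecs.getwriter.
The sample text is arbitrary.

```python
import codecs
import io

buffer = io.BytesIO()
writer = codecs.getwriter('ascii')(buffer, 'strict')

writer.write(u'price: ')
# Change the error handling strategy during the lifetime of the writer.
writer.errors = 'xmlcharrefreplace'
writer.write(u'\u20ac5')
writer.writelines([u'\n', u'done\n'])

# getvalue() is not a StreamWriter method; it comes from the wrapped BytesIO.
written = writer.getvalue()      # b'price: &#8364;5\ndone\n'
```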
StreamReader Objects
^^^^^^^^^^^^^^^^^^^^
The StreamReader class is a subclass of Codec and defines the
following methods which every stream reader must define in order to be
compatible with the Python codec registry.
StreamReader(stream[, errors])~
Constructor for a StreamReader instance.
All stream readers must provide this constructor interface. They are free to add
additional keyword arguments, but only the ones defined here are used by the
Python codec registry.
{stream} must be a file-like object open for reading (binary) data.
The StreamReader may implement different error handling schemes by
providing the {errors} keyword argument. These parameters are predefined:
* ``'strict'`` Raise ValueError (or a subclass); this is the default.
* ``'ignore'`` Ignore the character and continue with the next.
* ``'replace'`` Replace with a suitable replacement character.
The {errors} argument will be assigned to an attribute of the same name.
Assigning to this attribute makes it possible to switch between different error
handling strategies during the lifetime of the StreamReader object.
The set of allowed values for the {errors} argument can be extended with
register_error.
read([size[, chars[, firstline]]])~
Decodes data from the stream and returns the resulting object.
{chars} indicates the number of characters to read from the
stream. read will never return more than {chars} characters, but
it might return fewer if not enough characters are available.
{size} indicates the approximate maximum number of bytes to read from the
stream for decoding purposes. The decoder can modify this setting as
appropriate. The default value -1 indicates to read and decode as much as
possible. {size} is intended to prevent having to decode huge files in
one step.
{firstline} indicates that it would be sufficient to only return the first
line, if there are decoding errors on later lines.
The method should use a greedy read strategy meaning that it should read
as much data as is allowed within the definition of the encoding and the
given size, e.g. if optional encoding endings or state markers are
available on the stream, these should be read too.
.. versionchanged:: 2.4
{chars} argument added.
.. versionchanged:: 2.4.2
{firstline} argument added.
readline([size[, keepends]])~
Read one line from the input stream and return the decoded data.
{size}, if given, is passed as the size argument to the stream's
readline method.
If {keepends} is false line-endings will be stripped from the lines
returned.
.. versionchanged:: 2.4
{keepends} argument added.
readlines([sizehint[, keepends]])~
Read all lines available on the input stream and return them as a list of
lines.
Line-endings are implemented using the codec's decoder method and are
included in the list entries if {keepends} is true.
{sizehint}, if given, is passed as the {size} argument to the stream's
read method.
reset()~
Resets the codec buffers used for keeping state.
Note that no stream repositioning should take place. This method is
primarily intended to be able to recover from decoding errors.
In addition to the above methods, the StreamReader must also inherit
all other methods and attributes from the underlying stream.
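The reading methods can be sketched in the same way, again assuming an
in-memory io.BytesIO stream and the factory returned by codecs.getreader. The
sample lines are arbitrary.

```python
import codecs
import io

payload = u'erste Zeile\nzweite Zeile\ndritte Zeile\n'.encode('utf-8')

reader = codecs.getreader('utf-8')(io.BytesIO(payload))
first = reader.readline()                    # u'erste Zeile\n'
second = reader.readline(keepends=False)     # u'zweite Zeile'
rest = reader.readlines()                    # [u'dritte Zeile\n']

# A fresh reader: chars limits the number of decoded characters returned.
head = codecs.getreader('utf-8')(io.BytesIO(payload)).read(chars=5)   # u'erste'
```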
The next two base classes are included for convenience. They are not needed by
the codec registry, but may prove useful in practice.
StreamReaderWriter Objects
^^^^^^^^^^^^^^^^^^^^^^^^^^
The StreamReaderWriter allows wrapping streams which work in both read
and write modes.
The design is such that one can use the factory functions returned by the
lookup function to construct the instance.
StreamReaderWriter(stream, Reader, Writer, errors)~
Creates a StreamReaderWriter instance. {stream} must be a file-like
object. {Reader} and {Writer} must be factory functions or classes providing the
StreamReader and StreamWriter interface respectively. Error handling
is done in the same way as defined for the stream readers and writers.
StreamReaderWriter instances define the combined interfaces of
StreamReader and StreamWriter classes. They inherit all other
methods and attributes from the underlying stream.
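As a sketch of the construction described above, the factories are taken from
the object returned by codecs.lookup and wrapped around an in-memory
io.BytesIO stream. The variable names are illustrative.

```python
import codecs
import io

info = codecs.lookup('utf-8')
raw = io.BytesIO()
wrapped = codecs.StreamReaderWriter(raw, info.streamreader,
                                    info.streamwriter, 'strict')

wrapped.write(u'\u20ac 5\n')     # encoded by the writer side
wrapped.seek(0)                  # rewind the underlying stream
line = wrapped.readline()        # decoded by the reader side
stored = raw.getvalue()          # b'\xe2\x82\xac 5\n'
```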
StreamRecoder Objects
^^^^^^^^^^^^^^^^^^^^^
The StreamRecoder provides a frontend-backend view of encoding data
which is sometimes useful when dealing with different encoding environments.
The design is such that one can use the factory functions returned by the
lookup function to construct the instance.
StreamRecoder(stream, encode, decode, Reader, Writer, errors)~
Creates a StreamRecoder instance which implements a two-way conversion:
{encode} and {decode} work on the frontend (the input to read and output
of write) while {Reader} and {Writer} work on the backend (reading and
writing to the stream).
You can use these objects to do transparent direct recodings from e.g. Latin-1
to UTF-8 and back.
{stream} must be a file-like object.
{encode}, {decode} must adhere to the Codec interface. {Reader},
{Writer} must be factory functions or classes providing objects of the
StreamReader and StreamWriter interface respectively.
{encode} and {decode} are needed for the frontend translation, {Reader} and
{Writer} for the backend translation. The intermediate format used is
determined by the two sets of codecs, e.g. the Unicode codecs will use Unicode
as the intermediate encoding.
Error handling is done in the same way as defined for the stream readers and
writers.
StreamRecoder instances define the combined interfaces of
StreamReader and StreamWriter classes. They inherit all other
methods and attributes from the underlying stream.
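The Latin-1 to UTF-8 recoding mentioned above might look like the following
sketch. It assumes Latin-1 as the frontend encoding and UTF-8 as the backend
encoding, with an in-memory io.BytesIO object standing in for a file; all
variable names are illustrative.

```python
import codecs
import io

latin1 = codecs.lookup('latin-1')    # frontend: what the caller reads and writes
utf8 = codecs.lookup('utf-8')        # backend: what is stored in the stream

backend = io.BytesIO()
recoder = codecs.StreamRecoder(backend, latin1.encode, latin1.decode,
                               utf8.streamreader, utf8.streamwriter, 'strict')

recoder.write(b'caf\xe9\n')          # Latin-1 bytes in
stored = backend.getvalue()          # UTF-8 bytes in the stream: b'caf\xc3\xa9\n'

backend.seek(0)
round_trip = recoder.read()          # Latin-1 bytes out again: b'caf\xe9\n'
```

Unicode serves as the intermediate format in both directions here.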
Encodings and Unicode
---------------------
Unicode strings are stored internally as sequences of codepoints (to be precise
as Py_UNICODE arrays). Depending on the way Python is compiled (either
via --enable-unicode=ucs2 or --enable-unicode=ucs4, with the
former being the default) Py_UNICODE is either a 16-bit or 32-bit data
type. Once a Unicode object is used outside of CPU and memory, CPU endianness
and how these arrays are stored as bytes become an issue. Transforming a
unicode object into a sequence of bytes is called encoding and recreating the
unicode object from the sequence of bytes is known as decoding. There are many
different methods for how this transformation can be done (these methods are
also called encodings). The simplest method is to map the codepoints 0-255 to
the bytes ``0x0``-``0xff``. This means that a unicode object that contains
codepoints above ``U+00FF`` can't be encoded with this method (which is called
``'latin-1'`` or ``'iso-8859-1'``). unicode.encode will raise a
UnicodeEncodeError that looks like this: ``UnicodeEncodeError: 'latin-1'
codec can't encode character u'\u1234' in position 3: ordinal not in
range(256)``.
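The failure described above can be reproduced and inspected through the
attributes of the exception object. This is a small sketch; the variable names
are illustrative.

```python
text = u'abc\u1234'

try:
    text.encode('latin-1')
    error = None
except UnicodeEncodeError as exc:
    error = exc

# Codec name, offending slice [start:end) and the reason for the failure.
details = (error.encoding, error.start, error.end, error.reason)

# Code points up to U+00FF map directly to the byte of the same value.
highest = u'\xff'.encode('latin-1')      # b'\xff'
```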
There's another group of encodings (the so called charmap encodings) that choose
a different subset of all unicode code points and how these codepoints are
mapped to the bytes ``0x0``-``0xff``. To see how this is done simply open
e.g. encodings/cp1252.py (which is an encoding that is used primarily on
Windows). There's a string constant with 256 characters that shows you which
character is mapped to which byte value.
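The table can also be inspected from Python. The sketch below assumes that the
module encodings.cp1252 exposes the 256-character constant under the name
``decoding_table``, as it does in the standard distribution.

```python
import encodings.cp1252 as cp1252

table = cp1252.decoding_table
size = len(table)                            # one entry per byte value: 256

# cp1252 places the EURO SIGN at byte 0x80, where Latin-1 has a control code.
euro_cp1252 = table[0x80]                    # u'\u20ac'
byte_for_euro = u'\u20ac'.encode('cp1252')   # b'\x80'
latin1_0x80 = b'\x80'.decode('latin-1')      # u'\x80'

# The lower half of the table coincides with ASCII.
ascii_half = all(ord(table[i]) == i for i in range(0x80))
```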
All of these encodings can only encode 256 of the 65536 (or 1114112) codepoints
defined in unicode. A simple and straightforward way to store each Unicode
code point is to store it as two consecutive bytes. There are two
possibilities: Store the bytes in big endian or in little endian order. These
two encodings are called UTF-16-BE and UTF-16-LE respectively. Their
disadvantage is that if e.g. you use UTF-16-BE on a little endian machine you
will always have to swap bytes on encoding and decoding. UTF-16 avoids this
problem: bytes are always written in the machine's native endianness. When
these bytes are read by a CPU with a different endianness, however, they have
to be swapped. To
be able to detect the endianness of a UTF-16 byte sequence, there's the so
called BOM (the "Byte Order Mark"). This is the Unicode character ``U+FEFF``.
This character will be prepended to every UTF-16 byte sequence. The byte swapped
version of this character (``0xFFFE``) is an illegal character that may not
appear in a Unicode text. So when the first character in a UTF-16 byte sequence
appears to be ``U+FFFE``, the bytes have to be swapped on decoding.
Unfortunately, up to Unicode 4.0 the character ``U+FEFF`` had a second purpose as
a ``ZERO WIDTH NO-BREAK SPACE``: A character that has no width and doesn't allow
a word to be split. It can e.g. be used to give hints to a ligature algorithm.
With Unicode 4.0 using ``U+FEFF`` as a ``ZERO WIDTH NO-BREAK SPACE`` has been
deprecated (with ``U+2060`` (``WORD JOINER``) assuming this role). Nevertheless
Unicode software still must be able to handle ``U+FEFF`` in both roles: As a BOM
it's a device to determine the storage layout of the encoded bytes, and vanishes
once the byte sequence has been decoded into a Unicode string; as a ``ZERO WIDTH
NO-BREAK SPACE`` it's a normal character that will be decoded like any other.
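The byte order variants and the two roles of ``U+FEFF`` can be observed with
the standard UTF-16 codecs. The following is a sketch with arbitrary sample
strings.

```python
import codecs
import sys

big = u'hi'.encode('utf-16-be')          # b'\x00h\x00i'
little = u'hi'.encode('utf-16-le')       # b'h\x00i\x00'
native = u'hi'.encode('utf-16')          # BOM followed by native byte order

if sys.byteorder == 'little':
    expected_bom = codecs.BOM_UTF16_LE
else:
    expected_bom = codecs.BOM_UTF16_BE
has_native_bom = native.startswith(expected_bom)

# With a BOM in front, the utf-16 decoder accepts either byte order.
from_big = (codecs.BOM_UTF16_BE + big).decode('utf-16')         # u'hi'
from_little = (codecs.BOM_UTF16_LE + little).decode('utf-16')   # u'hi'

# As a BOM, U+FEFF vanishes; decoded with a fixed byte order it is kept
# as an ordinary character.
data = codecs.BOM_UTF16_LE + u'x'.encode('utf-16-le')
as_bom = data.decode('utf-16')            # u'x'
as_character = data.decode('utf-16-le')   # u'\ufeffx'
```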
There's another encoding that is able to encode the full range of Unicode
characters: UTF-8. UTF-8 is an 8-bit encoding, which means there are no issues
with byte order in UTF-8. Each byte in a UTF-8 byte sequence consists of two
parts: Marker bits (the most significant bits) and payload bits. The marker bits
are a sequence of zero to six 1 bits followed by a 0 bit. Unicode characters are
encoded like this (with x being payload bits, which when concatenated give the
Unicode character):
+-----------------------------------+----------------------------------------------+
| Range | Encoding |
+===================================+==============================================+
| ``U-00000000`` ... ``U-0000007F`` | 0xxxxxxx |
+-----------------------------------+----------------------------------------------+
| ``U-00000080`` ... ``U-000007FF`` | 110xxxxx 10xxxxxx |
+-----------------------------------+----------------------------------------------+
| ``U-00000800`` ... ``U-0000FFFF`` | 1110xxxx 10xxxxxx 10xxxxxx |
+-----------------------------------+----------------------------------------------+
| ``U-00010000`` ... ``U-001FFFFF`` | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx |
+-----------------------------------+----------------------------------------------+
| ``U-00200000`` ... ``U-03FFFFFF`` | 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
+-----------------------------------+----------------------------------------------+
| ``U-04000000`` ... ``U-7FFFFFFF`` | 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
| | 10xxxxxx |
+-----------------------------------+----------------------------------------------+
The least significant bit of the Unicode character is the rightmost x bit.
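To make the bit layout concrete, the sketch below encodes a single code point
by hand, following the first four rows of the table, and compares the result
with the built-in codec. The function name is hypothetical, and the sketch does
no validation (surrogates, for instance, are not rejected).

```python
def utf8_bytes(code_point):
    """Hand-encode one code point using the first four rows of the table."""
    if code_point < 0x80:
        return [code_point]                               # 0xxxxxxx
    if code_point < 0x800:
        return [0xC0 | (code_point >> 6),                 # 110xxxxx
                0x80 | (code_point & 0x3F)]               # 10xxxxxx
    if code_point < 0x10000:
        return [0xE0 | (code_point >> 12),                # 1110xxxx
                0x80 | ((code_point >> 6) & 0x3F),
                0x80 | (code_point & 0x3F)]
    if code_point < 0x200000:
        return [0xF0 | (code_point >> 18),                # 11110xxx
                0x80 | ((code_point >> 12) & 0x3F),
                0x80 | ((code_point >> 6) & 0x3F),
                0x80 | (code_point & 0x3F)]
    raise ValueError('five- and six-byte forms are left out of this sketch')

euro = utf8_bytes(0x20AC)                                  # [0xE2, 0x82, 0xAC]
reference = list(bytearray(u'\u20ac'.encode('utf-8')))     # same three values
```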
As UTF-8 is an 8-bit encoding no BOM is required and any ``U+FEFF`` character in
the decoded Unicode string (even if it's the first character) is treated as a
``ZERO WIDTH NO-BREAK SPACE``.
Without external information it's impossible to reliably determine which
encoding was used for encoding a Unicode string. Each charmap encoding can
decode any random byte sequence. However that's not possible with UTF-8, as
UTF-8 byte sequences have a structure that doesn't allow arbitrary byte
sequences. To increase the reliability with which a UTF-8 encoding can be
detected, Microsoft invented a variant of UTF-8 (that Python 2.5 calls
``"utf-8-sig"``) for its Notepad program: Before any of the Unicode characters
is written to the file, a UTF-8 encoded BOM (which looks like this as a byte
sequence: ``0xef``, ``0xbb``, ``0xbf``) is written. As it's rather improbable
that any charmap encoded file starts with these byte values (which would e.g.
map to
| LATIN SMALL LETTER I WITH DIAERESIS
| RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
| INVERTED QUESTION MARK
in iso-8859-1), this increases the probability that a utf-8-sig encoding can be
correctly guessed from the byte sequence. So here the BOM is not used to be able
to determine the byte order used for generating the byte sequence, but as a
signature that helps in guessing the encoding. On encoding the utf-8-sig codec
will write ``0xef``, ``0xbb``, ``0xbf`` as the first three bytes to the file. On
decoding utf-8-sig will skip those three bytes if they appear as the first three
bytes in the file.
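The behaviour of the two codecs can be compared directly. This is a small
sketch using an arbitrary sample string.

```python
import codecs

plain = u'text'.encode('utf-8')
signed = u'text'.encode('utf-8-sig')         # BOM_UTF8 + plain

has_signature = signed == codecs.BOM_UTF8 + plain

# utf-8-sig skips the signature if present and tolerates its absence.
from_signed = signed.decode('utf-8-sig')     # u'text'
from_plain = plain.decode('utf-8-sig')       # u'text'

# The plain utf-8 codec keeps U+FEFF as an ordinary character.
kept = signed.decode('utf-8')                # u'\ufefftext'

# The same three bytes read as iso-8859-1 give three unrelated characters.
as_latin1 = codecs.BOM_UTF8.decode('iso-8859-1')   # u'\xef\xbb\xbf'
```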
Standard Encodings
------------------
Python comes with a number of codecs built-in, either implemented as C functions
or with dictionaries as mapping tables. The following table lists the codecs by
name, together with a few common aliases, and the languages for which the
encoding is likely used. Neither the list of aliases nor the list of languages
is meant to be exhaustive. Notice that spelling alternatives that only differ in
case or use a hyphen instead of an underscore are also valid aliases; therefore,
e.g. ``'utf-8'`` is a valid alias for the ``'utf_8'`` codec.
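The alias rule can be checked with codecs.lookup, which resolves every accepted
spelling to the same codec. In this sketch the list of spellings is only a
sample, and ``'no-such-codec'`` is a deliberately invalid name.

```python
import codecs

spellings = ('utf_8', 'utf-8', 'UTF-8', 'U8', 'utf8')
names = [codecs.lookup(spelling).name for spelling in spellings]

# An unknown name is reported with LookupError.
try:
    codecs.lookup('no-such-codec')
    unknown_rejected = False
except LookupError:
    unknown_rejected = True
```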
Many of the character sets support the same languages. They vary in individual
characters (e.g. whether the EURO SIGN is supported or not), and in the
assignment of characters to code positions. For the European languages in
particular, the following variants typically exist:
* an ISO 8859 codeset
* a Microsoft Windows code page, which is typically derived from an 8859 codeset,
but replaces control characters with additional graphic characters
* an IBM EBCDIC code page
* an IBM PC code page, which is ASCII compatible
+-----------------+--------------------------------+--------------------------------+
| Codec | Aliases | Languages |
+=================+================================+================================+
| ascii | 646, us-ascii | English |
+-----------------+--------------------------------+--------------------------------+
| big5 | big5-tw, csbig5 | Traditional Chinese |
+-----------------+--------------------------------+--------------------------------+
| big5hkscs | big5-hkscs, hkscs | Traditional Chinese |
+-----------------+--------------------------------+--------------------------------+
| cp037 | IBM037, IBM039 | English |
+-----------------+--------------------------------+--------------------------------+
| cp424 | EBCDIC-CP-HE, IBM424 | Hebrew |
+-----------------+--------------------------------+--------------------------------+
| cp437 | 437, IBM437 | English |
+-----------------+--------------------------------+--------------------------------+
| cp500 | EBCDIC-CP-BE, EBCDIC-CP-CH, | Western Europe |
| | IBM500 | |
+-----------------+--------------------------------+--------------------------------+
| cp720 | | Arabic |
+-----------------+--------------------------------+--------------------------------+
| cp737 | | Greek |
+-----------------+--------------------------------+--------------------------------+
| cp775 | IBM775 | Baltic languages |
+-----------------+--------------------------------+--------------------------------+
| cp850 | 850, IBM850 | Western Europe |
+-----------------+--------------------------------+--------------------------------+
| cp852 | 852, IBM852 | Central and Eastern Europe |
+-----------------+--------------------------------+--------------------------------+
| cp855 | 855, IBM855 | Bulgarian, Byelorussian, |
| | | Macedonian, Russian, Serbian |
+-----------------+--------------------------------+--------------------------------+
| cp856 | | Hebrew |
+-----------------+--------------------------------+--------------------------------+
| cp857 | 857, IBM857 | Turkish |
+-----------------+--------------------------------+--------------------------------+
| cp858 | 858, IBM858 | Western Europe |
+-----------------+--------------------------------+--------------------------------+
| cp860 | 860, IBM860 | Portuguese |
+-----------------+--------------------------------+--------------------------------+
| cp861 | 861, CP-IS, IBM861 | Icelandic |
+-----------------+--------------------------------+--------------------------------+
| cp862 | 862, IBM862 | Hebrew |
+-----------------+--------------------------------+--------------------------------+
| cp863 | 863, IBM863 | Canadian |
+-----------------+--------------------------------+--------------------------------+
| cp864 | IBM864 | Arabic |
+-----------------+--------------------------------+--------------------------------+
| cp865 | 865, IBM865 | Danish, Norwegian |
+-----------------+--------------------------------+--------------------------------+
| cp866 | 866, IBM866 | Russian |
+-----------------+--------------------------------+--------------------------------+
| cp869 | 869, CP-GR, IBM869 | Greek |
+-----------------+--------------------------------+--------------------------------+
| cp874 | | Thai |
+-----------------+--------------------------------+--------------------------------+
| cp875 | | Greek |
+-----------------+--------------------------------+--------------------------------+
| cp932 | 932, ms932, mskanji, ms-kanji | Japanese |
+-----------------+--------------------------------+--------------------------------+
| cp949 | 949, ms949, uhc | Korean |
+-----------------+--------------------------------+--------------------------------+
| cp950 | 950, ms950 | Traditional Chinese |
+-----------------+--------------------------------+--------------------------------+
| cp1006 | | Urdu |
+-----------------+--------------------------------+--------------------------------+
| cp1026 | ibm1026 | Turkish |
+-----------------+--------------------------------+--------------------------------+
| cp1140 | ibm1140 | Western Europe |
+-----------------+--------------------------------+--------------------------------+
| cp1250 | windows-1250 | Central and Eastern Europe |
+-----------------+--------------------------------+--------------------------------+
| cp1251 | windows-1251 | Bulgarian, Byelorussian, |
| | | Macedonian, Russian, Serbian |
+-----------------+--------------------------------+--------------------------------+
| cp1252 | windows-1252 | Western Europe |
+-----------------+--------------------------------+--------------------------------+
| cp1253 | windows-1253 | Greek |
+-----------------+--------------------------------+--------------------------------+
| cp1254 | windows-1254 | Turkish |
+-----------------+--------------------------------+--------------------------------+
| cp1255 | windows-1255 | Hebrew |
+-----------------+--------------------------------+--------------------------------+
| cp1256 | windows-1256 | Arabic |
+-----------------+--------------------------------+--------------------------------+
| cp1257 | windows-1257 | Baltic languages |
+-----------------+--------------------------------+--------------------------------+
| cp1258 | windows-1258 | Vietnamese |
+-----------------+--------------------------------+--------------------------------+
| euc_jp | eucjp, ujis, u-jis | Japanese |
+-----------------+--------------------------------+--------------------------------+
| euc_jis_2004 | jisx0213, eucjis2004 | Japanese |
+-----------------+--------------------------------+--------------------------------+
| euc_jisx0213 | eucjisx0213 | Japanese |
+-----------------+--------------------------------+--------------------------------+
| euc_kr | euckr, korean, ksc5601, | Korean |
| | ks_c-5601, ks_c-5601-1987, | |
| | ksx1001, ks_x-1001 | |
+-----------------+--------------------------------+--------------------------------+
| gb2312 | chinese, csiso58gb231280, euc- | Simplified Chinese |
| | cn, euccn, eucgb2312-cn, | |
| | gb2312-1980, gb2312-80, iso- | |
| | ir-58 | |
+-----------------+--------------------------------+--------------------------------+
| gbk | 936, cp936, ms936 | Unified Chinese |
+-----------------+--------------------------------+--------------------------------+
| gb18030 | gb18030-2000 | Unified Chinese |
+-----------------+--------------------------------+--------------------------------+
| hz | hzgb, hz-gb, hz-gb-2312 | Simplified Chinese |
+-----------------+--------------------------------+--------------------------------+
| iso2022_jp | csiso2022jp, iso2022jp, | Japanese |
| | iso-2022-jp | |
+-----------------+--------------------------------+--------------------------------+
| iso2022_jp_1 | iso2022jp-1, iso-2022-jp-1 | Japanese |
+-----------------+--------------------------------+--------------------------------+
| iso2022_jp_2 | iso2022jp-2, iso-2022-jp-2 | Japanese, Korean, Simplified |
| | | Chinese, Western Europe, Greek |
+-----------------+--------------------------------+--------------------------------+
| iso2022_jp_2004 | iso2022jp-2004, | Japanese |
| | iso-2022-jp-2004 | |
+-----------------+--------------------------------+--------------------------------+
| iso2022_jp_3 | iso2022jp-3, iso-2022-jp-3 | Japanese |
+-----------------+--------------------------------+--------------------------------+
| iso2022_jp_ext | iso2022jp-ext, iso-2022-jp-ext | Japanese |
+-----------------+--------------------------------+--------------------------------+
| iso2022_kr | csiso2022kr, iso2022kr, | Korean |
| | iso-2022-kr | |
+-----------------+--------------------------------+--------------------------------+
| latin_1 | iso-8859-1, iso8859-1, 8859, | West Europe |
| | cp819, latin, latin1, L1 | |
+-----------------+--------------------------------+--------------------------------+
| iso8859_2 | iso-8859-2, latin2, L2 | Central and Eastern Europe |
+-----------------+--------------------------------+--------------------------------+
| iso8859_3 | iso-8859-3, latin3, L3 | Esperanto, Maltese |
+-----------------+--------------------------------+--------------------------------+
| iso8859_4 | iso-8859-4, latin4, L4 | Baltic languages |
+-----------------+--------------------------------+--------------------------------+
| iso8859_5 | iso-8859-5, cyrillic | Bulgarian, Byelorussian, |
| | | Macedonian, Russian, Serbian |
+-----------------+--------------------------------+--------------------------------+
| iso8859_6 | iso-8859-6, arabic | Arabic |
+-----------------+--------------------------------+--------------------------------+
| iso8859_7 | iso-8859-7, greek, greek8 | Greek |
+-----------------+--------------------------------+--------------------------------+
| iso8859_8 | iso-8859-8, hebrew | Hebrew |
+-----------------+--------------------------------+--------------------------------+
| iso8859_9 | iso-8859-9, latin5, L5 | Turkish |
+-----------------+--------------------------------+--------------------------------+
| iso8859_10 | iso-8859-10, latin6, L6 | Nordic languages |
+-----------------+--------------------------------+--------------------------------+
| iso8859_13 | iso-8859-13, latin7, L7 | Baltic languages |
+-----------------+--------------------------------+--------------------------------+
| iso8859_14 | iso-8859-14, latin8, L8 | Celtic languages |
+-----------------+--------------------------------+--------------------------------+
| iso8859_15 | iso-8859-15, latin9, L9 | Western Europe |
+-----------------+--------------------------------+--------------------------------+
| iso8859_16 | iso-8859-16, latin10, L10 | South-Eastern Europe |
+-----------------+--------------------------------+--------------------------------+
| johab | cp1361, ms1361 | Korean |
+-----------------+--------------------------------+--------------------------------+
| koi8_r | | Russian |
+-----------------+--------------------------------+--------------------------------+
| koi8_u | | Ukrainian |
+-----------------+--------------------------------+--------------------------------+
| mac_cyrillic | maccyrillic | Bulgarian, Byelorussian, |
| | | Macedonian, Russian, Serbian |
+-----------------+--------------------------------+--------------------------------+
| mac_greek | macgreek | Greek |
+-----------------+--------------------------------+--------------------------------+
| mac_iceland | maciceland | Icelandic |
+-----------------+--------------------------------+--------------------------------+
| mac_latin2 | maclatin2, maccentraleurope | Central and Eastern Europe |
+-----------------+--------------------------------+--------------------------------+
| mac_roman | macroman | Western Europe |
+-----------------+--------------------------------+--------------------------------+
| mac_turkish | macturkish | Turkish |
+-----------------+--------------------------------+--------------------------------+
| ptcp154 | csptcp154, pt154, cp154, | Kazakh |
| | cyrillic-asian | |
+-----------------+--------------------------------+--------------------------------+
| shift_jis | csshiftjis, shiftjis, sjis, | Japanese |
| | s_jis | |
+-----------------+--------------------------------+--------------------------------+
| shift_jis_2004 | shiftjis2004, sjis_2004, | Japanese |
| | sjis2004 | |
+-----------------+--------------------------------+--------------------------------+
| shift_jisx0213 | shiftjisx0213, sjisx0213, | Japanese |
| | s_jisx0213 | |
+-----------------+--------------------------------+--------------------------------+
| utf_32 | U32, utf32 | all languages |
+-----------------+--------------------------------+--------------------------------+
| utf_32_be | UTF-32BE | all languages |
+-----------------+--------------------------------+--------------------------------+
| utf_32_le | UTF-32LE | all languages |
+-----------------+--------------------------------+--------------------------------+
| utf_16 | U16, utf16 | all languages |
+-----------------+--------------------------------+--------------------------------+
| utf_16_be | UTF-16BE | all languages (BMP only) |
+-----------------+--------------------------------+--------------------------------+
| utf_16_le | UTF-16LE | all languages (BMP only) |
+-----------------+--------------------------------+--------------------------------+
| utf_7 | U7, unicode-1-1-utf-7 | all languages |
+-----------------+--------------------------------+--------------------------------+
| utf_8 | U8, UTF, utf8 | all languages |
+-----------------+--------------------------------+--------------------------------+
| utf_8_sig | | all languages |
+-----------------+--------------------------------+--------------------------------+
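The earlier remark that character sets differ in individual characters such as
the EURO SIGN, and in where they place them, can be verified for a few codecs
from the table. The helper function below is hypothetical and only part of this
sketch.

```python
def euro_byte(codec_name):
    """Return the encoded EURO SIGN, or None if the codec cannot encode it."""
    try:
        return u'\u20ac'.encode(codec_name)
    except UnicodeEncodeError:
        return None

codec_names = ('latin_1', 'iso8859_15', 'cp1252', 'cp850', 'cp858')
support = dict((name, euro_byte(name)) for name in codec_names)
# latin_1 and cp850 have no EURO SIGN; the other three place it at
# different byte values (0xa4, 0x80 and 0xd5 respectively).
```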
A number of codecs are specific to Python, so their codec names have no meaning
outside Python. Some of them don't convert from Unicode strings to byte strings,
but instead use the property of the Python codecs machinery that any bijective
function with one argument can be considered as an encoding.
For the codecs listed below, the result in the "encoding" direction is always a
byte string. The result of the "decoding" direction is listed as operand type in
the table.
+--------------------+---------------------------+----------------+---------------------------+
| Codec | Aliases | Operand type | Purpose |
+====================+===========================+================+===========================+
| base64_codec | base64, base-64 | byte string | Convert operand to MIME |
| | | | base64 |
+--------------------+---------------------------+----------------+---------------------------+
| bz2_codec | bz2 | byte string | Compress the operand |
| | | | using bz2 |
+--------------------+---------------------------+----------------+---------------------------+
| hex_codec | hex | byte string | Convert operand to |
| | | | hexadecimal |
| | | | representation, with two |
| | | | digits per byte |
+--------------------+---------------------------+----------------+---------------------------+
| idna | | Unicode string | Implements RFC 3490, |
| | | | see also |
| | | | encodings.idna (|py2stdlib-encodings.idna|) |
+--------------------+---------------------------+----------------+---------------------------+
| mbcs | dbcs | Unicode string | Windows only: Encode |
| | | | operand according to the |
| | | | ANSI codepage (CP_ACP) |
+--------------------+---------------------------+----------------+---------------------------+
| palmos | | Unicode string | Encoding of PalmOS 3.5 |
+--------------------+---------------------------+----------------+---------------------------+
| punycode | | Unicode string | Implements RFC 3492 |
+--------------------+---------------------------+----------------+---------------------------+
| quopri_codec | quopri, quoted-printable, | byte string | Convert operand to MIME |
| | quotedprintable | | quoted printable |
+--------------------+---------------------------+----------------+---------------------------+
| raw_unicode_escape | | Unicode string | Produce a string that is |
| | | | suitable as raw Unicode |
| | | | literal in Python source |
| | | | code |
+--------------------+---------------------------+----------------+---------------------------+
| rot_13 | rot13 | Unicode string | Returns the Caesar-cypher |
| | | | encryption of the operand |
+--------------------+---------------------------+----------------+---------------------------+
| string_escape | | byte string | Produce a string that is |
| | | | suitable as string |
| | | | literal in Python source |
| | | | code |
+--------------------+---------------------------+----------------+---------------------------+
| undefined | | any | Raise an exception for |
| | | | all conversions. Can be |
| | | | used as the system |
| | | | encoding if no automatic |
| | | | coercion between |
| | | | byte and Unicode strings |
| | | | is desired. |
+--------------------+---------------------------+----------------+---------------------------+
| unicode_escape | | Unicode string | Produce a string that is |
| | | | suitable as Unicode |
| | | | literal in Python source |
| | | | code |
+--------------------+---------------------------+----------------+---------------------------+
| unicode_internal | | Unicode string | Return the internal |
| | | | representation of the |
| | | | operand |
+--------------------+---------------------------+----------------+---------------------------+
| uu_codec | uu | byte string | Convert the operand using |
| | | | uuencode |
+--------------------+---------------------------+----------------+---------------------------+
| zlib_codec | zip, zlib | byte string | Compress the operand |
| | | | using gzip |
+--------------------+---------------------------+----------------+---------------------------+
.. versionadded:: 2.3
The ``idna`` and ``punycode`` encodings.
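As a rough illustration of the operand types in the table above, the sketch below round-trips a byte string through two byte-to-byte codecs and encodes a Unicode string with ``punycode`` and ``idna``. The sample strings are made up for this example, and the code is written so that it should run on Python 2.7 as well as Python 3.

```python
import codecs

# Byte-string codecs: encoding and decoding both work on byte strings.
hex_text = codecs.encode(b"hi", "hex_codec")        # two hex digits per byte
b64_text = codecs.encode(b"hi", "base64_codec")     # MIME base64
round_trip = codecs.decode(hex_text, "hex_codec")

# Unicode-string codecs: encoding yields a byte string,
# decoding yields a Unicode string again.
puny = u"b\u00fccher".encode("punycode")            # RFC 3492
idna = u"b\u00fccher.example".encode("idna")        # RFC 3490
back = idna.decode("idna")
```

Going through codecs.encode and codecs.decode rather than the string methods keeps the byte-to-byte codecs usable on interpreters where the str.encode shortcut is restricted to text encodings.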
encodings.idna (|py2stdlib-encodings.idna|) --- Internationalized Domain Names in Applications
------------------------------------------------------------------------
==============================================================================
*py2stdlib-codeop*
codeop~
:synopsis: Compile (possibly incomplete) Python code.
The codeop (|py2stdlib-codeop|) module provides utilities upon which the Python
read-eval-print loop can be emulated, as is done in the code (|py2stdlib-code|) module. As
a result, you probably don't want to use the module directly; if you want to
include such a loop in your program you probably want to use the code (|py2stdlib-code|)
module instead.
There are two parts to this job:
#. Being able to tell if a line of input completes a Python statement: in
short, telling whether to print '``>>>``' or '``...``' next.
#. Remembering which future statements the user has entered, so subsequent
input can be compiled with these in effect.
The codeop (|py2stdlib-codeop|) module provides a way of doing each of these things, and a way
of doing them both.
To do just the former:
compile_command(source[, filename[, symbol]])~
Tries to compile {source}, which should be a string of Python code, and returns a
code object if {source} is valid Python code. In that case, the filename
attribute of the code object will be {filename}, which defaults to
``'<input>'``. Returns ``None`` if {source} is {not} valid Python code, but is a
prefix of valid Python code.
If there is a problem with {source}, an exception will be raised.
SyntaxError is raised if there is invalid Python syntax, and
OverflowError or ValueError if there is an invalid literal.
The {symbol} argument determines whether {source} is compiled as a statement
(``'single'``, the default) or as an expression (``'eval'``). Any
other value will cause ValueError to be raised.
.. note:: >
It is possible (but not likely) that the parser stops parsing with a
successful outcome before reaching the end of the source; in this case,
trailing symbols may be ignored instead of causing an error. For example,
a backslash followed by two newlines may be followed by arbitrary garbage.
This will be fixed once the API for the parser is better.
<
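The three outcomes described above (a code object, ``None``, or an exception) can be seen in a minimal sketch. The source strings and variable names are only illustrative.

```python
import codeop

# A complete statement compiles to a code object.
complete = codeop.compile_command("x = 1 + 1")

# A valid prefix of a longer statement yields None: the caller
# would show the '...' prompt and keep reading input.
incomplete = codeop.compile_command("if True:")

# Invalid syntax raises SyntaxError instead of returning.
try:
    codeop.compile_command("x = = 1")
    syntax_error_raised = False
except SyntaxError:
    syntax_error_raised = True

# The returned code object can be executed like any other.
namespace = {}
exec(complete, namespace)
```

An interactive loop would typically accumulate lines in a buffer and call compile_command on the joined text until the result is no longer ``None``.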
Compile()~
Instances of this class have __call__ methods identical in signature to
the built-in function compile, but with the difference that if the
instance compiles program text containing a __future__ (|py2stdlib-__future__|) statement, the
instance 'remembers' and compiles all subsequent program texts with the
statement in force.
CommandCompiler()~
Instances of this class have __call__ methods identical in signature to
compile_command; the difference is that if the instance compiles program
text containing a ``__future__`` statement, the instance 'remembers' and
compiles all subsequent program texts with the statement in force.
A note on version compatibility: the Compile and CommandCompiler classes are
new in Python 2.2. If you want to enable the
future-tracking features of 2.2 but also retain compatibility with 2.1 and
earlier versions of Python you can either write :: >
try:
    from codeop import CommandCompiler
    compile_command = CommandCompiler()
    del CommandCompiler
except ImportError:
    from codeop import compile_command
<
which is a low-impact change, but introduces possibly unwanted global state into
your program, or you can write:: >
try:
    from codeop import CommandCompiler
except ImportError:
    def CommandCompiler():
        from codeop import compile_command
        return compile_command
<
and then call ``CommandCompiler`` every time you need a fresh compiler object.
==============================================================================
*py2stdlib-collections*
collections~
:synopsis: High-performance datatypes
.. versionadded:: 2.4
This module implements high-performance container datatypes. Currently,
there are four datatypes, Counter, deque, OrderedDict and
defaultdict, and one datatype factory function, namedtuple.
The specialized containers provided in this module provide alternatives
to Python's general purpose built-in containers, dict,
list, set, and tuple.
.. versionchanged:: 2.4
Added deque.
.. versionchanged:: 2.5
Added defaultdict.
.. versionchanged:: 2.6
Added namedtuple and added abstract base classes.
.. versionchanged:: 2.7
Added Counter and OrderedDict.
In addition to containers, the collections module provides some ABCs
(abstract base classes) that can be used to test whether a class
provides a particular interface, for example, whether it is hashable or
a mapping.
ABCs - abstract base classes
----------------------------
The collections module offers the following ABCs:
========================= ===================== ====================== ====================================================
ABC                       Inherits              Abstract Methods       Mixin Methods
========================= ===================== ====================== ====================================================
Container                                       ``__contains__``
Hashable                                        ``__hash__``
Iterable                                        ``__iter__``
Iterator                  Iterable              ``next``               ``__iter__``
Sized                                           ``__len__``
Callable                                        ``__call__``
Sequence                  Sized,                ``__getitem__``        ``__contains__``, ``__iter__``, ``__reversed__``,
                          Iterable,                                    ``index``, and ``count``
                          Container
MutableSequence           Sequence              ``__setitem__``,       Inherited Sequence methods and
                                                ``__delitem__``,       ``append``, ``reverse``, ``extend``, ``pop``,
                                                and ``insert``         ``remove``, and ``__iadd__``
Set                       Sized,                                       ``__le__``, ``__lt__``, ``__eq__``, ``__ne__``,
                          Iterable,                                    ``__gt__``, ``__ge__``, ``__and__``, ``__or__``,
                          Container                                    ``__sub__``, ``__xor__``, and ``isdisjoint``
MutableSet                Set                   ``add`` and            Inherited Set methods and
                                                ``discard``            ``clear``, ``pop``, ``remove``, ``__ior__``,
                                                                       ``__iand__``, ``__ixor__``, and ``__isub__``
Mapping                   Sized,                ``__getitem__``        ``__contains__``, ``keys``, ``items``, ``values``,
                          Iterable,                                    ``get``, ``__eq__``, and ``__ne__``
                          Container
MutableMapping            Mapping               ``__setitem__`` and    Inherited Mapping methods and
                                                ``__delitem__``        ``pop``, ``popitem``, ``clear``, ``update``,
                                                                       and ``setdefault``
MappingView               Sized                                        ``__len__``
KeysView                  MappingView,                                 ``__contains__``,
                          Set                                          ``__iter__``
ItemsView                 MappingView,                                 ``__contains__``,
                          Set                                          ``__iter__``
ValuesView                MappingView                                  ``__contains__``, ``__iter__``
========================= ===================== ====================== ====================================================
These ABCs allow us to ask classes or instances if they provide
particular functionality, for example:: >
size = None
if isinstance(myvar, collections.Sized):
    size = len(myvar)
<
Several of the ABCs are also useful as mixins that make it easier to develop
classes supporting container APIs. For example, to write a class supporting
the full Set API, it is only necessary to supply the three underlying
abstract methods: __contains__, __iter__, and __len__.
The ABC supplies the remaining methods such as __and__ and
isdisjoint :: >
class ListBasedSet(collections.Set):
    ''' Alternate set implementation favoring space over speed
        and not requiring the set elements to be hashable. '''
    def __init__(self, iterable):
        self.elements = lst = []
        for value in iterable:
            if value not in lst:
                lst.append(value)
    def __iter__(self):
        return iter(self.elements)
    def __contains__(self, value):
        return value in self.elements
    def __len__(self):
        return len(self.elements)
s1 = ListBasedSet('abcdef')
s2 = ListBasedSet('defghi')
overlap = s1 & s2            # The __and__() method is supported automatically
<
Notes on using Set and MutableSet as a mixin:
(1)
Since some set operations create new sets, the default mixin methods need
a way to create new instances from an iterable. The class constructor is
assumed to have a signature in the form ``ClassName(iterable)``.
That assumption is factored-out to an internal classmethod called
_from_iterable which calls ``cls(iterable)`` to produce a new set.
If the Set mixin is being used in a class with a different
constructor signature, you will need to override _from_iterable
with a classmethod that can construct new instances from
an iterable argument.
(2)
To override the comparisons (presumably for speed, as the
semantics are fixed), redefine __le__ and
then the other operations will automatically follow suit.
(3)
The Set mixin provides a _hash method to compute a hash value
for the set; however, __hash__ is not defined because not all sets
are hashable or immutable. To add set hashability using mixins,
inherit from both Set and Hashable, then define
``__hash__ = Set._hash``.
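Note (3) can be sketched as follows. The class name FrozenListSet is hypothetical, the elements are assumed to be hashable themselves, and the import fallback is only there so that the sketch should also run on Python 3, where the ABCs live in collections.abc.

```python
try:                                   # Python 3.3 and later
    from collections.abc import Set, Hashable
except ImportError:                    # Python 2.6 / 2.7
    from collections import Set, Hashable

class FrozenListSet(Set, Hashable):
    """Immutable list-backed set (hypothetical example class)."""
    def __init__(self, iterable):
        self._items = []
        for value in iterable:
            if value not in self._items:
                self._items.append(value)
    def __contains__(self, value):
        return value in self._items
    def __iter__(self):
        return iter(self._items)
    def __len__(self):
        return len(self._items)
    # Reuse the hash computation supplied by the Set mixin.
    __hash__ = Set._hash

a = FrozenListSet('abc')
b = FrozenListSet('cba')
lookup = {a: 'found'}                  # instances are usable as dict keys
common = a & FrozenListSet('bcd')      # built through _from_iterable, i.e. cls(iterable)
```

Because the constructor already has the ``ClassName(iterable)`` form assumed in note (1), no _from_iterable override is needed here.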
.. seealso::
* `OrderedSet recipe <http://code.activestate.com/recipes/576694/>`_ for an
example built on MutableSet.
* For more about ABCs, see the abc (|py2stdlib-abc|) module and PEP 3119.
Counter objects
------------------------
A counter tool is provided to support convenient and rapid tallies.
For example:: >
>>> # Tally occurrences of words in a list
>>> cnt = Counter()
>>> for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
...     cnt[word] += 1
>>> cnt
Counter({'blue': 3, 'red': 2, 'green': 1})
>>> # Find the ten most common words in Hamlet
>>> import re
>>> words = re.findall(r'\w+', open('hamlet.txt').read().lower())
>>> Counter(words).most_common(10)
[('the', 1143), ('and', 966), ('to', 762), ('of', 669), ('i', 631),
('you', 554), ('a', 546), ('my', 514), ('hamlet', 471), ('in', 451)]
<
Counter([iterable-or-mapping])~
A Counter is a dict subclass for counting hashable objects.
It is an unordered collection where elements are stored as dictionary keys
and their counts are stored as dictionary values. Counts are allowed to be
any integer value including zero or negative counts. The Counter
class is similar to bags or multisets in other languages.
Elements are counted from an {iterable} or initialized from another
{mapping} (or counter):
>>> c = Counter() # a new, empty counter
>>> c = Counter('gallahad') # a new counter from an iterable
>>> c = Counter({'red': 4, 'blue': 2}) # a new counter from a mapping
>>> c = Counter(cats=4, dogs=8) # a new counter from keyword args
Counter objects have a dictionary interface except that they return a zero
count for missing items instead of raising a KeyError:
>>> c = Counter(['eggs', 'ham'])
>>> c['bacon'] # count of a missing element is zero
0
Setting a count to zero does not remove an element from a counter.
Use ``del`` to remove it entirely:
>>> c['sausage'] = 0 # counter entry with a zero count
>>> del c['sausage'] # del actually removes the entry
.. versionadded:: 2.7
Counter objects support three methods beyond those available for all
dictionaries:
elements()~
Return an iterator over elements repeating each as many times as its
count. Elements are returned in arbitrary order. If an element's count
is less than one, elements() will ignore it.
>>> c = Counter(a=4, b=2, c=0, d=-2)
>>> list(c.elements())
['a', 'a', 'a', 'a', 'b', 'b']
most_common([n])~
Return a list of the {n} most common elements and their counts from the
most common to the least. If {n} is not specified, most_common
returns {all} elements in the counter. Elements with equal counts are
ordered arbitrarily:
>>> Counter('abracadabra').most_common(3)
[('a', 5), ('r', 2), ('b', 2)]
subtract([iterable-or-mapping])~
Elements are subtracted from an {iterable} or from another {mapping}
(or counter). Like dict.update but subtracts counts instead
of replacing them. Both inputs and outputs may be zero or negative.
>>> c = Counter(a=4, b=2, c=0, d=-2)
>>> d = Counter(a=1, b=2, c=3, d=4)
>>> c.subtract(d)
>>> c
Counter({'a': 3, 'b': 0, 'c': -3, 'd': -6})
The usual dictionary methods are available for Counter objects
except for two which work differently for counters.
fromkeys(iterable)~
This class method is not implemented for Counter objects.
update([iterable-or-mapping])~
Elements are counted from an {iterable} or added-in from another
{mapping} (or counter). Like dict.update but adds counts
instead of replacing them. Also, the {iterable} is expected to be a
sequence of elements, not a sequence of ``(key, value)`` pairs.
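The difference from dict.update can be seen in a short sketch; the variable names are illustrative only.

```python
from collections import Counter

counts = Counter("aab")          # a: 2, b: 1
counts.update("abc")             # an iterable of elements: each occurrence adds 1
counts.update({"a": 10})         # a mapping: its counts are added in

plain = dict(a=2, b=1)
plain.update({"a": 10})          # a regular dict replaces the old value
```

Passing a list of ``(key, value)`` pairs to Counter.update would count each pair as an element in its own right, which is why such a list is usually wrapped in dict() first.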
Common patterns for working with Counter objects:: >
sum(c.values()) # total of all counts
c.clear() # reset all counts
list(c) # list unique elements
set(c) # convert to a set
dict(c) # convert to a regular dictionary
c.items() # convert to a list of (elem, cnt) pairs
Counter(dict(list_of_pairs)) # convert from a list of (elem, cnt) pairs
c.most_common()[:-n-1:-1] # n least common elements
c += Counter() # remove zero and negative counts
<
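Two of these patterns are worth seeing in action. The sketch below uses made-up counts chosen to be distinct so that the ordering is unambiguous; the slice ``[:-n-1:-1]`` walks backwards from the end of the most_common list and yields exactly {n} pairs.

```python
from collections import Counter

c = Counter(a=4, b=3, c=2, d=1)

n = 2
least_common = c.most_common()[:-n-1:-1]   # the n least common, rarest first

c['e'] = 0                                 # entries with zero or negative counts
c['f'] = -5
c += Counter()                             # adding an empty counter drops them
remaining = sorted(c)
```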
Several mathematical operations are provided for combining Counter
objects to produce multisets (counters that have counts greater than zero).
Addition and subtraction combine counters by adding or subtracting the counts
of corresponding elements. Intersection and union return the minimum and
maximum of corresponding counts. Each operation can accept inputs with signed
counts, but the output will exclude results with counts of zero or less.
>>> c = Counter(a=3, b=1)
>>> d = Counter(a=1, b=2)
>>> c + d # add two counters together: c[x] + d[x]
Counter({'a': 4, 'b': 3})
>>> c - d # subtract (keeping only positive counts)
Counter({'a': 2})
>>> c & d # intersection: min(c[x], d[x])
Counter({'a': 1, 'b': 1})
>>> c | d # union: max(c[x], d[x])
Counter({'a': 3, 'b': 2})
.. note::
Counters were primarily designed to work with positive integers to represent
running counts; however, care was taken to not unnecessarily preclude use
cases needing other types or negative values. To help with those use cases,
this section documents the minimum range and type restrictions.
* The Counter class itself is a dictionary subclass with no
restrictions on its keys and values. The values are intended to be numbers
representing counts, but you {could} store anything in the value field.
* The most_common method requires only that the values be orderable.
* For in-place operations such as ``c[key] += 1``, the value type need only
support addition and subtraction. So fractions, floats, and decimals would
work and negative values are supported. The same is also true for
update and subtract which allow negative and zero values
for both inputs and outputs.
* The multiset methods are designed only for use cases with positive values.
The inputs may be negative or zero, but only outputs with positive values
are created. There are no type restrictions, but the value type needs to
support addition, subtraction, and comparison.
* The elements method requires integer counts. It ignores zero and
negative counts.
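A minimal sketch of these restrictions, using made-up keys and values: non-integer and negative values work with in-place arithmetic, update and most_common, while elements only makes sense for integer counts.

```python
from collections import Counter
from fractions import Fraction

weights = Counter()
weights['x'] += Fraction(1, 2)       # a missing key starts at 0, then adds 1/2
weights['x'] += Fraction(1, 4)
weights.update({'y': -2.5})          # negative and float values are accepted
ranked = weights.most_common()       # only requires orderable values

tally = Counter(a=2, b=0, c=-1)
expanded = sorted(tally.elements())  # zero and negative counts are skipped
```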
.. seealso::
* `Counter class <http://code.activestate.com/recipes/576611/>`_
adapted for Python 2.5 and an early `Bag recipe
<http://code.activestate.com/recipes/259174/>`_ for Python 2.4.
* `Bag class <http://www.gnu.org/software/smalltalk/manual-base/html_node/Bag.html>`_
in Smalltalk.
* Wikipedia entry for `Multisets <http://en.wikipedia.org/wiki/Multiset>`_.
* `C++ multisets <http://www.demo2s.com/Tutorial/Cpp/0380__set-multiset/Catalog0380__set-multiset.htm>`_
tutorial with examples.
* For mathematical operations on multisets and their use cases, see
*Knuth, Donald. The Art of Computer Programming Volume II,
Section 4.6.3, Exercise 19*.
* To enumerate all distinct multisets of a given size over a given set of
elements, see itertools.combinations_with_replacement.
map(Counter, combinations_with_replacement('ABC', 2)) --> AA AB AC BB BC CC
deque objects
----------------------
deque([iterable[, maxlen]])~
Returns a new deque object initialized left-to-right (using append) with
data from {iterable}. If {iterable} is not specified, the new deque is empty.
Deques are a generalization of stacks and queues (the name is pronounced "deck"
and is short for "double-ended queue"). Deques support thread-safe, memory
efficient appends and pops from either side of the deque with approximately the
same O(1) performance in either direction.
Though list objects support similar operations, they are optimized for
fast fixed-length operations and incur O(n) memory movement costs for
``pop(0)`` and ``insert(0, v)`` operations which change both the size and
position of the underlying data representation.
.. versionadded:: 2.4
If {maxlen} is not specified or is {None}, deques may grow to an
arbitrary length. Otherwise, the deque is bounded to the specified maximum
length. Once a bounded length deque is full, when new items are added, a
corresponding number of items are discarded from the opposite end. Bounded
length deques provide functionality similar to the ``tail`` filter in
Unix. They are also useful for tracking transactions and other pools of data
where only the most recent activity is of interest.
.. versionchanged:: 2.6
Added {maxlen} parameter.
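A small sketch of the bounded behaviour, with illustrative values: once the deque is full, each new item pushes one out at the opposite end.

```python
from collections import deque

recent = deque(maxlen=3)
for i in range(5):
    recent.append(i)             # 0 and 1 fall off the left end
after_appends = list(recent)

recent.appendleft(99)            # adding on the left discards from the right
after_appendleft = list(recent)
```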
Deque objects support the following methods:
append(x)~
Add {x} to the right side of the deque.
appendleft(x)~
Add {x} to the left side of the deque.
clear()~
Remove all elements from the deque leaving it with length 0.
count(x)~
Count the number of deque elements equal to {x}.
.. versionadded:: 2.7
extend(iterable)~
Extend the right side of the deque by appending elements from the iterable
argument.
extendleft(iterable)~
Extend the left side of the deque by appending elements from {iterable}.
Note, the series of left appends results in reversing the order of
elements in the iterable argument.
pop()~
Remove and return an element from the right side of the deque. If no
elements are present, raises an IndexError.
popleft()~
Remove and return an element from the left side of the deque. If no
elements are present, raises an IndexError.
remove(value)~
Remove the first occurrence of {value}. If not found, raises a
ValueError.
.. versionadded:: 2.5
reverse()~
Reverse the elements of the deque in-place and then return ``None``.
.. versionadded:: 2.7
rotate(n)~
Rotate the deque {n} steps to the right. If {n} is negative, rotate to
the left. Rotating one step to the right is equivalent to:
``d.appendleft(d.pop())``.
Deque objects also provide one read-only attribute:
maxlen~
Maximum size of a deque or {None} if unbounded.
.. versionadded:: 2.7
In addition to the above, deques support iteration, pickling, ``len(d)``,
``reversed(d)``, ``copy.copy(d)``, ``copy.deepcopy(d)``, membership testing with
the in operator, and subscript references such as ``d[-1]``. Indexed
access is O(1) at both ends but slows to O(n) in the middle. For fast random
access, use lists instead.
Example:
.. doctest::
>>> from collections import deque
>>> d = deque('ghi') # make a new deque with three items
>>> for elem in d: # iterate over the deque's elements
...     print elem.upper()
G
H
I
>>> d.append('j') # add a new entry to the right side
>>> d.appendleft('f') # add a new entry to the left side
>>> d # show the representation of the deque
deque(['f', 'g', 'h', 'i', 'j'])
>>> d.pop() # return and remove the rightmost item
'j'
>>> d.popleft() # return and remove the leftmost item
'f'
>>> list(d) # list the contents of the deque
['g', 'h', 'i']
>>> d[0] # peek at leftmost item
'g'
>>> d[-1] # peek at rightmost item
'i'
>>> list(reversed(d)) # list the contents of a deque in reverse
['i', 'h', 'g']
>>> 'h' in d # search the deque
True
>>> d.extend('jkl') # add multiple elements at once
>>> d
deque(['g', 'h', 'i', 'j', 'k', 'l'])
>>> d.rotate(1) # right rotation
>>> d
deque(['l', 'g', 'h', 'i', 'j', 'k'])
>>> d.rotate(-1) # left rotation
>>> d
deque(['g', 'h', 'i', 'j', 'k', 'l'])
>>> deque(reversed(d)) # make a new deque in reverse order
deque(['l', 'k', 'j', 'i', 'h', 'g'])
>>> d.clear() # empty the deque
>>> d.pop() # cannot pop from an empty deque
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in -toplevel-
    d.pop()
IndexError: pop from an empty deque
>>> d.extendleft('abc') # extendleft() reverses the input order
>>> d
deque(['c', 'b', 'a'])
deque Recipes
^^^^^^^^^^^^^^^^^^^^^^
This section shows various approaches to working with deques.
Bounded length deques provide functionality similar to the ``tail`` filter
in Unix:: >
def tail(filename, n=10):
    'Return the last n lines of a file'
    return deque(open(filename), n)
<
Another approach to using deques is to maintain a sequence of recently
added elements by appending to the right and popping to the left:: >
def moving_average(iterable, n=3):
    # moving_average([40, 30, 50, 46, 39, 44]) --> 40.0 42.0 45.0 43.0
    # http://en.wikipedia.org/wiki/Moving_average
    it = iter(iterable)
    d = deque(itertools.islice(it, n-1))
    d.appendleft(0)
    s = sum(d)
    for elem in it:
        s += elem - d.popleft()
        d.append(elem)
        yield s / float(n)
<
The rotate method provides a way to implement deque slicing and
deletion. For example, a pure Python implementation of ``del d[n]`` relies on
the rotate method to position elements to be popped:: >
def delete_nth(d, n):
    d.rotate(-n)
    d.popleft()
    d.rotate(n)
<
To implement deque slicing, use a similar approach applying
rotate to bring a target element to the left side of the deque. Remove
old entries with popleft, add new entries with extend, and then
reverse the rotation.
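One possible reading of that recipe is sketched below. The helper name replace_slice is hypothetical, and the sketch assumes ``0 <= start <= stop <= len(d)`` without checking it.

```python
from collections import deque

def replace_slice(d, start, stop, new_items=()):
    """Replace d[start:stop] in place with new_items (hypothetical helper)."""
    new_items = list(new_items)
    d.rotate(-start)                    # bring d[start] to the left side
    for _ in range(stop - start):       # remove the old entries
        d.popleft()
    d.extend(new_items)                 # new entries land on the right side
    d.rotate(start + len(new_items))    # undo the rotation, carrying them along

d = deque('abcdef')
replace_slice(d, 1, 3, 'XYZ')           # like lst[1:3] = 'XYZ' on a list

e = deque('abcdef')
replace_slice(e, 2, 4)                  # like del lst[2:4]
```

The final rotation is larger than the first by ``len(new_items)`` because the new entries sit at the right end and have to travel back together with the original prefix.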
With minor variations on that approach, it is easy to implement Forth style
stack manipulations such as ``dup``, ``drop``, ``swap``, ``over``, ``pick``,
``rot``, and ``roll``.
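As a sketch of a few of those operations, assume the top of the stack is the right side of the deque. The function names follow the Forth words, but these implementations are only one way to write them and do no bounds checking.

```python
from collections import deque

def dup(stack):
    stack.append(stack[-1])     # ( a -- a a )

def drop(stack):
    stack.pop()                 # ( a -- )

def roll(stack, n):
    """Move the item n places below the top to the top."""
    stack.rotate(n)             # the wanted item is now rightmost
    item = stack.pop()
    stack.rotate(-n)            # put the n items above it back in place
    stack.append(item)

def swap(stack):
    roll(stack, 1)              # ( a b -- b a )

def rot(stack):
    roll(stack, 2)              # ( a b c -- b c a )

s = deque([1, 2, 3])
rot(s)
after_rot = list(s)
swap(s)
after_swap = list(s)
dup(s)
after_dup = list(s)
```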
defaultdict objects
----------------------------
defaultdict([default_factory[, ...]])~
Returns a new dictionary-like object. defaultdict is a subclass of the
built-in dict class. It overrides one method and adds one writable
instance variable. The remaining functionality is the same as for the
dict class and is not documented here.
The first argument provides the initial value for the default_factory
attribute; it defaults to ``None``. All remaining arguments are treated the same
as if they were passed to the dict constructor, including keyword
arguments.
.. versionadded:: 2.5
defaultdict objects support the following method in addition to the
standard dict operations:
defaultdict.__missing__(key)~
If the default_factory attribute is ``None``, this raises a
KeyError exception with the {key} as argument.
If default_factory is not ``None``, it is called without arguments
to provide a default value for the given {key}, this value is inserted in
the dictionary for the {key}, and returned.
If calling default_factory raises an exception this exception is
propagated unchanged.
This method is called by the __getitem__ method of the
dict class when the requested key is not found; whatever it
returns or raises is then returned or raised by __getitem__.
defaultdict objects support the following instance variable:
defaultdict.default_factory~
This attribute is used by the __missing__ method; it is
initialized from the first argument to the constructor, if present, or to
``None``, if absent.
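The interaction between __getitem__, __missing__ and default_factory can be sketched as follows; the keys and variable names are illustrative.

```python
from collections import defaultdict

groups = defaultdict(list)
groups["a"].append(1)             # lookup misses, so __missing__ inserts []

# get() and the in operator do not go through __getitem__,
# so they never create an entry.
probe = groups.get("b")
has_b_after_get = "b" in groups

empty = groups["b"]               # subscript access inserts the default
has_b_after_lookup = "b" in groups

groups.default_factory = None     # writable attribute; None disables defaults
try:
    groups["c"]
    raised = False
except KeyError:
    raised = True
```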
defaultdict Examples
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Using list as the default_factory, it is easy to group a
sequence of key-value pairs into a dictionary of lists:
>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> d = defaultdict(list)
>>> for k, v in s:
...     d[k].append(v)
...
>>> d.items()
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
When each key is encountered for the first time, it is not already in the
mapping; so an entry is automatically created using the default_factory
function which returns an empty list. The list.append
operation then attaches the value to the new list. When keys are encountered
again, the look-up proceeds normally (returning the list for that key) and the
list.append operation adds another value to the list. This technique is
simpler and faster than an equivalent technique using dict.setdefault:
>>> d = {}
>>> for k, v in s:
...     d.setdefault(k, []).append(v)
...
>>> d.items()
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
Setting the default_factory to int makes the
defaultdict useful for counting (like a bag or multiset in other
languages):
>>> s = 'mississippi'
>>> d = defaultdict(int)
>>> for k in s:
...     d[k] += 1
...
>>> d.items()
[('i', 4), ('p', 2), ('s', 4), ('m', 1)]
When a letter is first encountered, it is missing from the mapping, so the
default_factory function calls int to supply a default count of
zero. The increment operation then builds up the count for each letter.
The function int which always returns zero is just a special case of
constant functions. A faster and more flexible way to create constant functions
is to use itertools.repeat which can supply any constant value (not just
zero):
>>> def constant_factory(value):
...     return itertools.repeat(value).next
>>> d = defaultdict(constant_factory('<missing>'))
>>> d.update(name='John', action='ran')
>>> '%(name)s %(action)s to %(object)s' % d
'John ran to <missing>'
Setting the default_factory to set makes the
defaultdict useful for building a dictionary of sets:
>>> s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)]
>>> d = defaultdict(set)
>>> for k, v in s:
...     d[k].add(v)
...
>>> d.items()
[('blue', set([2, 4])), ('red', set([1, 3]))]
namedtuple Factory Function for Tuples with Named Fields
----------------------------------------------------------------
Named tuples assign meaning to each position in a tuple and allow for more readable,
self-documenting code. They can be used wherever regular tuples are used, and
they add the ability to access fields by name instead of position index.
namedtuple(typename, field_names, [verbose], [rename])~
Returns a new tuple subclass named {typename}. The new subclass is used to
create tuple-like objects that have fields accessible by attribute lookup as
well as being indexable and iterable. Instances of the subclass also have a
helpful docstring (with typename and field_names) and a helpful __repr__
method which lists the tuple contents in a ``name=value`` format.
The {field_names} are a single string with each fieldname separated by whitespace
and/or commas, for example ``'x y'`` or ``'x, y'``. Alternatively, {field_names}
can be a sequence of strings such as ``['x', 'y']``.
Any valid Python identifier may be used for a fieldname except for names
starting with an underscore. Valid identifiers consist of letters, digits,
and underscores but do not start with a digit or underscore and cannot be
a keyword (|py2stdlib-keyword|) such as {class}, {for}, {return}, {global}, {pass}, {print},
or {raise}.
If {rename} is true, invalid fieldnames are automatically replaced
with positional names. For example, ``['abc', 'def', 'ghi', 'abc']`` is
converted to ``['abc', '_1', 'ghi', '_3']``, eliminating the keyword
``def`` and the duplicate fieldname ``abc``.
If {verbose} is true, the class definition is printed just before being built.
Named tuple instances do not have per-instance dictionaries, so they are
lightweight and require no more memory than regular tuples.
.. versionadded:: 2.6
.. versionchanged:: 2.7
added support for {rename}.
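The {rename} behaviour described above can be checked directly. The type name Row is made up for this sketch.

```python
from collections import namedtuple

# 'def' is a keyword and the second 'abc' is a duplicate, so both
# are replaced by positional names when rename is true.
Row = namedtuple('Row', ['abc', 'def', 'ghi', 'abc'], rename=True)
fields = Row._fields

r = Row(1, 2, 3, 4)
second = r._1                     # renamed fields remain accessible by name
last = r[3]                       # and by position, like any tuple
```

Without ``rename=True`` the same call raises ValueError, which is usually the better default when the field names are under the program's own control.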
Example:
.. doctest::
:options: +NORMALIZE_WHITESPACE
>>> Point = namedtuple('Point', 'x y', verbose=True)
class Point(tuple):
    'Point(x, y)'
<BLANKLINE>
    __slots__ = ()
<BLANKLINE>
    _fields = ('x', 'y')
<BLANKLINE>
    def __new__(_cls, x, y):
        'Create a new instance of Point(x, y)'
        return _tuple.__new__(_cls, (x, y))
<BLANKLINE>
    @classmethod
    def _make(cls, iterable, new=tuple.__new__, len=len):
        'Make a new Point object from a sequence or iterable'
        result = new(cls, iterable)
        if len(result) != 2:
            raise TypeError('Expected 2 arguments, got %d' % len(result))
        return result
<BLANKLINE>
    def __repr__(self):
        'Return a nicely formatted representation string'
        return 'Point(x=%r, y=%r)' % self
<BLANKLINE>
    def _asdict(self):
        'Return a new OrderedDict which maps field names to their values'
        return OrderedDict(zip(self._fields, self))
<BLANKLINE>
    def _replace(_self, **kwds):
        'Return a new Point object replacing specified fields with new values'
        result = _self._make(map(kwds.pop, ('x', 'y'), _self))
        if kwds:
            raise ValueError('Got unexpected field names: %r' % kwds.keys())
        return result
<BLANKLINE>
    def __getnewargs__(self):
        'Return self as a plain tuple. Used by copy and pickle.'
        return tuple(self)
<BLANKLINE>
    x = _property(_itemgetter(0), doc='Alias for field number 0')
    y = _property(_itemgetter(1), doc='Alias for field number 1')
>>> p = Point(11, y=22) # instantiate with positional or keyword arguments
>>> p[0] + p[1] # indexable like the plain tuple (11, 22)
33
>>> x, y = p # unpack like a regular tuple
>>> x, y
(11, 22)
>>> p.x + p.y # fields also accessible by name
33
>>> p # readable __repr__ with a name=value style
Point(x=11, y=22)
Named tuples are especially useful for assigning field names to result tuples returned
by the csv (|py2stdlib-csv|) or sqlite3 (|py2stdlib-sqlite3|) modules:: >
EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade')
import csv
for emp in map(EmployeeRecord._make, csv.reader(open("employees.csv", "rb"))):
print emp.name, emp.title
import sqlite3
conn = sqlite3.connect('/companydata')
cursor = conn.cursor()
cursor.execute('SELECT name, age, title, department, paygrade FROM employees')
for emp in map(EmployeeRecord._make, cursor.fetchall()):
print emp.name, emp.title
<
In addition to the methods inherited from tuples, named tuples support
three additional methods and one attribute. To prevent conflicts with
field names, the method and attribute names start with an underscore.
somenamedtuple._make(iterable)~
Class method that makes a new instance from an existing sequence or iterable.
.. doctest:: >
>>> t = [11, 22]
>>> Point._make(t)
Point(x=11, y=22)
<
somenamedtuple._asdict()~
Return a new OrderedDict which maps field names to their corresponding
values:: >
>>> p._asdict()
OrderedDict([('x', 11), ('y', 22)])
<
.. versionchanged:: 2.7
Returns an OrderedDict instead of a regular dict.
somenamedtuple._replace(kwargs)~
Return a new instance of the named tuple replacing specified fields with new
values:: >
>>> p = Point(x=11, y=22)
>>> p._replace(x=33)
Point(x=33, y=22)
>>> for partnum, record in inventory.items():
... inventory[partnum] = record._replace(price=newprices[partnum], timestamp=time.time())
<
somenamedtuple._fields~
Tuple of strings listing the field names. Useful for introspection
and for creating new named tuple types from existing named tuples.
.. doctest:: >
>>> p._fields # view the field names
('x', 'y')
>>> Color = namedtuple('Color', 'red green blue')
>>> Pixel = namedtuple('Pixel', Point._fields + Color._fields)
>>> Pixel(11, 22, 128, 255, 0)
Pixel(x=11, y=22, red=128, green=255, blue=0)
<
To retrieve a field whose name is stored in a string, use the getattr
function:
>>> getattr(p, 'x')
11
To convert a dictionary to a named tuple, use the double-star-operator
(as described in tut-unpacking-arguments):
>>> d = {'x': 11, 'y': 22}
>>> Point(**d)
Point(x=11, y=22)
Since a named tuple is a regular Python class, it is easy to add or change
functionality with a subclass. Here is how to add a calculated field and
a fixed-width print format:
>>> class Point(namedtuple('Point', 'x y')):
... __slots__ = ()
... @property
... def hypot(self):
... return (self.x ** 2 + self.y ** 2) ** 0.5
... def __str__(self):
... return 'Point: x=%6.3f y=%6.3f hypot=%6.3f' % (self.x, self.y, self.hypot)
>>> for p in Point(3, 4), Point(14, 5/7.):
... print p
Point: x= 3.000 y= 4.000 hypot= 5.000
Point: x=14.000 y= 0.714 hypot=14.018
The subclass shown above sets ``__slots__`` to an empty tuple. This helps
keep memory requirements low by preventing the creation of instance dictionaries.
Subclassing is not useful for adding new, stored fields. Instead, simply
create a new named tuple type from the _fields attribute:
>>> Point3D = namedtuple('Point3D', Point._fields + ('z',))
Default values can be implemented by using _replace to
customize a prototype instance:
>>> Account = namedtuple('Account', 'owner balance transaction_count')
>>> default_account = Account('<owner name>', 0.0, 0)
>>> johns_account = default_account._replace(owner='John')
Enumerated constants can be implemented with named tuples, but it is simpler
and more efficient to use a simple class declaration:
>>> Status = namedtuple('Status', 'open pending closed')._make(range(3))
>>> Status.open, Status.pending, Status.closed
(0, 1, 2)
>>> class Status:
... open, pending, closed = range(3)
.. seealso::
`Named tuple recipe <http://code.activestate.com/recipes/500261/>`_
adapted for Python 2.4.
OrderedDict objects
-------------------
Ordered dictionaries are just like regular dictionaries, but they remember the
order in which items were inserted. When iterating over an ordered dictionary,
the items are returned in the order their keys were first added.
OrderedDict([items])~
Return an instance of a dict subclass, supporting the usual dict
methods. An {OrderedDict} is a dict that remembers the order in which keys
were first inserted. If a new entry overwrites an existing entry, the
original insertion position is left unchanged. Deleting an entry and
reinserting it will move it to the end.
.. versionadded:: 2.7
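A minimal sketch of the two ordering rules just described: overwriting keeps the original position, while deleting and reinserting moves the key to the end. The keys and values are arbitrary:

```python
from collections import OrderedDict

od = OrderedDict()
od['a'] = 1
od['b'] = 2
od['c'] = 3

# Overwriting an existing key leaves its position unchanged.
od['a'] = 10
after_overwrite = list(od.keys())

# Deleting the key and inserting it again moves it to the end.
del od['a']
od['a'] = 1
after_reinsert = list(od.keys())
```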
OrderedDict.popitem(last=True)~
The popitem method for ordered dictionaries returns and removes
a (key, value) pair. The pairs are returned in LIFO order if {last} is
true or FIFO order if false.
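As a small illustration of the {last} argument, the following sketch pops from both ends of the same dictionary. The keys are arbitrary:

```python
from collections import OrderedDict

od = OrderedDict([('first', 1), ('second', 2), ('third', 3)])

newest = od.popitem()             # last=True: most recently added pair
oldest = od.popitem(last=False)   # FIFO: earliest added pair
remaining = list(od.items())
```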
In addition to the usual mapping methods, ordered dictionaries also support
reverse iteration using reversed.
Equality tests between OrderedDict objects are order-sensitive
and are implemented as ``list(od1.items())==list(od2.items())``.
Equality tests between OrderedDict objects and other
Mapping objects are order-insensitive like regular dictionaries.
This allows OrderedDict objects to be substituted anywhere a
regular dictionary is used.
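The reverse iteration and the two equality rules can be seen side by side in this sketch, which uses arbitrary keys:

```python
from collections import OrderedDict

od1 = OrderedDict([('a', 1), ('b', 2)])
od2 = OrderedDict([('b', 2), ('a', 1)])

# Reverse iteration walks the keys from newest to oldest.
reversed_keys = list(reversed(od1))

# Two OrderedDicts with the same items in a different order are unequal.
equal_to_ordered = (od1 == od2)

# Against a regular dict, order is ignored.
equal_to_plain = (od1 == {'b': 2, 'a': 1})
```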
The OrderedDict constructor and update method both accept
keyword arguments, but their order is lost because Python's function call
semantics pass in keyword arguments using a regular unordered dictionary.
.. seealso::
`Equivalent OrderedDict recipe <http://code.activestate.com/recipes/576693/>`_
that runs on Python 2.4 or later.
Since an ordered dictionary remembers its insertion order, it can be used
in conjunction with sorting to make a sorted dictionary:: >
>>> # regular unsorted dictionary
>>> d = {'banana': 3, 'apple':4, 'pear': 1, 'orange': 2}
>>> # dictionary sorted by key
>>> OrderedDict(sorted(d.items(), key=lambda t: t[0]))
OrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])
>>> # dictionary sorted by value
>>> OrderedDict(sorted(d.items(), key=lambda t: t[1]))
OrderedDict([('pear', 1), ('orange', 2), ('banana', 3), ('apple', 4)])
>>> # dictionary sorted by length of the key string
>>> OrderedDict(sorted(d.items(), key=lambda t: len(t[0])))
OrderedDict([('pear', 1), ('apple', 4), ('orange', 2), ('banana', 3)])
<
The new sorted dictionaries maintain their sort order when entries
are deleted. But when new keys are added, the keys are appended
to the end and the sort is not maintained.
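The following sketch shows that behaviour and one possible way to restore the order, namely rebuilding the dictionary from its sorted items. The extra key ``cherry`` is made up for the illustration:

```python
from collections import OrderedDict

d = {'banana': 3, 'apple': 4, 'pear': 1, 'orange': 2}
by_key = OrderedDict(sorted(d.items(), key=lambda t: t[0]))

# Deleting an entry keeps the remaining keys in sorted order.
del by_key['banana']
keys_after_delete = list(by_key)

# A new key is appended at the end, so the sort order is lost.
by_key['cherry'] = 5
keys_after_add = list(by_key)

# Rebuilding from the sorted items restores the order.
resorted = OrderedDict(sorted(by_key.items(), key=lambda t: t[0]))
keys_resorted = list(resorted)
```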
==============================================================================
*py2stdlib-colorpicker*
ColorPicker~
:platform: Mac
:synopsis: Interface to the standard color selection dialog.
:deprecated:
The ColorPicker (|py2stdlib-colorpicker|) module provides access to the standard color picker
dialog.
.. note::
This module has been removed in Python 3.x.
GetColor(prompt, rgb)~
Show a standard color selection dialog and allow the user to select a color.
The user is given instruction by the {prompt} string, and the default color is
set to {rgb}. {rgb} must be a tuple giving the red, green, and blue components
of the color. GetColor returns a tuple giving the user's selected color
and a flag indicating whether they accepted the selection or cancelled.
==============================================================================
*py2stdlib-colorsys*
colorsys~
:synopsis: Conversion functions between RGB and other color systems.
The colorsys (|py2stdlib-colorsys|) module defines bidirectional conversions of color values
between colors expressed in the RGB (Red Green Blue) color space used in
computer monitors and three other coordinate systems: YIQ, HLS (Hue Lightness
Saturation) and HSV (Hue Saturation Value). Coordinates in all of these color
spaces are floating point values. In the YIQ space, the Y coordinate is between
0 and 1, but the I and Q coordinates can be positive or negative. In all other
spaces, the coordinates are all between 0 and 1.
.. seealso::
More information about color spaces can be found at
http://www.poynton.com/ColorFAQ.html and
http://www.cambridgeincolour.com/tutorials/color-spaces.htm.
The colorsys (|py2stdlib-colorsys|) module defines the following functions:
rgb_to_yiq(r, g, b)~
Convert the color from RGB coordinates to YIQ coordinates.
yiq_to_rgb(y, i, q)~
Convert the color from YIQ coordinates to RGB coordinates.
rgb_to_hls(r, g, b)~
Convert the color from RGB coordinates to HLS coordinates.
hls_to_rgb(h, l, s)~
Convert the color from HLS coordinates to RGB coordinates.
rgb_to_hsv(r, g, b)~
Convert the color from RGB coordinates to HSV coordinates.
hsv_to_rgb(h, s, v)~
Convert the color from HSV coordinates to RGB coordinates.
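A short sketch for the HLS and YIQ conversions, using pure red and pure blue as inputs. The exact YIQ coefficients may differ between Python versions, so the sketch only relies on the sign and range of the results:

```python
import colorsys

# Pure red round-trips through HLS: hue 0, half lightness, full saturation.
hue, lightness, saturation = colorsys.rgb_to_hls(1.0, 0.0, 0.0)
red, green, blue = colorsys.hls_to_rgb(hue, lightness, saturation)

# For pure blue, Y stays within 0..1 while the I coordinate is negative.
y, i, q = colorsys.rgb_to_yiq(0.0, 0.0, 1.0)
```

Because the results are floating point values, compare them with a tolerance rather than with ``==``.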
Example:: >
>>> import colorsys
>>> colorsys.rgb_to_hsv(.3, .4, .2)
(0.25, 0.5, 0.4)
>>> colorsys.hsv_to_rgb(0.25, 0.5, 0.4)
(0.3, 0.4, 0.2)
==============================================================================
*py2stdlib-commands*
commands~
:platform: Unix
:synopsis: Utility functions for running external commands.
:deprecated:
.. deprecated:: 2.6
The commands (|py2stdlib-commands|) module has been removed in Python 3.0. Use the
subprocess (|py2stdlib-subprocess|) module instead.
The commands (|py2stdlib-commands|) module contains wrapper functions for os.popen which
take a system command as a string and return any output generated by the command
and, optionally, the exit status.
The subprocess (|py2stdlib-subprocess|) module provides more powerful facilities for spawning new
processes and retrieving their results. Using the subprocess (|py2stdlib-subprocess|) module is
preferable to using the commands (|py2stdlib-commands|) module.
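As a rough sketch of what a subprocess-based replacement might look like, the helper below runs a command through the shell and merges stderr into stdout. The name ``run_and_capture`` is hypothetical, and unlike the commands module it returns the plain exit code rather than a wait-style status:

```python
import subprocess

def run_and_capture(cmd):
    """Run cmd in a shell; return (exit_code, combined_output)."""
    proc = subprocess.Popen(cmd, shell=True,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    output, _ = proc.communicate()
    # Strip one trailing newline from the captured output.
    if output.endswith('\n'):
        output = output[:-1]
    return proc.returncode, output

status, output = run_and_capture('echo hello')
failed_status, failed_output = run_and_capture('exit 3')
```

Passing ``shell=True`` keeps the string-command style of the commands module. For untrusted input, an argument list without a shell is the safer choice.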
.. note::
In Python 3.x, getstatus and two undocumented functions
(mk2arg and mkarg) have been removed. Also,
getstatusoutput and getoutput have been moved to the
subprocess (|py2stdlib-subprocess|) module.
The commands (|py2stdlib-commands|) module defines the following functions:
getstatusoutput(cmd)~
Execute the string {cmd} in a shell with os.popen and return a 2-tuple
``(status, output)``. {cmd} is actually run as ``{ cmd ; } 2>&1``, so that the
returned output will contain output or error messages. A trailing newline is
stripped from the output. The exit status for the command can be interpreted
according to the rules for the C function wait.
getoutput(cmd)~
Like getstatusoutput, except the exit status is ignored and the return
value is a string containing the command's output.
getstatus(file)~
Return the output of ``ls -ld file`` as a string. This function uses the
getoutput function, and properly escapes backslashes and dollar signs in
the argument.
.. deprecated:: 2.6
This function is nonobvious and useless. The name is also misleading in the
presence of getstatusoutput.
Example:: >
>>> import commands
>>> commands.getstatusoutput('ls /bin/ls')
(0, '/bin/ls')
>>> commands.getstatusoutput('cat /bin/junk')
(256, 'cat: /bin/junk: No such file or directory')
>>> commands.getstatusoutput('/bin/junk')
(256, 'sh: /bin/junk: not found')
>>> commands.getoutput('ls /bin/ls')
'/bin/ls'
>>> commands.getstatus('/bin/ls')
'-rwxr-xr-x 1 root 13352 Oct 14 1994 /bin/ls'
<
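The status ``256`` in the example above is a wait-style status, not an exit code. A minimal sketch of decoding it with the os module follows. It is Unix only, and the helper name ``describe_wait_status`` is made up for the illustration:

```python
import os

def describe_wait_status(status):
    """Return ('exit', code) or ('signal', number) for a wait status."""
    if os.WIFEXITED(status):
        return ('exit', os.WEXITSTATUS(status))
    if os.WIFSIGNALED(status):
        return ('signal', os.WTERMSIG(status))
    return ('other', status)

failed = describe_wait_status(256)   # the failing commands shown above
succeeded = describe_wait_status(0)
```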
.. seealso::
Module subprocess (|py2stdlib-subprocess|)
Module for spawning and managing subprocesses.
==============================================================================
*py2stdlib-compileall*
compileall~
:synopsis: Tools for byte-compiling all Python source files in a directory tree.
This module provides some utility functions to support installing Python
libraries. These functions compile Python source files in a directory tree,
allowing users without permission to write to the libraries to take advantage of
cached byte-code files.
This module may also be used as a script (using the -m Python flag) to
compile Python sources. Directories to recursively traverse (passing
-l stops the recursive behavior) for sources are listed on the command
line. If no arguments are given, the invocation is equivalent to ``-l
sys.path``. Printing lists of the files compiled can be disabled with the
-q flag. In addition, the -x option takes a regular
expression argument. All files that match the expression will be skipped.
compile_dir(dir[, maxlevels[, ddir[, force[, rx[, quiet]]]]])~
Recursively descend the directory tree named by {dir}, compiling all .py
files along the way. The {maxlevels} parameter is used to limit the depth of
the recursion; it defaults to ``10``. If {ddir} is given, it is used as the
base path from which the filenames used in error messages will be generated.
If {force} is true, modules are re-compiled even if the timestamps are up to
date.
If {rx} is given, it specifies a regular expression of file names to exclude
from the search; that expression is searched for in the full path.
If {quiet} is true, nothing is printed to the standard output in normal
operation.
compile_path([skip_curdir[, maxlevels[, force]]])~
Byte-compile all the .py files found along ``sys.path``. If
{skip_curdir} is true (the default), the current directory is not included in
the search. The {maxlevels} and {force} parameters default to ``0`` and are
passed to the compile_dir function.
To force a recompile of all the .py files in the Lib/
subdirectory and all its subdirectories:: >
import compileall
compileall.compile_dir('Lib/', force=True)
# Perform same compilation, excluding files in .svn directories.
import re
compileall.compile_dir('Lib/', rx=re.compile('/[.]svn'), force=True)
<
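The {rx} and {quiet} parameters can be tried out safely on a scratch directory. In the sketch below, the directory layout and file names are invented for the illustration. The file under ``skipme`` is deliberately invalid, so the run only succeeds if {rx} really excludes it:

```python
import compileall
import os
import re
import shutil
import tempfile

# Scratch tree: one valid module, one broken module in a skipped directory.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, 'skipme'))
with open(os.path.join(root, 'keep.py'), 'w') as handle:
    handle.write('x = 1\n')
with open(os.path.join(root, 'skipme', 'broken.py'), 'w') as handle:
    handle.write('def broken(:\n')

# rx is searched for in the full path, so everything under skipme/ is
# excluded; quiet suppresses the list of compiled files.
ok = compileall.compile_dir(root, maxlevels=10, force=True,
                            rx=re.compile(r'skipme'), quiet=True)

shutil.rmtree(root)
```

The return value is true when every file that was not skipped compiled successfully.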
.. seealso::
Module py_compile (|py2stdlib-py_compile|)
Byte-compile a single source file.
==============================================================================
*py2stdlib-compiler*
compiler~
:synopsis: Python code compiler written in Python.
:deprecated:
The top level of the package defines five functions. If you import
compiler (|py2stdlib-compiler|), you will get these functions and a collection of modules
contained in the package.
parse(buf)~
Returns an abstract syntax tree for the Python source code in {buf}. The
function raises SyntaxError if there is an error in the source code. The
return value is a compiler.ast.Module instance that contains the tree.
parseFile(path)~
Return an abstract syntax tree for the Python source code in the file specified
by {path}. It is equivalent to ``parse(open(path).read())``.
walk(ast, visitor[, verbose])~
Do a pre-order walk over the abstract syntax tree {ast}. Call the appropriate
method on the {visitor} instance for each node encountered.
compile(source, filename, mode, flags=None, dont_inherit=None)~
Compile the string {source}, a Python module, statement or expression, into a
code object that can be executed by the exec statement or eval. This
function is a replacement for the built-in compile function.
The {filename} will be used for run-time error messages.
The {mode} must be 'exec' to compile a module, 'single' to compile a single
(interactive) statement, or 'eval' to compile an expression.
The {flags} and {dont_inherit} arguments affect future-related statements, but
are not supported yet.
compileFile(source)~
Compiles the file {source} and generates a .pyc file.
The compiler (|py2stdlib-compiler|) package contains the following modules: ast (|py2stdlib-ast|),
consts, future, misc, pyassem, pycodegen,
symbols, transformer, and visitor.
Limitations
===========
There are some problems with the error checking of the compiler package. The
interpreter detects syntax errors in two distinct phases. One set of errors is
detected by the interpreter's parser, the other set by the compiler. The
compiler package relies on the interpreter's parser, so it gets the first phase
of error checking for free. It implements the second phase itself, and that
implementation is incomplete. For example, the compiler package does not raise
an error if a name appears more than once in an argument list: ``def f(x, x):
...``
A future version of the compiler should fix these problems.
Python Abstract Syntax
======================
The compiler.ast (|py2stdlib-compiler.ast|) module defines an abstract syntax for Python. In the
abstract syntax tree, each node represents a syntactic construct. The root of
the tree is a Module object.
The abstract syntax offers a higher level interface to parsed Python source
code. The parser (|py2stdlib-parser|) module and the compiler written in C for the Python
interpreter use a concrete syntax tree. The concrete syntax is tied closely to
the grammar description used for the Python parser. Instead of a single node
for a construct, there are often several levels of nested nodes that are
introduced by Python's precedence rules.
The abstract syntax tree is created by the compiler.transformer module.
The transformer relies on the built-in Python parser to generate a concrete
syntax tree. It generates an abstract syntax tree from the concrete tree.
.. index::
single: Stein, Greg
single: Tutt, Bill
The transformer module was created by Greg Stein and Bill Tutt for an
experimental Python-to-C compiler. The current version contains a number of
modifications and improvements, but the basic form of the abstract syntax and of
the transformer are due to Stein and Tutt.
AST Nodes
---------
==============================================================================
*py2stdlib-compiler.ast*
compiler.ast~
The compiler.ast (|py2stdlib-compiler.ast|) module is generated from a text file that describes each
node type and its elements. Each node type is represented as a class that
inherits from the abstract base class compiler.ast.Node and defines a
set of named attributes for child nodes.
Node()~
The Node instances are created automatically by the parser generator.
The recommended interface for specific Node instances is to use the
public attributes to access child nodes. A public attribute may be bound to a
single node or to a sequence of nodes, depending on the Node type. For
example, the bases attribute of the Class node is bound to a
list of base class nodes, and the doc attribute is bound to a single
node.
Each Node instance has a lineno attribute which may be
``None``. XXX Not sure what the rules are for which nodes will have a useful
lineno.
All Node objects offer the following methods:
getChildren()~
Returns a flattened list of the child nodes and objects in the order they
occur. Specifically, the order of the nodes is the order in which they
appear in the Python grammar. Not all of the children are Node
instances. The names of functions and classes, for example, are plain
strings.
getChildNodes()~
Returns a flattened list of the child nodes in the order they occur. This
method is like getChildren, except that it only returns those
children that are Node instances.
Two examples illustrate the general structure of Node classes. The
while statement is defined by the following grammar production:: >
while_stmt: "while" expression ":" suite
["else" ":" suite]
<
The While node has three attributes: test, body, and
else_. (If the natural name for an attribute is also a Python reserved
word, it can't be used as an attribute name. An underscore is appended to the
word to make it a legal identifier, hence else_ instead of
else.)
The if statement is more complicated because it can include several
tests. :: >
if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
<
The If node only defines two attributes: tests and
else_. The tests attribute is a sequence of (test expression,
consequent body) pairs. There is one pair for each if/elif
clause. The first element of the pair is the test expression. The second
element is a Stmt node that contains the code to execute if the test
is true.
The getChildren method of If returns a flat list of child
nodes. If there are three if/elif clauses and no
else clause, then getChildren will return a list of six
elements: the first test expression, the first Stmt, the second test
expression, etc.
The following table lists each of the Node subclasses defined in
compiler.ast (|py2stdlib-compiler.ast|) and each of the public attributes available on their
instances. The values of most of the attributes are themselves Node
instances or sequences of instances. When the value is something other than an
instance, the type is noted in the comment. The attributes are listed in the
order in which they are returned by getChildren and
getChildNodes.
+-----------------------+--------------------+---------------------------------+
| Node type | Attribute | Value |
+=======================+====================+=================================+
| Add | left | left operand |
+-----------------------+--------------------+---------------------------------+
| | right | right operand |
+-----------------------+--------------------+---------------------------------+
| And | nodes | list of operands |
+-----------------------+--------------------+---------------------------------+
| AssAttr | | *attribute as target of |
| | | assignment* |
+-----------------------+--------------------+---------------------------------+
| | expr | expression on the left-hand |
| | | side of the dot |
+-----------------------+--------------------+---------------------------------+
| | attrname | the attribute name, a string |
+-----------------------+--------------------+---------------------------------+
| | flags | XXX |
+-----------------------+--------------------+---------------------------------+
| AssList | nodes | list of list elements being |
| | | assigned to |
+-----------------------+--------------------+---------------------------------+
| AssName | name | name being assigned to |
+-----------------------+--------------------+---------------------------------+
| | flags | XXX |
+-----------------------+--------------------+---------------------------------+
| AssTuple | nodes | list of tuple elements being |
| | | assigned to |
+-----------------------+--------------------+---------------------------------+
| Assert | test (|py2stdlib-test|) | the expression to be tested |
+-----------------------+--------------------+---------------------------------+
| | fail | the value of the |
| | | AssertionError |
+-----------------------+--------------------+---------------------------------+
| Assign | nodes | a list of assignment targets, |
| | | one per equal sign |
+-----------------------+--------------------+---------------------------------+
| | expr | the value being assigned |
+-----------------------+--------------------+---------------------------------+
| AugAssign | node | |
+-----------------------+--------------------+---------------------------------+
| | op | |
+-----------------------+--------------------+---------------------------------+
| | expr | |
+-----------------------+--------------------+---------------------------------+
| Backquote | expr | |
+-----------------------+--------------------+---------------------------------+
| Bitand | nodes | |
+-----------------------+--------------------+---------------------------------+
| Bitor | nodes | |
+-----------------------+--------------------+---------------------------------+
| Bitxor | nodes | |
+-----------------------+--------------------+---------------------------------+
| Break | | |
+-----------------------+--------------------+---------------------------------+
| CallFunc | node | expression for the callee |
+-----------------------+--------------------+---------------------------------+
| | args | a list of arguments |
+-----------------------+--------------------+---------------------------------+
| | star_args | the extended \*-arg value |
+-----------------------+--------------------+---------------------------------+
| | dstar_args | the extended \*\*-arg value |
+-----------------------+--------------------+---------------------------------+
| Class | name | the name of the class, a string |
+-----------------------+--------------------+---------------------------------+
| | bases | a list of base classes |
+-----------------------+--------------------+---------------------------------+
| | doc | doc string, a string or |
| | | ``None`` |
+-----------------------+--------------------+---------------------------------+
| | code (|py2stdlib-code|) | the body of the class statement |
+-----------------------+--------------------+---------------------------------+
| Compare | expr | |
+-----------------------+--------------------+---------------------------------+
| | ops | |
+-----------------------+--------------------+---------------------------------+
| Const | value | |
+-----------------------+--------------------+---------------------------------+
| Continue | | |
+-----------------------+--------------------+---------------------------------+
| Decorators | nodes | List of function decorator |
| | | expressions |
+-----------------------+--------------------+---------------------------------+
| Dict | items | |
+-----------------------+--------------------+---------------------------------+
| Discard | expr | |
+-----------------------+--------------------+---------------------------------+
| Div | left | |
+-----------------------+--------------------+---------------------------------+
| | right | |
+-----------------------+--------------------+---------------------------------+
| Ellipsis | | |
+-----------------------+--------------------+---------------------------------+
| Expression | node | |
+-----------------------+--------------------+---------------------------------+
| Exec | expr | |
+-----------------------+--------------------+---------------------------------+
| | locals | |
+-----------------------+--------------------+---------------------------------+
| | globals | |
+-----------------------+--------------------+---------------------------------+
| FloorDiv | left | |
+-----------------------+--------------------+---------------------------------+
| | right | |
+-----------------------+--------------------+---------------------------------+
| For | assign | |
+-----------------------+--------------------+---------------------------------+
| | list | |
+-----------------------+--------------------+---------------------------------+
| | body | |
+-----------------------+--------------------+---------------------------------+
| | else_ | |
+-----------------------+--------------------+---------------------------------+
| From | modname | |
+-----------------------+--------------------+---------------------------------+
| | names | |
+-----------------------+--------------------+---------------------------------+
| Function | decorators | Decorators or ``None`` |
+-----------------------+--------------------+---------------------------------+
| | name | name used in def, a string |
+-----------------------+--------------------+---------------------------------+
| | argnames | list of argument names, as |
| | | strings |
+-----------------------+--------------------+---------------------------------+
| | defaults | list of default values |
+-----------------------+--------------------+---------------------------------+
| | flags | xxx |
+-----------------------+--------------------+---------------------------------+
| | doc | doc string, a string or |
| | | ``None`` |
+-----------------------+--------------------+---------------------------------+
| | code (|py2stdlib-code|) | the body of the function |
+-----------------------+--------------------+---------------------------------+
| GenExpr | code (|py2stdlib-code|) | |
+-----------------------+--------------------+---------------------------------+
| GenExprFor | assign | |
+-----------------------+--------------------+---------------------------------+
| | iter | |
+-----------------------+--------------------+---------------------------------+
| | ifs | |
+-----------------------+--------------------+---------------------------------+
| GenExprIf | test (|py2stdlib-test|) | |
+-----------------------+--------------------+---------------------------------+
| GenExprInner | expr | |
+-----------------------+--------------------+---------------------------------+
| | quals | |
+-----------------------+--------------------+---------------------------------+
| Getattr | expr | |
+-----------------------+--------------------+---------------------------------+
| | attrname | |
+-----------------------+--------------------+---------------------------------+
| Global | names | |
+-----------------------+--------------------+---------------------------------+
| If | tests | |
+-----------------------+--------------------+---------------------------------+
| | else_ | |
+-----------------------+--------------------+---------------------------------+
| Import | names | |
+-----------------------+--------------------+---------------------------------+
| Invert | expr | |
+-----------------------+--------------------+---------------------------------+
| Keyword | name | |
+-----------------------+--------------------+---------------------------------+
| | expr | |
+-----------------------+--------------------+---------------------------------+
| Lambda | argnames | |
+-----------------------+--------------------+---------------------------------+
| | defaults | |
+-----------------------+--------------------+---------------------------------+
| | flags | |
+-----------------------+--------------------+---------------------------------+
| | code (|py2stdlib-code|) | |
+-----------------------+--------------------+---------------------------------+
| LeftShift | left | |
+-----------------------+--------------------+---------------------------------+
| | right | |
+-----------------------+--------------------+---------------------------------+
| List | nodes | |
+-----------------------+--------------------+---------------------------------+
| ListComp | expr | |
+-----------------------+--------------------+---------------------------------+
| | quals | |
+-----------------------+--------------------+---------------------------------+
| ListCompFor | assign | |
+-----------------------+--------------------+---------------------------------+
| | list | |
+-----------------------+--------------------+---------------------------------+
| | ifs | |
+-----------------------+--------------------+---------------------------------+
| ListCompIf | test (|py2stdlib-test|) | |
+-----------------------+--------------------+---------------------------------+
| Mod | left | |
+-----------------------+--------------------+---------------------------------+
| | right | |
+-----------------------+--------------------+---------------------------------+
| Module | doc | doc string, a string or |
| | | ``None`` |
+-----------------------+--------------------+---------------------------------+
| | node | body of the module, a |
| | | Stmt |
+-----------------------+--------------------+---------------------------------+
| Mul | left | |
+-----------------------+--------------------+---------------------------------+
| | right | |
+-----------------------+--------------------+---------------------------------+
| Name | name | |
+-----------------------+--------------------+---------------------------------+
| Not | expr | |
+-----------------------+--------------------+---------------------------------+
| Or | nodes | |
+-----------------------+--------------------+---------------------------------+
| Pass | | |
+-----------------------+--------------------+---------------------------------+
| Power | left | |
+-----------------------+--------------------+---------------------------------+
| | right | |
+-----------------------+--------------------+---------------------------------+
| Print | nodes | |
+-----------------------+--------------------+---------------------------------+
| | dest | |
+-----------------------+--------------------+---------------------------------+
| Printnl | nodes | |
+-----------------------+--------------------+---------------------------------+
| | dest | |
+-----------------------+--------------------+---------------------------------+
| Raise | expr1 | |
+-----------------------+--------------------+---------------------------------+
| | expr2 | |
+-----------------------+--------------------+---------------------------------+
| | expr3 | |
+-----------------------+--------------------+---------------------------------+
| Return | value | |
+-----------------------+--------------------+---------------------------------+
| RightShift | left | |
+-----------------------+--------------------+---------------------------------+
| | right | |
+-----------------------+--------------------+---------------------------------+
| Slice | expr | |
+-----------------------+--------------------+---------------------------------+
| | flags | |
+-----------------------+--------------------+---------------------------------+
| | lower | |
+-----------------------+--------------------+---------------------------------+
| | upper | |
+-----------------------+--------------------+---------------------------------+
| Sliceobj | nodes | list of statements |
+-----------------------+--------------------+---------------------------------+
| Stmt | nodes | |
+-----------------------+--------------------+---------------------------------+
| Sub | left | |
+-----------------------+--------------------+---------------------------------+
| | right | |
+-----------------------+--------------------+---------------------------------+
| Subscript | expr | |
+-----------------------+--------------------+---------------------------------+
| | flags | |
+-----------------------+--------------------+---------------------------------+
| | subs | |
+-----------------------+--------------------+---------------------------------+
| TryExcept | body | |
+-----------------------+--------------------+---------------------------------+
| | handlers | |
+-----------------------+--------------------+---------------------------------+
| | else_ | |
+-----------------------+--------------------+---------------------------------+
| TryFinally | body | |
+-----------------------+--------------------+---------------------------------+
| | final | |
+-----------------------+--------------------+---------------------------------+
| Tuple | nodes | |
+-----------------------+--------------------+---------------------------------+
| UnaryAdd | expr | |
+-----------------------+--------------------+---------------------------------+
| UnarySub | expr | |
+-----------------------+--------------------+---------------------------------+
| While | test | |
+-----------------------+--------------------+---------------------------------+
| | body | |
+-----------------------+--------------------+---------------------------------+
| | else_ | |
+-----------------------+--------------------+---------------------------------+
| With | expr | |
+-----------------------+--------------------+---------------------------------+
| | vars | |
+-----------------------+--------------------+---------------------------------+
| | body | |
+-----------------------+--------------------+---------------------------------+
| Yield | value | |
+-----------------------+--------------------+---------------------------------+
Assignment nodes
----------------
There is a collection of nodes used to represent assignments. Each assignment
statement in the source code becomes a single Assign node in the AST.
The nodes attribute is a list that contains a node for each assignment
target. This is necessary because assignment can be chained, e.g. ``a = b =
2``. Each Node in the list will be one of the following classes:
AssAttr, AssList, AssName, or AssTuple.
Each target assignment node will describe the kind of object being assigned to:
AssName for a simple name, e.g. ``a = 1``. AssAttr for an
attribute assigned, e.g. ``a.x = 1``. AssList and AssTuple for
list and tuple expansion respectively, e.g. ``a, b, c = a_tuple``.
The target assignment nodes also have a flags attribute that indicates
whether the node is being used for assignment or in a delete statement. The
AssName is also used to represent a delete statement, e.g. ``del x``.
When an expression contains several attribute references, an assignment or
delete statement will contain only one AssAttr node -- for the final
attribute reference. The other attribute references will be represented as
Getattr nodes in the expr attribute of the AssAttr
instance.
Examples
--------
This section shows several simple examples of ASTs for Python source code. The
examples demonstrate how to use the parse function, what the repr of an
AST looks like, and how to access attributes of an AST node.
The first module defines a single function. Assume it is stored in
/tmp/doublelib.py. :: >
"""This is an example module.

This is the docstring.
"""
def double(x):
    "Return twice the argument"
    return x * 2
<
In the interactive interpreter session below, I have reformatted the long AST
reprs for readability. The AST reprs use unqualified class names. If you want
to create an instance from a repr, you must import the class names from the
compiler.ast (|py2stdlib-compiler.ast|) module. :: >
>>> import compiler
>>> mod = compiler.parseFile("/tmp/doublelib.py")
>>> mod
Module('This is an example module.\n\nThis is the docstring.\n',
Stmt([Function(None, 'double', ['x'], [], 0,
'Return twice the argument',
Stmt([Return(Mul((Name('x'), Const(2))))]))]))
>>> from compiler.ast import *
>>> Module('This is an example module.\n\nThis is the docstring.\n',
... Stmt([Function(None, 'double', ['x'], [], 0,
... 'Return twice the argument',
... Stmt([Return(Mul((Name('x'), Const(2))))]))]))
Module('This is an example module.\n\nThis is the docstring.\n',
Stmt([Function(None, 'double', ['x'], [], 0,
'Return twice the argument',
Stmt([Return(Mul((Name('x'), Const(2))))]))]))
>>> mod.doc
'This is an example module.\n\nThis is the docstring.\n'
>>> for node in mod.node.nodes:
...     print node
...
Function(None, 'double', ['x'], [], 0, 'Return twice the argument',
Stmt([Return(Mul((Name('x'), Const(2))))]))
>>> func = mod.node.nodes[0]
>>> func.code
Stmt([Return(Mul((Name('x'), Const(2))))])
<
Using Visitors to Walk ASTs
==============================================================================
*py2stdlib-compiler.visitor*
compiler.visitor~
The visitor pattern is ... The compiler (|py2stdlib-compiler|) package uses a variant on the
visitor pattern that takes advantage of Python's introspection features to
eliminate the need for much of the visitor's infrastructure.
The classes being visited do not need to be programmed to accept visitors. The
visitor need only define visit methods for classes it is specifically interested
in; a default visit method can handle the rest.
XXX The magic visit method for visitors.
walk(tree, visitor[, verbose])~
ASTVisitor()~
The ASTVisitor is responsible for walking over the tree in the correct
order. A walk begins with a call to preorder. For each node, it checks
the {visitor} argument to preorder for a method named 'visitNodeType,'
where NodeType is the name of the node's class, e.g. for a While node a
visitWhile would be called. If the method exists, it is called with the
node as its first argument.
The visitor method for a particular node type can control how child nodes are
visited during the walk. The ASTVisitor modifies the visitor argument
by adding a visit method to the visitor; this method can be used to visit a
particular child node. If no visitor is found for a particular node type, the
default method is called.
ASTVisitor objects have the following methods:
XXX describe extra arguments
default(node[, ...])~
dispatch(node[, ...])~
preorder(tree, visitor)~
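The name-based dispatch described above can be imitated in a few lines of plain
Python. This is only an illustrative sketch and not the compiler package's
implementation: the node classes While and Pass, their children attribute, and
the MiniWalker and Recorder names are hypothetical stand-ins.

```python
class While(object):
    """Stand-in node type (hypothetical, not compiler.ast.While)."""
    def __init__(self, *children):
        self.children = list(children)


class Pass(object):
    """Stand-in leaf node type (hypothetical)."""
    children = ()


class MiniWalker(object):
    """Looks up 'visit' + class name on the visitor, else falls back to default()."""

    def default(self, node):
        # Fallback for node types the visitor has no method for:
        # simply descend into the children.
        for child in node.children:
            self.dispatch(child)

    def dispatch(self, node):
        name = "visit" + node.__class__.__name__
        method = getattr(self.visitor, name, self.default)
        return method(node)

    def preorder(self, tree, visitor):
        self.visitor = visitor
        # Give the visitor a 'visit' hook so it can walk child nodes itself.
        visitor.visit = self.dispatch
        self.dispatch(tree)


class Recorder(object):
    """Visitor that is only interested in While nodes."""

    def __init__(self):
        self.seen = []

    def visitWhile(self, node):
        self.seen.append("While")
        for child in node.children:
            self.visit(child)  # hook added by MiniWalker.preorder()


recorder = Recorder()
MiniWalker().preorder(While(Pass(), While(Pass())), recorder)
print(recorder.seen)
```

Recorder defines no visitPass method, so Pass nodes are handled by the
fallback, which mirrors the "default visit method can handle the rest" rule.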
Bytecode Generation
===================
The code generator is a visitor that emits bytecodes. Each visit method can
call the emit method to emit a new bytecode. The basic code generator
is specialized for modules, classes, and functions. An assembler converts the
emitted instructions to the low-level bytecode format. It handles things like
generation of constant lists of code objects and calculation of jump offsets.
==============================================================================
*py2stdlib-configparser*
ConfigParser~
:synopsis: Configuration file parser.
.. note::
The ConfigParser (|py2stdlib-configparser|) module has been renamed to configparser in
Python 3.0. The 2to3 tool will automatically adapt imports when
converting your sources to 3.0.
This module defines the class ConfigParser (|py2stdlib-configparser|). The ConfigParser (|py2stdlib-configparser|)
class implements a basic configuration file parser language which provides a
structure similar to what you would find in Microsoft Windows INI files. You
can use this to write Python programs which end users can customize easily.
.. note::
This library does {not} interpret or write the value-type prefixes used in
the Windows Registry extended version of INI syntax.
The configuration file consists of sections, led by a ``[section]`` header and
followed by ``name: value`` entries, with continuations in the style of
RFC 822 (see section 3.1.1, "LONG HEADER FIELDS"); ``name=value`` is also
accepted. Note that leading whitespace is removed from values. The optional
values can contain format strings which refer to other values in the same
section, or values in a special ``DEFAULT`` section. Additional defaults can be
provided on initialization and retrieval. Lines beginning with ``'#'`` or
``';'`` are ignored and may be used to provide comments.
For example:: >
[My Section]
foodir: %(dir)s/whatever
dir=frob
long: this value continues
   in the next line
<
would resolve the ``%(dir)s`` to the value of ``dir`` (``frob`` in this case).
All reference expansions are done on demand.
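As a rough illustration, the sample file above can be loaded and queried as
follows. This is a sketch under assumptions: the import fallback to the
Python 3 module name and the temporary file are additions made only so the
snippet is self-contained.

```python
import os
import tempfile

try:
    import ConfigParser as configparser  # Python 2 name used in this document
except ImportError:
    import configparser                  # assumed fallback: the Python 3 name

SAMPLE = (
    "[My Section]\n"
    "foodir: %(dir)s/whatever\n"
    "dir=frob\n"
    "long: this value continues\n"
    "   in the next line\n"
)

# Write the sample to a temporary file so that read() can load it.
fd, path = tempfile.mkstemp(suffix=".cfg")
with os.fdopen(fd, "w") as handle:
    handle.write(SAMPLE)

config = configparser.ConfigParser()
try:
    config.read(path)
finally:
    os.remove(path)

# The reference is expanded on demand, when the value is requested.
foodir = config.get("My Section", "foodir")
# raw=True returns the stored text without expansion.
raw_foodir = config.get("My Section", "foodir", raw=True)
# The indented line is treated as a continuation of 'long'.
long_value = config.get("My Section", "long")

print(foodir)
print(raw_foodir)
print(long_value)
```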
Default values can be specified by passing them into the ConfigParser (|py2stdlib-configparser|)
constructor as a dictionary. Additional defaults may be passed into the
get method which will override all others.
Sections are normally stored in a built-in dictionary. An alternative dictionary
type can be passed to the ConfigParser (|py2stdlib-configparser|) constructor. For example, if a
dictionary type is passed that sorts its keys, the sections will be sorted on
write-back, as will be the keys within each section.
RawConfigParser([defaults[, dict_type[, allow_no_value]]])~
The basic configuration object. When {defaults} is given, it is initialized
into the dictionary of intrinsic defaults. When {dict_type} is given, it will
be used to create the dictionary objects for the list of sections, for the
options within a section, and for the default values. When {allow_no_value}
is true (default: ``False``), options without values are accepted; the value
presented for these is ``None``.
This class does not
support the magical interpolation behavior.
.. versionadded:: 2.3
.. versionchanged:: 2.6
{dict_type} was added.
.. versionchanged:: 2.7
The default {dict_type} is collections.OrderedDict.
{allow_no_value} was added.
ConfigParser([defaults[, dict_type[, allow_no_value]]])~
Derived class of RawConfigParser that implements the magical
interpolation feature and adds optional arguments to the get and
items methods. The values in {defaults} must be appropriate for the
``%()s`` string interpolation. Note that {__name__} is an intrinsic default;
its value is the section name, and will override any value provided in
{defaults}.
All option names used in interpolation will be passed through the
optionxform method just like any other option name reference. For
example, using the default implementation of optionxform (which converts
option names to lower case), the values ``foo %(bar)s`` and ``foo %(BAR)s`` are
equivalent.
.. versionadded:: 2.3
.. versionchanged:: 2.6
{dict_type} was added.
.. versionchanged:: 2.7
The default {dict_type} is collections.OrderedDict.
{allow_no_value} was added.
SafeConfigParser([defaults[, dict_type[, allow_no_value]]])~
Derived class of ConfigParser (|py2stdlib-configparser|) that implements a more-sane variant of
the magical interpolation feature. This implementation is more predictable as
well. New applications should prefer this version if they don't need to be
compatible with older versions of Python.
.. XXX Need to explain what's safer/more predictable about it.
.. versionadded:: 2.3
.. versionchanged:: 2.6
{dict_type} was added.
.. versionchanged:: 2.7
The default {dict_type} is collections.OrderedDict.
{allow_no_value} was added.
NoSectionError~
Exception raised when a specified section is not found.
DuplicateSectionError~
Exception raised if add_section is called with the name of a section
that is already present.
NoOptionError~
Exception raised when a specified option is not found in the specified section.
InterpolationError~
Base class for exceptions raised when problems occur performing string
interpolation.
InterpolationDepthError~
Exception raised when string interpolation cannot be completed because the
number of iterations exceeds MAX_INTERPOLATION_DEPTH. Subclass of
InterpolationError.
InterpolationMissingOptionError~
Exception raised when an option referenced from a value does not exist. Subclass
of InterpolationError.
.. versionadded:: 2.3
InterpolationSyntaxError~
Exception raised when the source text into which substitutions are made does not
conform to the required syntax. Subclass of InterpolationError.
.. versionadded:: 2.3
MissingSectionHeaderError~
Exception raised when attempting to parse a file which has no section headers.
ParsingError~
Exception raised when errors occur attempting to parse a file.
MAX_INTERPOLATION_DEPTH~
The maximum depth for recursive interpolation for get when the {raw}
parameter is false. This is relevant only for the ConfigParser (|py2stdlib-configparser|) class.
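A short sketch of how some of these exceptions surface in practice. The section
and option names are made up for illustration, and the import fallback to the
Python 3 module name is an assumption added so the snippet runs on either
version.

```python
try:
    import ConfigParser as configparser  # Python 2 name used in this document
except ImportError:
    import configparser                  # assumed fallback: the Python 3 name

config = configparser.ConfigParser()
config.add_section("paths")
# The value refers to an option that is not defined anywhere.
config.set("paths", "home", "%(missing)s/home")

caught = []

try:
    config.get("nowhere", "home")
except configparser.NoSectionError:
    caught.append("NoSectionError")

try:
    config.get("paths", "absent")
except configparser.NoOptionError:
    caught.append("NoOptionError")

try:
    config.get("paths", "home")
except configparser.InterpolationError as exc:
    # Catching the base class also catches its subclasses.
    caught.append(type(exc).__name__)

try:
    config.add_section("paths")
except configparser.DuplicateSectionError:
    caught.append("DuplicateSectionError")

print(caught)
```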
.. seealso::
Module shlex (|py2stdlib-shlex|)
Support for creating Unix shell-like mini-languages which can be used as an
alternate format for application configuration files.
RawConfigParser Objects
-----------------------
RawConfigParser instances have the following methods:
RawConfigParser.defaults()~
Return a dictionary containing the instance-wide defaults.
RawConfigParser.sections()~
Return a list of the sections available; ``DEFAULT`` is not included in the
list.
RawConfigParser.add_section(section)~
Add a section named {section} to the instance. If a section by the given name
already exists, DuplicateSectionError is raised. If the name
``DEFAULT`` (or any of its case-insensitive variants) is passed,
ValueError is raised.
RawConfigParser.has_section(section)~
Indicates whether the named section is present in the configuration. The
``DEFAULT`` section is not acknowledged.
RawConfigParser.options(section)~
Returns a list of options available in the specified {section}.
RawConfigParser.has_option(section, option)~
If the given section exists, and contains the given option, return
True; otherwise return False.
.. versionadded:: 1.6
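The query methods above might be exercised like this. It is a minimal sketch:
the section and option names are invented, and the import fallback to the
Python 3 module name is an assumption.

```python
try:
    import ConfigParser as configparser  # Python 2 name used in this document
except ImportError:
    import configparser                  # assumed fallback: the Python 3 name

# One instance-wide default, passed to the constructor.
config = configparser.RawConfigParser({"colour": "blue"})
config.add_section("server")
config.set("server", "port", "8080")

section_names = config.sections()            # DEFAULT is not listed
has_default_section = config.has_section("DEFAULT")
has_port = config.has_option("server", "port")
has_host = config.has_option("server", "host")
instance_defaults = dict(config.defaults())

try:
    config.add_section("DEFAULT")
    default_rejected = False
except ValueError:
    default_rejected = True

print(section_names)
print(has_default_section, has_port, has_host, default_rejected)
```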
RawConfigParser.read(filenames)~
Attempt to read and parse a list of filenames, returning a list of filenames
which were successfully parsed. If {filenames} is a string or Unicode string,
it is treated as a single filename. If a file named in {filenames} cannot be
opened, that file will be ignored. This is designed so that you can specify a
list of potential configuration file locations (for example, the current
directory, the user's home directory, and some system-wide directory), and all
existing configuration files in the list will be read. If none of the named
files exist, the ConfigParser (|py2stdlib-configparser|) instance will contain an empty dataset.
An application which requires initial values to be loaded from a file should
load the required file or files using readfp before calling read
for any optional files:: >
import ConfigParser, os
config = ConfigParser.ConfigParser()
config.readfp(open('defaults.cfg'))
config.read(['site.cfg', os.path.expanduser('~/.myapp.cfg')])
<
.. versionchanged:: 2.4
Returns list of successfully parsed filenames.
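The return value can be used to find out which candidate files were actually
loaded. A small sketch, assuming a temporary directory and invented file names;
the import fallback to the Python 3 module name is also an assumption.

```python
import os
import tempfile

try:
    import ConfigParser as configparser  # Python 2 name used in this document
except ImportError:
    import configparser                  # assumed fallback: the Python 3 name

directory = tempfile.mkdtemp()
present = os.path.join(directory, "site.cfg")
absent = os.path.join(directory, "missing.cfg")   # never created

with open(present, "w") as handle:
    handle.write("[main]\nname = demo\n")

config = configparser.RawConfigParser()
try:
    # The file that cannot be opened is silently ignored.
    parsed = config.read([absent, present])
finally:
    os.remove(present)
    os.rmdir(directory)

name = config.get("main", "name")
print(parsed == [present])
print(name)
```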
RawConfigParser.readfp(fp[, filename])~
Read and parse configuration data from the file or file-like object in {fp}
(only the readline (|py2stdlib-readline|) method is used). If {filename} is omitted and {fp}
has a name attribute, that is used for {filename}; the default is
``<???>``.
RawConfigParser.get(section, option)~
Get an {option} value for the named {section}.
RawConfigParser.getint(section, option)~
A convenience method which coerces the {option} in the specified {section} to an
integer.
RawConfigParser.getfloat(section, option)~
A convenience method which coerces the {option} in the specified {section} to a
floating point number.
RawConfigParser.getboolean(section, option)~
A convenience method which coerces the {option} in the specified {section} to a
Boolean value. Note that the accepted values for the option are ``"1"``,
``"yes"``, ``"true"``, and ``"on"``, which cause this method to return ``True``,
and ``"0"``, ``"no"``, ``"false"``, and ``"off"``, which cause it to return
``False``. These string values are checked in a case-insensitive manner. Any
other value will cause it to raise ValueError.
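The coercion helpers can be tried out as below. This is a sketch with invented
section and option names; the import fallback to the Python 3 module name is an
assumption.

```python
try:
    import ConfigParser as configparser  # Python 2 name used in this document
except ImportError:
    import configparser                  # assumed fallback: the Python 3 name

config = configparser.RawConfigParser()
config.add_section("limits")
config.set("limits", "retries", "3")
config.set("limits", "timeout", "2.5")
config.set("limits", "verbose", "Yes")   # Boolean strings are case-insensitive
config.set("limits", "debug", "off")
config.set("limits", "mode", "maybe")    # not a recognised Boolean string

retries = config.getint("limits", "retries")
timeout = config.getfloat("limits", "timeout")
verbose = config.getboolean("limits", "verbose")
debug = config.getboolean("limits", "debug")

try:
    config.getboolean("limits", "mode")
    mode_error = False
except ValueError:
    mode_error = True

print(retries, timeout, verbose, debug, mode_error)
```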
RawConfigParser.items(section)~
Return a list of ``(name, value)`` pairs for each option in the given {section}.
RawConfigParser.set(section, option, value)~
If the given section exists, set the given option to the specified value;
otherwise raise NoSectionError. While it is possible to use
RawConfigParser (or ConfigParser (|py2stdlib-configparser|) with {raw} parameters set to
true) for {internal} storage of non-string values, full functionality (including
interpolation and output to files) can only be achieved using string values.
.. versionadded:: 1.6
RawConfigParser.write(fileobject)~
Write a representation of the configuration to the specified file object. This
representation can be parsed by a future read call.
.. versionadded:: 1.6
RawConfigParser.remove_option(section, option)~
Remove the specified {option} from the specified {section}. If the section does
not exist, raise NoSectionError. If the option existed to be removed,
return True; otherwise return False.
.. versionadded:: 1.6
RawConfigParser.remove_section(section)~
Remove the specified {section} from the configuration. If the section in fact
existed, return ``True``. Otherwise return ``False``.
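The return values and errors of the modifying methods can be checked with a
short sketch such as the following. Names are invented for illustration, and the
import fallback to the Python 3 module name is an assumption.

```python
try:
    import ConfigParser as configparser  # Python 2 name used in this document
except ImportError:
    import configparser                  # assumed fallback: the Python 3 name

config = configparser.RawConfigParser()
config.add_section("cache")
config.set("cache", "size", "64")

first_removal = config.remove_option("cache", "size")    # option existed
second_removal = config.remove_option("cache", "size")   # already removed

try:
    config.set("nowhere", "size", "1")
    set_error = False
except configparser.NoSectionError:
    set_error = True

try:
    config.remove_option("nowhere", "size")
    remove_error = False
except configparser.NoSectionError:
    remove_error = True

section_removed = config.remove_section("cache")
section_removed_again = config.remove_section("cache")

print(first_removal, second_removal, set_error, remove_error)
print(section_removed, section_removed_again)
```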
RawConfigParser.optionxform(option)~
Transforms the option name {option} as found in an input file or as passed in
by client code to the form that should be used in the internal structures.
The default implementation returns a lower-case version of {option};
subclasses may override this or client code can set an attribute of this name
on instances to affect this behavior.
You don't necessarily need to subclass a ConfigParser to use this method; you
can also re-set it on an instance, to a function that takes a string
argument. Setting it to ``str``, for example, would make option names case
sensitive:: >
cfgparser = ConfigParser.ConfigParser()
...
cfgparser.optionxform = str
<
Note that when reading configuration files, whitespace around the
option names is stripped before optionxform is called.
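The effect of replacing optionxform can be seen by comparing two parsers. This
is a sketch only: the parser variable names and the option name are invented,
and the import fallback to the Python 3 module name is an assumption.

```python
try:
    import ConfigParser as configparser  # Python 2 name used in this document
except ImportError:
    import configparser                  # assumed fallback: the Python 3 name

# Default behaviour: option names are folded to lower case.
folding = configparser.RawConfigParser()
folding.add_section("s")
folding.set("s", "UserName", "alice")

# With optionxform replaced by str, names are kept exactly as written.
exact = configparser.RawConfigParser()
exact.optionxform = str
exact.add_section("s")
exact.set("s", "UserName", "alice")

folding_options = folding.options("s")
exact_options = exact.options("s")
folding_lookup = folding.has_option("s", "username")
exact_lookup = exact.has_option("s", "username")

print(folding_options, folding_lookup)
print(exact_options, exact_lookup)
```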
ConfigParser Objects
--------------------
The ConfigParser (|py2stdlib-configparser|) class extends some methods of the
RawConfigParser interface, adding some optional arguments.
ConfigParser.get(section, option[, raw[, vars]])~
Get an {option} value for the named {section}. All the ``'%'`` interpolations
are expanded in the return values, based on the defaults passed into the
constructor, as well as the options {vars} provided, unless the {raw} argument
is true.
ConfigParser.items(section[, raw[, vars]])~
Return a list of ``(name, value)`` pairs for each option in the given {section}.
Optional arguments have the same meaning as for the get method.
.. versionadded:: 2.3
SafeConfigParser Objects
------------------------
The SafeConfigParser class implements the same extended interface as
ConfigParser (|py2stdlib-configparser|), with the following addition:
SafeConfigParser.set(section, option, value)~
If the given section exists, set the given option to the specified value;
otherwise raise NoSectionError. {value} must be a string (str
or unicode); if not, TypeError is raised.
.. versionadded:: 2.4
Examples
--------
An example of writing to a configuration file:: >
import ConfigParser
config = ConfigParser.RawConfigParser()
# With the default dict_type (collections.OrderedDict since 2.7), sections
# and options are written to the file in the order they were added.
# In addition, please note that using RawConfigParser's and the raw
# mode of ConfigParser's respective set functions, you can assign
# non-string values to keys internally, but will receive an error
# when attempting to write to a file or when you get it in non-raw
# mode. SafeConfigParser does not allow such assignments to take place.
config.add_section('Section1')
config.set('Section1', 'int', '15')
config.set('Section1', 'bool', 'true')
config.set('Section1', 'float', '3.1415')
config.set('Section1', 'baz', 'fun')
config.set('Section1', 'bar', 'Python')
config.set('Section1', 'foo', '%(bar)s is %(baz)s!')
# Writing our configuration file to 'example.cfg'
with open('example.cfg', 'wb') as configfile:
    config.write(configfile)
<
An example of reading the configuration file again:: >
import ConfigParser
config = ConfigParser.RawConfigParser()
config.read('example.cfg')
# getfloat() raises an exception if the value is not a float
# getint() and getboolean() also do this for their respective types
a_float = config.getfloat('Section1', 'float')
an_int = config.getint('Section1', 'int')
print a_float + an_int
# Notice that the next output does not interpolate '%(bar)s' or '%(baz)s'.
# This is because we are using a RawConfigParser().
if config.getboolean('Section1', 'bool'):
    print config.get('Section1', 'foo')
<
To get interpolation, you will need to use a ConfigParser (|py2stdlib-configparser|) or
SafeConfigParser:: >
import ConfigParser
config = ConfigParser.ConfigParser()
config.read('example.cfg')
# Set the third, optional argument of get to 1 if you wish to use raw mode.
print config.get('Section1', 'foo', 0) # -> "Python is fun!"
print config.get('Section1', 'foo', 1) # -> "%(bar)s is %(baz)s!"
# The optional fourth argument is a dict with members that will take
# precedence in interpolation.
print config.get('Section1', 'foo', 0, {'bar': 'Documentation',
                                        'baz': 'evil'})
<
Defaults are available in all three types of ConfigParsers. They are used in
interpolation if an option used is not defined elsewhere. :: >
import ConfigParser
# New instance with 'bar' and 'baz' defaulting to 'Life' and 'hard' each
config = ConfigParser.SafeConfigParser({'bar': 'Life', 'baz': 'hard'})
config.read('example.cfg')
print config.get('Section1', 'foo') # -> "Python is fun!"
config.remove_option('Section1', 'bar')
config.remove_option('Section1', 'baz')
print config.get('Section1', 'foo') # -> "Life is hard!"
<
The function ``opt_move`` below can be used to move options between sections:: >
import ConfigParser
def opt_move(config, section1, section2, option):
    try:
        config.set(section2, option, config.get(section1, option, 1))
    except ConfigParser.NoSectionError:
        # Create non-existent section
        config.add_section(section2)
        opt_move(config, section1, section2, option)
    else:
        config.remove_option(section1, option)
<
Some configuration files are known to include settings without values, but which
otherwise conform to the syntax supported by ConfigParser (|py2stdlib-configparser|). The
{allow_no_value} parameter to the constructor can be used to indicate that such
values should be accepted:: >
>>> import ConfigParser
>>> import io
>>> sample_config = """
... [mysqld]
... user = mysql
... pid-file = /var/run/mysqld/mysqld.pid
... skip-external-locking
... old_passwords = 1
... skip-bdb
... skip-innodb
... """
>>> config = ConfigParser.RawConfigParser(allow_no_value=True)
>>> config.readfp(io.BytesIO(sample_config))
>>> # Settings with values are treated as before:
>>> config.get("mysqld", "user")
'mysql'
>>> # Settings without values provide None:
>>> config.get("mysqld", "skip-bdb")
>>> # Settings which aren't specified still raise an error:
>>> config.get("mysqld", "does-not-exist")
Traceback (most recent call last):
  ...
ConfigParser.NoOptionError: No option 'does-not-exist' in section: 'mysqld'
<
==============================================================================
*py2stdlib-contextlib*
contextlib~
:synopsis: Utilities for with-statement contexts.
.. versionadded:: 2.5
This module provides utilities for common tasks involving the with
statement. For more information see also Context Manager Types and
With Statement Context Managers.
Functions provided:
contextmanager(func)~
This function is a decorator that can be used to define a factory
function for with statement context managers, without needing to
create a class or separate __enter__ and __exit__ methods.
A simple example (this is not recommended as a real way of generating HTML!):: >
from contextlib import contextmanager
@contextmanager
def tag(name):
    print "<%s>" % name
    yield
    print "</%s>" % name

>>> with tag("h1"):
...     print "foo"
...
<h1>
foo
</h1>
<
The function being decorated must return a generator-iterator when
called. This iterator must yield exactly one value, which will be bound to
the targets in the with statement's as clause, if any.
At the point where the generator yields, the block nested in the with
statement is executed. The generator is then resumed after the block is exited.
If an unhandled exception occurs in the block, it is reraised inside the
generator at the point where the yield occurred. Thus, you can use a
try...except...finally statement to trap
the error (if any), or ensure that some cleanup takes place. If an exception is
trapped merely in order to log it or to perform some action (rather than to
suppress it entirely), the generator must reraise that exception. Otherwise the
generator context manager will indicate to the with statement that
the exception has been handled, and execution will resume with the statement
immediately following the with statement.
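The exception-handling rules above can be illustrated with a sketch like the
following. The tracked function and the events list are hypothetical names
chosen for the example.

```python
from contextlib import contextmanager

events = []


@contextmanager
def tracked(name):
    events.append("enter " + name)
    try:
        yield name  # the value bound by the "as" clause
    except ValueError as exc:
        events.append("saw " + str(exc))
        raise  # only logging: re-raise so the exception is not suppressed
    finally:
        events.append("exit " + name)  # cleanup runs in both cases


with tracked("ok") as label:
    events.append("body " + label)

try:
    with tracked("bad"):
        raise ValueError("boom")
except ValueError:
    events.append("propagated")

print(events)
```

If the bare raise were removed, the generator would finish normally and the
with statement would treat the ValueError as handled.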
nested(mgr1[, mgr2[, ...]])~
Combine multiple context managers into a single nested context manager.
This function has been deprecated in favour of the multiple manager form
of the with statement.
The one advantage of this function over the multiple manager form of the
with statement is that argument unpacking allows it to be
used with a variable number of context managers as follows:: >
from contextlib import nested
with nested(*managers):
    do_something()
<
Note that if the __exit__ method of one of the nested context managers
indicates an exception should be suppressed, no exception information will be
passed to any remaining outer context managers. Similarly, if the
__exit__ method of one of the nested managers raises an exception, any
previous exception state will be lost; the new exception will be passed to the
__exit__ methods of any remaining outer context managers. In general,
__exit__ methods should avoid raising exceptions, and in particular they
should not re-raise a passed-in exception.
This function has two major quirks that have led to it being deprecated. Firstly,
as the context managers are all constructed before the function is invoked, the
__new__ and __init__ methods of the inner context managers are
not actually covered by the scope of the outer context managers. That means, for
example, that using nested to open two files is a programming error as the
first file will not be closed promptly if an exception is thrown when opening
the second file.
Secondly, if the __enter__ method of one of the inner context managers
raises an exception that is caught and suppressed by the __exit__ method
of one of the outer context managers, this construct will raise
RuntimeError rather than skipping the body of the with
statement.
Developers that need to support nesting of a variable number of context managers
can either use the warnings (|py2stdlib-warnings|) module to suppress the DeprecationWarning
raised by this function or else use this function as a model for an application
specific implementation.
.. deprecated:: 2.7
The with-statement now supports this functionality directly (without the
confusing error-prone quirks).
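For a fixed number of managers, the multiple manager form of the with statement
looks roughly like this. The step function and the order list are hypothetical
names used only to record the sequence of events.

```python
from contextlib import contextmanager

order = []


@contextmanager
def step(name):
    order.append("enter " + name)
    try:
        yield name
    finally:
        order.append("exit " + name)


# Managers are entered left to right and exited in reverse order.
with step("outer") as a, step("inner") as b:
    order.append("body %s/%s" % (a, b))

print(order)
```

Unlike nested, the second manager here is only constructed after the first one
has been entered.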
closing(thing)~
Return a context manager that closes {thing} upon completion of the block. This
is basically equivalent to:: >
from contextlib import contextmanager
@contextmanager
def closing(thing):
    try:
        yield thing
    finally:
        thing.close()
<
And lets you write code like this:: >
from contextlib import closing
import urllib
with closing(urllib.urlopen('http://www.python.org')) as page:
    for line in page:
        print line
<
without needing to explicitly close ``page``. Even if an error occurs,
``page.close()`` will be called when the with block is exited.
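The same behaviour can be checked without network access. In this sketch,
Resource is a hypothetical class that has a close method but does not support
the with statement on its own.

```python
from contextlib import closing


class Resource(object):
    """Hypothetical object offering close() but no __enter__/__exit__."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


resource = Resource()
try:
    with closing(resource) as res:
        raise RuntimeError("failure inside the block")
except RuntimeError:
    pass

# close() was called even though the block raised.
print(resource.closed)
```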
.. seealso::
PEP 343 - The "with" statement
The specification, background, and examples for the Python with
statement.
==============================================================================
*py2stdlib-cookie*
Cookie~
:synopsis: Support for HTTP state management (cookies).
.. note::
The Cookie (|py2stdlib-cookie|) module has been renamed to http.cookies in Python
3.0. The 2to3 tool will automatically adapt imports when converting
your sources to 3.0.
The Cookie (|py2stdlib-cookie|) module defines classes for abstracting the concept of
cookies, an HTTP state management mechanism. It supports both simple string-only
cookies, and provides an abstraction for having any serializable data-type as
cookie value.
The module formerly strictly applied the parsing rules described in the
RFC 2109 and RFC 2068 specifications. It has since been discovered that
MSIE 3.0x doesn't follow the character rules outlined in those specs. As a
result, the parsing rules used are a bit less strict.
.. note::
On encountering an invalid cookie, CookieError is raised, so if your
cookie data comes from a browser you should always prepare for invalid data
and catch CookieError on parsing.
CookieError~
Exception raised because of RFC 2109 invalidity: incorrect attributes,
incorrect Set-Cookie header, etc.
BaseCookie([input])~
This class is a dictionary-like object whose keys are strings and whose values
are Morsel instances. Note that upon setting a key to a value, the
value is first converted to a Morsel containing the key and the value.
If {input} is given, it is passed to the load method.
SimpleCookie([input])~
This class derives from BaseCookie and overrides value_decode
and value_encode to be the identity and str respectively.
SerialCookie([input])~
This class derives from BaseCookie and overrides value_decode
and value_encode to be pickle.loads and pickle.dumps
respectively.
.. deprecated:: 2.3
Reading pickled values from untrusted cookie data is a huge security hole, as
pickle strings can be crafted to cause arbitrary code to execute on your server.
It is supported for backwards compatibility only, and may eventually go away.
SmartCookie([input])~
This class derives from BaseCookie. It overrides value_decode
to be pickle.loads if it is a valid pickle, and otherwise the value
itself. It overrides value_encode to be pickle.dumps unless it
is a string, in which case it returns the value itself.
.. deprecated:: 2.3
The same security warning from SerialCookie applies here.
A further security note is warranted. For backwards compatibility, the
Cookie (|py2stdlib-cookie|) module exports a class named Cookie (|py2stdlib-cookie|) which is just an
alias for SmartCookie. This is probably a mistake and will likely be
removed in a future version. You should not use the Cookie (|py2stdlib-cookie|) class in
your applications, for the same reason you should not use the
SerialCookie class.
.. seealso::
Module cookielib (|py2stdlib-cookielib|)
HTTP cookie handling for web {clients}. The cookielib (|py2stdlib-cookielib|) and Cookie (|py2stdlib-cookie|)
modules do not depend on each other.
RFC 2109 - HTTP State Management Mechanism
This is the state management specification implemented by this module.
Cookie Objects
--------------
BaseCookie.value_decode(val)~
Return a decoded value from a string representation. Return value can be any
type. This method does nothing in BaseCookie --- it exists so it can be
overridden.
BaseCookie.value_encode(val)~
Return an encoded value. {val} can be any type, but return value must be a
string. This method does nothing in BaseCookie --- it exists so it can
be overridden.
In general, it should be the case that value_encode and
value_decode are inverses on the range of {value_decode}.
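As a hedged illustration of an inverse pair, the sketch below stores JSON-serialisable values instead of pickles. The class name JSONCookie is hypothetical. It also assumes the convention used by the shipped SimpleCookie implementation, where both methods return a (real_value, coded_value) pair; the text above does not spell that out. The import fallback is only there so the sketch runs on Python 3 too.

```python
import json
try:
    import Cookie
except ImportError:
    import http.cookies as Cookie    # assumption: Python 3 fallback

class JSONCookie(Cookie.SimpleCookie):
    """Hypothetical subclass whose values round-trip through JSON."""

    def value_encode(self, val):
        # Serialise to text, then let SimpleCookie quote it for the header.
        text, coded = Cookie.SimpleCookie.value_encode(self, json.dumps(val))
        return val, coded

    def value_decode(self, val):
        # Undo the header quoting, then parse the JSON text.
        text, coded = Cookie.SimpleCookie.value_decode(self, val)
        return json.loads(text), coded

C = JSONCookie()
C["basket"] = ["fig", "newton"]
wire = C["basket"].OutputString()    # what would travel in the header

D = JSONCookie()
D.load(wire)                         # decoding restores the original list
```

Unlike SerialCookie, this sketch never unpickles data received from the client.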
BaseCookie.output([attrs[, header[, sep]]])~
Return a string representation suitable to be sent as HTTP headers. {attrs} and
{header} are sent to each Morsel's output method. {sep} is used
to join the headers together, and is by default the combination ``'\r\n'``
(CRLF).
.. versionchanged:: 2.5
The default separator has been changed from ``'\n'`` to match the cookie
specification.
BaseCookie.js_output([attrs])~
Return an embeddable JavaScript snippet, which, if run on a browser which
supports JavaScript, will act the same as if the HTTP headers were sent.
The meaning for {attrs} is the same as in output.
BaseCookie.load(rawdata)~
If {rawdata} is a string, parse it as an ``HTTP_COOKIE`` and add the values
found there as Morsels. If it is a dictionary, it is equivalent to:: >
    for k, v in rawdata.items():
        cookie[k] = v
<
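Both forms of load can be seen side by side in this short sketch. The cookie names are the ones used elsewhere in this section, and the import fallback is an assumption added so the code also runs on Python 3.

```python
try:
    import Cookie
except ImportError:
    import http.cookies as Cookie    # assumption: Python 3 fallback

C = Cookie.SimpleCookie()
C.load("chips=ahoy; vienna=finger")   # string: parsed like an HTTP_COOKIE value
C.load({"oreo": "doublestuff"})       # dict: same as C["oreo"] = "doublestuff"

names = sorted(C.keys())
```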
Morsel Objects
--------------
Morsel~
Abstract a key/value pair, which has some RFC 2109 attributes.
Morsels are dictionary-like objects, whose set of keys is constant --- the valid
RFC 2109 attributes, which are
* ``expires``
* ``path``
* ``comment``
* ``domain``
* ``max-age``
* ``secure``
* ``version``
* ``httponly``
The attribute httponly specifies that the cookie is only transferred
in HTTP requests, and is not accessible through JavaScript. This is intended
to mitigate some forms of cross-site scripting.
The keys are case-insensitive.
.. versionadded:: 2.6
The httponly attribute was added.
Morsel.value~
The value of the cookie.
Morsel.coded_value~
The encoded value of the cookie --- this is what should be sent.
Morsel.key~
The name of the cookie.
Morsel.set(key, value, coded_value)~
Set the {key}, {value} and {coded_value} members.
Morsel.isReservedKey(K)~
Whether {K} is a member of the set of keys of a Morsel.
Morsel.output([attrs[, header]])~
Return a string representation of the Morsel, suitable to be sent as an HTTP
header. By default, all the attributes are included, unless {attrs} is given, in
which case it should be a list of attributes to use. {header} is by default
``"Set-Cookie:"``.
Morsel.js_output([attrs])~
Return an embeddable JavaScript snippet, which, if run on a browser which
supports JavaScript, will act the same as if the HTTP header was sent.
The meaning for {attrs} is the same as in output.
Morsel.OutputString([attrs])~
Return a string representing the Morsel, without any surrounding HTTP or
JavaScript.
The meaning for {attrs} is the same as in output.
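The Morsel members and output methods described above can be exercised as in the following sketch. It reuses the rocky/road cookie from the example below; the import fallback is an assumption for Python 3.

```python
try:
    import Cookie
except ImportError:
    import http.cookies as Cookie    # assumption: Python 3 fallback

C = Cookie.SimpleCookie()
C["rocky"] = "road"
m = C["rocky"]                 # the Morsel created for this key
m["path"] = "/cookie"          # set one of the reserved attributes

full = m.OutputString()                # name, value and attributes
bare = m.OutputString(attrs=[])        # name and value only
header = m.output(header="Cookie:")    # with an explicit header name
reserved = m.isReservedKey("path")     # "path" is one of the fixed keys
```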
Example
-------
The following example demonstrates how to use the Cookie (|py2stdlib-cookie|) module.
>>> import Cookie
>>> C = Cookie.SimpleCookie()
>>> C = Cookie.SerialCookie()
>>> C = Cookie.SmartCookie()
>>> C["fig"] = "newton"
>>> C["sugar"] = "wafer"
>>> print C # generate HTTP headers
Set-Cookie: fig=newton
Set-Cookie: sugar=wafer
>>> print C.output() # same thing
Set-Cookie: fig=newton
Set-Cookie: sugar=wafer
>>> C = Cookie.SmartCookie()
>>> C["rocky"] = "road"
>>> C["rocky"]["path"] = "/cookie"
>>> print C.output(header="Cookie:")
Cookie: rocky=road; Path=/cookie
>>> print C.output(attrs=[], header="Cookie:")
Cookie: rocky=road
>>> C = Cookie.SmartCookie()
>>> C.load("chips=ahoy; vienna=finger") # load from a string (HTTP header)
>>> print C
Set-Cookie: chips=ahoy
Set-Cookie: vienna=finger
>>> C = Cookie.SmartCookie()
>>> C.load('keebler="E=everybody; L=\\"Loves\\"; fudge=\\012;";')
>>> print C
Set-Cookie: keebler="E=everybody; L=\"Loves\"; fudge=\012;"
>>> C = Cookie.SmartCookie()
>>> C["oreo"] = "doublestuff"
>>> C["oreo"]["path"] = "/"
>>> print C
Set-Cookie: oreo=doublestuff; Path=/
>>> C = Cookie.SmartCookie()
>>> C["twix"] = "none for you"
>>> C["twix"].value
'none for you'
>>> C = Cookie.SimpleCookie()
>>> C["number"] = 7 # equivalent to C["number"] = str(7)
>>> C["string"] = "seven"
>>> C["number"].value
'7'
>>> C["string"].value
'seven'
>>> print C
Set-Cookie: number=7
Set-Cookie: string=seven
>>> C = Cookie.SerialCookie()
>>> C["number"] = 7
>>> C["string"] = "seven"
>>> C["number"].value
7
>>> C["string"].value
'seven'
>>> print C
Set-Cookie: number="I7\012."
Set-Cookie: string="S'seven'\012p1\012."
>>> C = Cookie.SmartCookie()
>>> C["number"] = 7
>>> C["string"] = "seven"
>>> C["number"].value
7
>>> C["string"].value
'seven'
>>> print C
Set-Cookie: number="I7\012."
Set-Cookie: string=seven
==============================================================================
*py2stdlib-cookielib*
cookielib~
:synopsis: Classes for automatic handling of HTTP cookies.
.. note::
The cookielib (|py2stdlib-cookielib|) module has been renamed to http.cookiejar in
Python 3.0. The 2to3 tool will automatically adapt imports when
converting your sources to 3.0.
.. versionadded:: 2.4
The cookielib (|py2stdlib-cookielib|) module defines classes for automatic handling of HTTP
cookies. It is useful for accessing web sites that require small pieces of data
-- cookies -- to be set on the client machine by an HTTP response from a
web server, and then returned to the server in later HTTP requests.
Both the regular Netscape cookie protocol and the protocol defined by
RFC 2965 are handled. RFC 2965 handling is switched off by default.
RFC 2109 cookies are parsed as Netscape cookies and subsequently treated
either as Netscape or RFC 2965 cookies according to the 'policy' in effect.
Note that the great majority of cookies on the Internet are Netscape cookies.
cookielib (|py2stdlib-cookielib|) attempts to follow the de-facto Netscape cookie protocol (which
differs substantially from that set out in the original Netscape specification),
including taking note of the ``max-age`` and ``port`` cookie-attributes
introduced with RFC 2965.
.. note::
The various named parameters found in Set-Cookie and
Set-Cookie2 headers (eg. ``domain`` and ``expires``) are
conventionally referred to as attributes. To distinguish them from
Python attributes, the documentation for this module uses the term
cookie-attribute instead.
The module defines the following exception:
LoadError~
Instances of FileCookieJar raise this exception on failure to load
cookies from a file.
.. note:: >
For backwards-compatibility with Python 2.4 (which raised an IOError),
LoadError is a subclass of IOError.
<
The following classes are provided:
CookieJar(policy=None)~
{policy} is an object implementing the CookiePolicy interface.
The CookieJar class stores HTTP cookies. It extracts cookies from HTTP
responses, and returns them in HTTP requests. CookieJar instances
automatically expire contained cookies when necessary. Subclasses are also
responsible for storing and retrieving cookies from a file or database.
FileCookieJar(filename, delayload=None, policy=None)~
{policy} is an object implementing the CookiePolicy interface. For the
other arguments, see the documentation for the corresponding attributes.
A CookieJar which can load cookies from, and perhaps save cookies to, a
file on disk. Cookies are NOT loaded from the named file until either the
load or revert method is called. Subclasses of this class are
documented in the section "FileCookieJar subclasses and co-operation with web
browsers".
CookiePolicy()~
This class is responsible for deciding whether each cookie should be accepted
from / returned to the server.
DefaultCookiePolicy(blocked_domains=None, allowed_domains=None, netscape=True, rfc2965=False, rfc2109_as_netscape=None, hide_cookie2=False, strict_domain=False, strict_rfc2965_unverifiable=True, strict_ns_unverifiable=False, strict_ns_domain=DefaultCookiePolicy.DomainLiberal, strict_ns_set_initial_dollar=False, strict_ns_set_path=False)~
Constructor arguments should be passed as keyword arguments only.
{blocked_domains} is a sequence of domain names that we never accept cookies
from, nor return cookies to. If {allowed_domains} is not None, it is a
sequence of the only domains for which we accept and return cookies. For all
other arguments, see the documentation for CookiePolicy and
DefaultCookiePolicy objects.
DefaultCookiePolicy implements the standard accept / reject rules for
Netscape and RFC 2965 cookies. By default, RFC 2109 cookies (ie. cookies
received in a Set-Cookie header with a version cookie-attribute of
1) are treated according to the RFC 2965 rules. However, if RFC 2965 handling
is turned off or rfc2109_as_netscape is True, RFC 2109 cookies are
'downgraded' by the CookieJar instance to Netscape cookies, by
setting the version attribute of the Cookie (|py2stdlib-cookie|) instance to 0.
DefaultCookiePolicy also provides some parameters to allow some
fine-tuning of policy.
Cookie()~
This class represents Netscape, RFC 2109 and RFC 2965 cookies. It is not
expected that users of cookielib (|py2stdlib-cookielib|) construct their own Cookie (|py2stdlib-cookie|)
instances. Instead, if necessary, call make_cookies on a
CookieJar instance.
.. seealso::
Module urllib2 (|py2stdlib-urllib2|)
URL opening with automatic cookie handling.
Module Cookie (|py2stdlib-cookie|)
HTTP cookie classes, principally useful for server-side code. The
cookielib (|py2stdlib-cookielib|) and Cookie (|py2stdlib-cookie|) modules do not depend on each other.
http://wwwsearch.sourceforge.net/mechanize/
Extensions to this module, including a class for reading Microsoft Internet
Explorer cookies on Windows.
http://wp.netscape.com/newsref/std/cookie_spec.html
The specification of the original Netscape cookie protocol. Though this is
still the dominant protocol, the 'Netscape cookie protocol' implemented by all
the major browsers (and cookielib (|py2stdlib-cookielib|)) only bears a passing resemblance to
the one sketched out in ``cookie_spec.html``.
RFC 2109 - HTTP State Management Mechanism
Obsoleted by RFC 2965. Uses Set-Cookie with version=1.
RFC 2965 - HTTP State Management Mechanism
The Netscape protocol with the bugs fixed. Uses Set-Cookie2 in
place of Set-Cookie. Not widely used.
http://kristol.org/cookie/errata.html
Unfinished errata to RFC 2965.
RFC 2964 - Use of HTTP State Management
CookieJar and FileCookieJar Objects
-----------------------------------
CookieJar objects support the iterator protocol for iterating over
contained Cookie (|py2stdlib-cookie|) objects.
CookieJar has the following methods:
CookieJar.add_cookie_header(request)~
Add correct Cookie (|py2stdlib-cookie|) header to {request}.
If policy allows (ie. the rfc2965 and hide_cookie2 attributes of
the CookieJar's CookiePolicy instance are true and false
respectively), the Cookie2 header is also added when appropriate.
The {request} object (usually a urllib2.Request instance) must support
the methods get_full_url, get_host, get_type,
unverifiable, get_origin_req_host, has_header,
get_header, header_items, and add_unredirected_header, as
documented by urllib2 (|py2stdlib-urllib2|).
CookieJar.extract_cookies(response, request)~
Extract cookies from HTTP {response} and store them in the CookieJar,
where allowed by policy.
The CookieJar will look for allowable Set-Cookie and
Set-Cookie2 headers in the {response} argument, and store cookies
as appropriate (subject to the CookiePolicy.set_ok method's approval).
The {response} object (usually the result of a call to urllib2.urlopen,
or similar) should support an info method, which returns an object with
a getallmatchingheaders method (usually a mimetools.Message
instance).
The {request} object (usually a urllib2.Request instance) must support
the methods get_full_url, get_host, unverifiable, and
get_origin_req_host, as documented by urllib2 (|py2stdlib-urllib2|). The request is
used to set default values for cookie-attributes as well as for checking that
the cookie is allowed to be set.
CookieJar.set_policy(policy)~
Set the CookiePolicy instance to be used.
CookieJar.make_cookies(response, request)~
Return sequence of Cookie (|py2stdlib-cookie|) objects extracted from {response} object.
See the documentation for extract_cookies for the interfaces required of
the {response} and {request} arguments.
CookieJar.set_cookie_if_ok(cookie, request)~
Set a Cookie (|py2stdlib-cookie|) if policy says it's OK to do so.
CookieJar.set_cookie(cookie)~
Set a Cookie (|py2stdlib-cookie|), without checking with policy to see whether or not it
should be set.
CookieJar.clear([domain[, path[, name]]])~
Clear some cookies.
If invoked without arguments, clear all cookies. If given a single argument,
only cookies belonging to that {domain} will be removed. If given two arguments,
cookies belonging to the specified {domain} and URL {path} are removed. If
given three arguments, then the cookie with the specified {domain}, {path} and
{name} is removed.
Raises KeyError if no matching cookie exists.
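The three calling forms of clear can be sketched without any network traffic by filling a jar by hand. The helper make_cookie is hypothetical; it builds Cookie instances directly only to keep the sketch self-contained, which is not how cookies normally enter a jar. The import fallback to http.cookiejar is an assumption for Python 3.

```python
try:
    import cookielib
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 fallback

def make_cookie(name, value, domain, path="/"):
    """Hypothetical helper: build a Netscape-style (version 0) session cookie."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={})

jar = cookielib.CookieJar()
jar.set_cookie(make_cookie("a", "1", "example.com", "/"))
jar.set_cookie(make_cookie("b", "2", "example.com", "/shop"))
jar.set_cookie(make_cookie("c", "3", "example.org", "/"))

jar.clear("example.com", "/shop", "b")    # three arguments: one cookie
jar.clear("example.org")                  # one argument: a whole domain
remaining = sorted(cookie.name for cookie in jar)

try:
    jar.clear("example.net")              # nothing is stored for this domain
    missing_domain_raises = False
except KeyError:
    missing_domain_raises = True

jar.clear()                               # no arguments: everything
```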
CookieJar.clear_session_cookies()~
Discard all session cookies.
Discards all contained cookies that have a true discard attribute
(usually because they had either no ``max-age`` or ``expires`` cookie-attribute,
or an explicit ``discard`` cookie-attribute). For interactive browsers, the end
of a session usually corresponds to closing the browser window.
Note that the save method won't save session cookies anyway, unless you
ask otherwise by passing a true {ignore_discard} argument.
FileCookieJar implements the following additional methods:
FileCookieJar.save(filename=None, ignore_discard=False, ignore_expires=False)~
Save cookies to a file.
This base class raises NotImplementedError. Subclasses may leave this
method unimplemented.
{filename} is the name of file in which to save cookies. If {filename} is not
specified, self.filename is used (whose default is the value passed to
the constructor, if any); if self.filename is None,
ValueError is raised.
{ignore_discard}: save even cookies set to be discarded. {ignore_expires}: save
even cookies that have expired.
The file is overwritten if it already exists, thus wiping all the cookies it
contains. Saved cookies can be restored later using the load or
revert methods.
FileCookieJar.load(filename=None, ignore_discard=False, ignore_expires=False)~
Load cookies from a file.
Old cookies are kept unless overwritten by newly loaded ones.
Arguments are as for save.
The named file must be in the format understood by the class, or
LoadError will be raised. Also, IOError may be raised, for
example if the file does not exist.
.. note:: >
For backwards-compatibility with Python 2.4 (which raised an IOError),
LoadError is a subclass of IOError.
<
FileCookieJar.revert(filename=None, ignore_discard=False, ignore_expires=False)~
Clear all cookies and reload cookies from a saved file.
revert can raise the same exceptions as load. If there is a
failure, the object's state will not be altered.
FileCookieJar instances have the following public attributes:
FileCookieJar.filename~
Filename of default file in which to keep cookies. This attribute may be
assigned to.
FileCookieJar.delayload~
If true, load cookies lazily from disk. This attribute should not be assigned
to. This is only a hint, since this only affects performance, not behaviour
(unless the cookies on disk are changing). A CookieJar object may
ignore it. None of the FileCookieJar classes included in the standard
library lazily loads cookies.
FileCookieJar subclasses and co-operation with web browsers
-----------------------------------------------------------
The following CookieJar subclasses are provided for reading and writing.
Further CookieJar subclasses, including one that reads Microsoft
Internet Explorer cookies, are available at
http://wwwsearch.sourceforge.net/mechanize/ .
MozillaCookieJar(filename, delayload=None, policy=None)~
A FileCookieJar that can load from and save cookies to disk in the
Mozilla ``cookies.txt`` file format (which is also used by the Lynx and Netscape
browsers).
.. note:: >
Version 3 of the Firefox web browser no longer writes cookies in the
``cookies.txt`` file format.
<
.. note::
This loses information about RFC 2965 cookies, and also about newer or
non-standard cookie-attributes such as ``port``.
.. warning:: >
Back up your cookies before saving if you have cookies whose loss / corruption
would be inconvenient (there are some subtleties which may lead to slight
changes in the file over a load / save round-trip).
<
Also note that cookies saved while Mozilla is running will get clobbered by
Mozilla.
LWPCookieJar(filename, delayload=None, policy=None)~
A FileCookieJar that can load from and save cookies to disk in a format
compatible with the libwww-perl library's ``Set-Cookie3`` file format. This is
convenient if you want to store cookies in a human-readable file.
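A save / load round trip with LWPCookieJar, including the effect of {ignore_discard}, might look like the sketch below. The helpers make_cookie and names_on_disk are hypothetical, the cookies are built by hand only to avoid network access, and the file lives in a temporary directory. The import fallback is an assumption for Python 3.

```python
import os
import shutil
import tempfile
import time
try:
    import cookielib
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 fallback

def make_cookie(name, value, expires=None):
    """Hypothetical helper; a cookie without an expiry is a session cookie."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain="example.com", domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=expires, discard=(expires is None),
        comment=None, comment_url=None, rest={})

def names_on_disk(filename, **load_options):
    """Load a fresh jar from filename and return the sorted cookie names."""
    fresh = cookielib.LWPCookieJar(filename)
    fresh.load(**load_options)
    return sorted(cookie.name for cookie in fresh)

tmpdir = tempfile.mkdtemp()
filename = os.path.join(tmpdir, "cookies.lwp")

jar = cookielib.LWPCookieJar(filename)
jar.set_cookie(make_cookie("session", "s1"))
jar.set_cookie(make_cookie("persistent", "p1", expires=int(time.time()) + 3600))

jar.save()                               # session cookies are skipped
saved_by_default = names_on_disk(filename)

jar.save(ignore_discard=True)            # keep session cookies as well
saved_with_discard = names_on_disk(filename, ignore_discard=True)

shutil.rmtree(tmpdir)                    # remove the temporary file
```

Note that {ignore_discard} has to be passed to load as well as to save, otherwise the session cookie is dropped again when the file is read.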
CookiePolicy Objects
--------------------
Objects implementing the CookiePolicy interface have the following
methods:
CookiePolicy.set_ok(cookie, request)~
Return boolean value indicating whether cookie should be accepted from server.
{cookie} is a cookielib.Cookie instance. {request} is an object
implementing the interface defined by the documentation for
CookieJar.extract_cookies.
CookiePolicy.return_ok(cookie, request)~
Return boolean value indicating whether cookie should be returned to server.
{cookie} is a cookielib.Cookie instance. {request} is an object
implementing the interface defined by the documentation for
CookieJar.add_cookie_header.
CookiePolicy.domain_return_ok(domain, request)~
Return false if cookies should not be returned, given cookie domain.
This method is an optimization. It removes the need for checking every cookie
with a particular domain (which might involve reading many files). Returning
true from domain_return_ok and path_return_ok leaves all the
work to return_ok.
If domain_return_ok returns true for the cookie domain,
path_return_ok is called for the cookie path. Otherwise,
path_return_ok and return_ok are never called for that cookie
domain. If path_return_ok returns true, return_ok is called
with the Cookie (|py2stdlib-cookie|) object itself for a full check. Otherwise,
return_ok is never called for that cookie path.
Note that domain_return_ok is called for every {cookie} domain, not just
for the {request} domain. For example, the function might be called with both
``".example.com"`` and ``"www.example.com"`` if the request domain is
``"www.example.com"``. The same goes for path_return_ok.
The {request} argument is as documented for return_ok.
CookiePolicy.path_return_ok(path, request)~
Return false if cookies should not be returned, given cookie path.
See the documentation for domain_return_ok.
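As a rough sketch of the early-rejection hook described above, the policy below refuses a hypothetical domain before any individual cookie is examined and defers to DefaultCookiePolicy otherwise. The class name NoTrackerPolicy and the domain tracker.example are assumptions made for illustration; the import fallbacks are there so the sketch runs on Python 3 too.

```python
try:
    import cookielib
    from urllib2 import Request
except ImportError:
    import http.cookiejar as cookielib        # assumption: Python 3 fallback
    from urllib.request import Request

class NoTrackerPolicy(cookielib.DefaultCookiePolicy):
    """Hypothetical policy: never return cookies stored for tracker.example."""

    def domain_return_ok(self, domain, request):
        # Cheap check first: skip every cookie stored under this domain.
        if domain == "tracker.example" or domain.endswith(".tracker.example"):
            return False
        # Otherwise keep the standard behaviour.
        return cookielib.DefaultCookiePolicy.domain_return_ok(
            self, domain, request)

policy = NoTrackerPolicy()
blocked = policy.domain_return_ok(
    "ads.tracker.example", Request("http://ads.tracker.example/"))
allowed = policy.domain_return_ok(
    "www.example.com", Request("http://www.example.com/"))
```

The same effect could be had with the {blocked_domains} argument; the override is shown only to illustrate where the hook sits.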
In addition to implementing the methods above, implementations of the
CookiePolicy interface must also supply the following attributes,
indicating which protocols should be used, and how. All of these attributes may
be assigned to.
CookiePolicy.netscape~
Implement Netscape protocol.
CookiePolicy.rfc2965~
Implement RFC 2965 protocol.
CookiePolicy.hide_cookie2~
Don't add Cookie2 header to requests (the presence of this header
indicates to the server that we understand RFC 2965 cookies).
The most useful way to define a CookiePolicy class is by subclassing
from DefaultCookiePolicy and overriding some or all of the methods
above. CookiePolicy itself may be used as a 'null policy' to allow
setting and receiving any and all cookies (this is unlikely to be useful).
DefaultCookiePolicy Objects
---------------------------
Implements the standard rules for accepting and returning cookies.
Both RFC 2965 and Netscape cookies are covered. RFC 2965 handling is switched
off by default.
The easiest way to provide your own policy is to override this class and call
its methods in your overridden implementations before adding your own additional
checks:: >
    import cookielib
    class MyCookiePolicy(cookielib.DefaultCookiePolicy):
        def set_ok(self, cookie, request):
            if not cookielib.DefaultCookiePolicy.set_ok(self, cookie, request):
                return False
            if i_dont_want_to_store_this_cookie(cookie):
                return False
            return True
<
In addition to the features required to implement the CookiePolicy
interface, this class allows you to block and allow domains from setting and
receiving cookies. There are also some strictness switches that allow you to
tighten up the rather loose Netscape protocol rules a little bit (at the cost of
blocking some benign cookies).
A domain blacklist and whitelist are provided (both off by default). Only domains
not in the blacklist and present in the whitelist (if the whitelist is active)
participate in cookie setting and returning. Use the {blocked_domains}
constructor argument, and blocked_domains and
set_blocked_domains methods (and the corresponding argument and methods
for {allowed_domains}). If you set a whitelist, you can turn it off again by
setting it to None.
Domains in block or allow lists that do not start with a dot must equal the
cookie domain to be matched. For example, ``"example.com"`` matches a blacklist
entry of ``"example.com"``, but ``"www.example.com"`` does not. Domains that do
start with a dot are matched by more specific domains too. For example, both
``"www.example.com"`` and ``"www.coyote.example.com"`` match ``".example.com"``
(but ``"example.com"`` itself does not). IP addresses are an exception, and
must match exactly. For example, if blocked_domains contains ``"192.168.1.2"``
and ``".168.1.2"``, 192.168.1.2 is blocked, but 193.168.1.2 is not.
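The matching rules above can be checked directly with is_blocked and is_not_allowed. The sketch is minimal: the domain names are placeholders, and the import fallback is an assumption for Python 3.

```python
try:
    import cookielib
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 fallback

policy = cookielib.DefaultCookiePolicy(
    blocked_domains=["example.com", ".ads.example", "192.168.1.2", ".168.1.2"])

exact_match = policy.is_blocked("example.com")        # no dot: must be equal
subdomain = policy.is_blocked("www.example.com")      # not equal, not blocked
dotted_match = policy.is_blocked("www.ads.example")   # matched by ".ads.example"
dotted_bare = policy.is_blocked("ads.example")        # the bare domain is not
ip_exact = policy.is_blocked("192.168.1.2")           # IP addresses: exact only
ip_suffix = policy.is_blocked("193.168.1.2")          # ".168.1.2" is no suffix rule

# The whitelist works the same way and can be switched off with None.
policy.set_allowed_domains([".example.org"])
outside_whitelist = policy.is_not_allowed("example.net")
policy.set_allowed_domains(None)
whitelist_off = policy.is_not_allowed("example.net")
```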
DefaultCookiePolicy implements the following additional methods:
DefaultCookiePolicy.blocked_domains()~
Return the sequence of blocked domains (as a tuple).
DefaultCookiePolicy.set_blocked_domains(blocked_domains)~
Set the sequence of blocked domains.
DefaultCookiePolicy.is_blocked(domain)~
Return whether {domain} is on the blacklist for setting or receiving cookies.
DefaultCookiePolicy.allowed_domains()~
Return None, or the sequence of allowed domains (as a tuple).
DefaultCookiePolicy.set_allowed_domains(allowed_domains)~
Set the sequence of allowed domains, or None.
DefaultCookiePolicy.is_not_allowed(domain)~
Return whether {domain} is not on the whitelist for setting or receiving
cookies.
DefaultCookiePolicy instances have the following attributes, which are
all initialised from the constructor arguments of the same name, and which may
all be assigned to.
DefaultCookiePolicy.rfc2109_as_netscape~
If true, request that the CookieJar instance downgrade RFC 2109 cookies
(ie. cookies received in a Set-Cookie header with a version
cookie-attribute of 1) to Netscape cookies by setting the version attribute of
the Cookie (|py2stdlib-cookie|) instance to 0. The default value is None, in which
case RFC 2109 cookies are downgraded if and only if RFC 2965 handling is turned
off. Therefore, RFC 2109 cookies are downgraded by default.
.. versionadded:: 2.5
General strictness switches:
DefaultCookiePolicy.strict_domain~
Don't allow sites to set two-component domains with country-code top-level
domains like ``.co.uk``, ``.gov.uk``, ``.co.nz``, etc. This is far from perfect
and isn't guaranteed to work!
RFC 2965 protocol strictness switches:
DefaultCookiePolicy.strict_rfc2965_unverifiable~
Follow RFC 2965 rules on unverifiable transactions (usually, an unverifiable
transaction is one resulting from a redirect or a request for an image hosted on
another site). If this is false, cookies are {never} blocked on the basis of
verifiability.
Netscape protocol strictness switches:
DefaultCookiePolicy.strict_ns_unverifiable~
Apply RFC 2965 rules on unverifiable transactions even to Netscape cookies.
DefaultCookiePolicy.strict_ns_domain~
Flags indicating how strict to be with domain-matching rules for Netscape
cookies. See below for acceptable values.
DefaultCookiePolicy.strict_ns_set_initial_dollar~
Ignore cookies in Set-Cookie: headers that have names starting with ``'$'``.
DefaultCookiePolicy.strict_ns_set_path~
Don't allow setting cookies whose path doesn't path-match request URI.
strict_ns_domain is a collection of flags. Its value is constructed by
or-ing together (for example, ``DomainStrictNoDots|DomainStrictNonDomain`` means
both flags are set).
DefaultCookiePolicy.DomainStrictNoDots~
When setting cookies, the 'host prefix' must not contain a dot (eg.
``www.foo.bar.com`` can't set a cookie for ``.bar.com``, because ``www.foo``
contains a dot).
DefaultCookiePolicy.DomainStrictNonDomain~
Cookies that did not explicitly specify a ``domain`` cookie-attribute can only
be returned to a domain equal to the domain that set the cookie (eg.
``spam.example.com`` won't be returned cookies from ``example.com`` that had no
``domain`` cookie-attribute).
DefaultCookiePolicy.DomainRFC2965Match~
When setting cookies, require a full RFC 2965 domain-match.
The following attributes are provided for convenience, and are the most useful
combinations of the above flags:
DefaultCookiePolicy.DomainLiberal~
Equivalent to 0 (ie. all of the above Netscape domain strictness flags switched
off).
DefaultCookiePolicy.DomainStrict~
Equivalent to ``DomainStrictNoDots|DomainStrictNonDomain``.
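A short sketch of combining and testing the strict_ns_domain flags. The variable names are illustrative only, and the import fallback is an assumption for Python 3.

```python
try:
    import cookielib
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 fallback

P = cookielib.DefaultCookiePolicy

# Build the flag set by or-ing the individual switches together.
strict = P.DomainStrictNoDots | P.DomainStrictNonDomain
policy = P(strict_ns_domain=strict)

# Test individual flags with a bitwise and.
no_dots_on = bool(policy.strict_ns_domain & P.DomainStrictNoDots)
rfc_match_on = bool(policy.strict_ns_domain & P.DomainRFC2965Match)
```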
Cookie Objects
--------------
Cookie (|py2stdlib-cookie|) instances have Python attributes roughly corresponding to the
standard cookie-attributes specified in the various cookie standards. The
correspondence is not one-to-one, because there are complicated rules for
assigning default values, because the ``max-age`` and ``expires``
cookie-attributes contain equivalent information, and because RFC 2109 cookies
may be 'downgraded' by cookielib (|py2stdlib-cookielib|) from version 1 to version 0 (Netscape)
cookies.
Assignment to these attributes should not be necessary other than in rare
circumstances in a CookiePolicy method. The class does not enforce
internal consistency, so you should know what you're doing if you do that.
Cookie.version~
Integer or None. Netscape cookies have version 0. RFC 2965 and
RFC 2109 cookies have a ``version`` cookie-attribute of 1. However, note that
cookielib (|py2stdlib-cookielib|) may 'downgrade' RFC 2109 cookies to Netscape cookies, in which
case version is 0.
Cookie.name~
Cookie name (a string).
Cookie.value~
Cookie value (a string), or None.
Cookie.port~
String representing a port or a set of ports (eg. '80', or '80,8080'), or
None.
Cookie.path~
Cookie path (a string, eg. ``'/acme/rocket_launchers'``).
Cookie.secure~
True if cookie should only be returned over a secure connection.
Cookie.expires~
Integer expiry date in seconds since epoch, or None. See also the
is_expired method.
Cookie.discard~
True if this is a session cookie.
Cookie.comment~
String comment from the server explaining the function of this cookie, or
None.
Cookie.comment_url~
URL linking to a comment from the server explaining the function of this cookie,
or None.
Cookie.rfc2109~
True if this cookie was received as an RFC 2109 cookie (ie. the cookie
arrived in a Set-Cookie header, and the value of the Version
cookie-attribute in that header was 1). This attribute is provided because
cookielib (|py2stdlib-cookielib|) may 'downgrade' RFC 2109 cookies to Netscape cookies, in
which case version is 0.
.. versionadded:: 2.5
Cookie.port_specified~
True if a port or set of ports was explicitly specified by the server (in the
Set-Cookie / Set-Cookie2 header).
Cookie.domain_specified~
True if a domain was explicitly specified by the server.
Cookie.domain_initial_dot~
True if the domain explicitly specified by the server began with a dot
(``'.'``).
Cookies may have additional non-standard cookie-attributes. These may be
accessed using the following methods:
Cookie.has_nonstandard_attr(name)~
Return true if cookie has the named cookie-attribute.
Cookie.get_nonstandard_attr(name, default=None)~
If cookie has the named cookie-attribute, return its value. Otherwise, return
{default}.
Cookie.set_nonstandard_attr(name, value)~
Set the value of the named cookie-attribute.
The Cookie (|py2stdlib-cookie|) class also defines the following method:
Cookie.is_expired([now=None])~
True if cookie has passed the time at which the server requested it should
expire. If {now} is given (in seconds since the epoch), return whether the
cookie has expired at the specified time.
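The expiry check and the non-standard attribute accessors can be tried on a hand-built cookie, as sketched below. The helper make_cookie is hypothetical; ordinary code would obtain Cookie instances from a CookieJar instead. The expiry value and attribute names are arbitrary, and the import fallback is an assumption for Python 3.

```python
try:
    import cookielib
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 fallback

def make_cookie(name, value, expires=None, rest=None):
    """Hypothetical helper: build a version 0 cookie for example.com."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain="example.com", domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=expires, discard=(expires is None),
        comment=None, comment_url=None, rest=rest or {})

c = make_cookie("sid", "abc", expires=1000, rest={"HttpOnly": None})

before = c.is_expired(now=999)       # one second before the expiry time
at_expiry = c.is_expired(now=1000)   # expires <= now counts as expired

has_flag = c.has_nonstandard_attr("HttpOnly")
fallback = c.get_nonstandard_attr("SameSite", "unset")   # absent: default
c.set_nonstandard_attr("SameSite", "Lax")
updated = c.get_nonstandard_attr("SameSite")

session = make_cookie("tmp", "x")    # expires is None
session_expired = session.is_expired(now=10 ** 12)
```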
Examples
--------
The first example shows the most common usage of cookielib (|py2stdlib-cookielib|):: >
import cookielib, urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")
<
This example illustrates how to open a URL using your Netscape, Mozilla, or Lynx
cookies (assumes Unix/Netscape convention for location of the cookies file):: >
import os, cookielib, urllib2
cj = cookielib.MozillaCookieJar()
cj.load(os.path.join(os.environ["HOME"], ".netscape/cookies.txt"))
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")
<
The next example illustrates the use of DefaultCookiePolicy. Turn on
RFC 2965 cookies, be more strict about domains when setting and returning
Netscape cookies, and block some domains from setting cookies or having them
returned:: >
import urllib2
from cookielib import CookieJar, DefaultCookiePolicy
policy = DefaultCookiePolicy(
rfc2965=True, strict_ns_domain=DefaultCookiePolicy.DomainStrict,
blocked_domains=["ads.net", ".ads.net"])
cj = CookieJar(policy)
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")
==============================================================================
*py2stdlib-copy*
copy~
:synopsis: Shallow and deep copy operations.
This module provides generic (shallow and deep) copying operations.
Interface summary:
copy(x)~
Return a shallow copy of {x}.
deepcopy(x)~
Return a deep copy of {x}.
error~
Raised for module specific errors.
The difference between shallow and deep copying is only relevant for compound
objects (objects that contain other objects, like lists or class instances):
* A {shallow copy} constructs a new compound object and then (to the extent
possible) inserts {references} into it to the objects found in the original.
* A {deep copy} constructs a new compound object and then, recursively, inserts
{copies} into it of the objects found in the original.
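The difference is easiest to see on a nested list. A minimal sketch:

```python
import copy

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)       # new outer list, same inner lists
deep = copy.deepcopy(original)      # new outer list and new inner lists

original[0].append(99)              # mutate an inner list of the original

shares_inner = shallow[0] is original[0]
shallow_first = shallow[0]          # sees the change
deep_first = deep[0]                # unaffected
```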
Two problems often exist with deep copy operations that don't exist with shallow
copy operations:
* Recursive objects (compound objects that, directly or indirectly, contain a
reference to themselves) may cause a recursive loop.
* Because deep copy copies {everything} it may copy too much, e.g.,
administrative data structures that should be shared even between copies.
The deepcopy function avoids these problems by:
* keeping a "memo" dictionary of objects already copied during the current
copying pass; and
* letting user-defined classes override the copying operation or the set of
components copied.
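The effect of the memo dictionary can be observed on a self-referencing list and on an object that is referenced twice. A small sketch:

```python
import copy

# A list that contains itself: without the memo this would recurse forever.
loop = [1, 2]
loop.append(loop)
clone = copy.deepcopy(loop)
cycle_preserved = clone[2] is clone       # the copy refers to itself

# An object referenced twice is copied once, so sharing is preserved.
shared = ["shared"]
pair = [shared, shared]
pair_copy = copy.deepcopy(pair)
sharing_preserved = pair_copy[0] is pair_copy[1]
independent = pair_copy[0] is not shared
```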
This module does not copy types like module, method, stack trace, stack frame,
file, socket, window, array, or any similar types. It does "copy" functions and
classes (shallowly and deeply), by returning the original object unchanged; this
is compatible with the way these are treated by the pickle (|py2stdlib-pickle|) module.
Shallow copies of dictionaries can be made using dict.copy, and
of lists by assigning a slice of the entire list, for example,
``copied_list = original_list[:]``.
.. versionchanged:: 2.5
Added copying functions.
Classes can use the same interfaces to control copying that they use to control
pickling. See the description of module pickle (|py2stdlib-pickle|) for information on these
methods. The copy (|py2stdlib-copy|) module does not use the copy_reg (|py2stdlib-copy_reg|) registration
module.
In order for a class to define its own copy implementation, it can define
special methods __copy__ and __deepcopy__. The former is called
to implement the shallow copy operation; no additional arguments are passed.
The latter is called to implement the deep copy operation; it is passed one
argument, the memo dictionary. If the __deepcopy__ implementation needs
to make a deep copy of a component, it should call the deepcopy function
with the component as first argument and the memo dictionary as second argument.
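The two hooks might be implemented along the following lines. The Node class and its attributes are hypothetical and exist only for illustration; recording the new object in the memo before descending is one reasonable way to cope with cycles, not the only one.

```python
import copy

class Node(object):
    """Hypothetical tree node used only to illustrate the two hooks."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children if children is not None else []

    def __copy__(self):
        # Shallow copy: a new Node that shares the same children list.
        return Node(self.name, self.children)

    def __deepcopy__(self, memo):
        new = Node(self.name)
        # Record the copy before descending so cycles resolve to it.
        memo[id(self)] = new
        # Component first, memo dictionary second.
        new.children = copy.deepcopy(self.children, memo)
        return new

tree = Node('root', [Node('leaf')])
shallow = copy.copy(tree)
deep = copy.deepcopy(tree)
```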
.. seealso::
Module pickle (|py2stdlib-pickle|)
Discussion of the special methods used to support object state retrieval and
restoration.
==============================================================================
*py2stdlib-copy_reg*
copy_reg~
:synopsis: Register pickle support functions.
.. note::
The copy_reg (|py2stdlib-copy_reg|) module has been renamed to copyreg in Python 3.0.
The 2to3 tool will automatically adapt imports when converting your
sources to 3.0.
The copy_reg (|py2stdlib-copy_reg|) module provides support for the pickle (|py2stdlib-pickle|) and
cPickle (|py2stdlib-cpickle|) modules. The copy (|py2stdlib-copy|) module is likely to use this in the
future as well. It provides configuration information about object constructors
which are not classes. Such constructors may be factory functions or class
instances.
constructor(object)~
Declares {object} to be a valid constructor. If {object} is not callable (and
hence not valid as a constructor), raises TypeError.
pickle(type, function[, constructor])~
Declares that {function} should be used as a "reduction" function for objects of
type {type}; {type} must not be a "classic" class object. (Classic classes are
handled differently; see the documentation for the pickle (|py2stdlib-pickle|) module for
details.) {function} should return either a string or a tuple containing two or
three elements.
The optional {constructor} parameter, if provided, is a callable object which
can be used to reconstruct the object when called with the tuple of arguments
returned by {function} at pickling time. TypeError will be raised if
{type} is a classic class or {constructor} is not callable.
See the pickle (|py2stdlib-pickle|) module for more details on the interface expected of
{function} and {constructor}.
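A reduction function could be registered roughly as follows. The Point class and reduce_point function are hypothetical names introduced for this sketch, and the import fallback is there only so the snippet also runs where the module has been renamed:

```python
try:
    import copy_reg                  # Python 2 name, as documented here
except ImportError:
    import copyreg as copy_reg       # renamed in Python 3
import pickle

class Point(object):
    """Hypothetical new-style class used for illustration."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

def reduce_point(point):
    # Two-element form: (callable, tuple of arguments for that callable).
    return Point, (point.x, point.y)

# Use reduce_point whenever a Point instance is pickled.
copy_reg.pickle(Point, reduce_point)

restored = pickle.loads(pickle.dumps(Point(1, 2)))
```

Here the class itself serves as the callable that rebuilds the object, so no separate {constructor} argument is passed.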
==============================================================================
*py2stdlib-crypt*
crypt~
:platform: Unix
:synopsis: The crypt() function used to check Unix passwords.
This module implements an interface to the crypt(3) routine, which is
a one-way hash function based upon a modified DES algorithm; see the Unix man
page for further details. Possible uses include allowing Python scripts to
accept typed passwords from the user, or attempting to crack Unix passwords with
a dictionary.
Notice that the behavior of this module depends on the actual implementation of
the crypt(3) routine in the running system. Therefore, any
extensions available on the current implementation will also be available on
this module.
crypt(word, salt)~
{word} will usually be a user's password as typed at a prompt or in a graphical
interface. {salt} is usually a random two-character string which will be used
to perturb the DES algorithm in one of 4096 ways. The characters in {salt} must
be in the set ``[./a-zA-Z0-9]``. Returns the hashed password as a string, which
will be composed of characters from the same alphabet as the salt (the first two
characters represent the salt itself).
Since a few crypt(3) extensions allow different values, with
different sizes in the {salt}, it is recommended to use the full crypted
password as salt when checking for a password.
A simple example illustrating typical use:: >
import crypt, getpass, pwd
def login():
    username = raw_input('Python login:')
    cryptedpasswd = pwd.getpwnam(username)[1]
    if cryptedpasswd:
        if cryptedpasswd == 'x' or cryptedpasswd == '*':
            raise NotImplementedError(
                "Sorry, currently no support for shadow passwords")
        cleartext = getpass.getpass()
        # Use the stored hash as the salt so crypt(3) extensions are honored.
        return crypt.crypt(cleartext, cryptedpasswd) == cryptedpasswd
    else:
        # The account has no password set.
        return True
<
==============================================================================
*py2stdlib-csv*
csv~
:synopsis: Write and read tabular data to and from delimited files.
.. versionadded:: 2.3
The so-called CSV (Comma Separated Values) format is the most common import and
export format for spreadsheets and databases. There is no "CSV standard", so
the format is operationally defined by the many applications which read and
write it. The lack of a standard means that subtle differences often exist in
the data produced and consumed by different applications. These differences can
make it annoying to process CSV files from multiple sources. Still, while the
delimiters and quoting characters vary, the overall format is similar enough
that it is possible to write a single module which can efficiently manipulate
such data, hiding the details of reading and writing the data from the
programmer.
The csv (|py2stdlib-csv|) module implements classes to read and write tabular data in CSV
format. It allows programmers to say, "write this data in the format preferred
by Excel," or "read data from this file which was generated by Excel," without
knowing the precise details of the CSV format used by Excel. Programmers can
also describe the CSV formats understood by other applications or define their
own special-purpose CSV formats.
The csv (|py2stdlib-csv|) module's reader and writer objects read and
write sequences. Programmers can also read and write data in dictionary form
using the DictReader and DictWriter classes.
.. note::
This version of the csv (|py2stdlib-csv|) module doesn't support Unicode input. Also,
there are currently some issues regarding ASCII NUL characters. Accordingly,
all input should be UTF-8 or printable ASCII to be safe; see the examples in
section csv-examples. These restrictions will be removed in the future.
.. seealso::
PEP 305 - CSV File API
The Python Enhancement Proposal which proposed this addition to Python.
Module Contents
---------------
The csv (|py2stdlib-csv|) module defines the following functions:
reader(csvfile[, dialect='excel'][, fmtparam])~
Return a reader object which will iterate over lines in the given {csvfile}.
{csvfile} can be any object which supports the iterator protocol and returns a
string each time its next method is called --- file objects and list
objects are both suitable. If {csvfile} is a file object, it must be opened
with the 'b' flag on platforms where that makes a difference. An optional
{dialect} parameter can be given which is used to define a set of parameters
specific to a particular CSV dialect. It may be an instance of a subclass of
the Dialect class or one of the strings returned by the
list_dialects function. The other optional {fmtparam} keyword arguments
can be given to override individual formatting parameters in the current
dialect. For full details about the dialect and formatting parameters, see
section csv-fmt-params.
Each row read from the csv file is returned as a list of strings. No
automatic data type conversion is performed.
A short usage example:: >
>>> import csv
>>> spamReader = csv.reader(open('eggs.csv', 'rb'), delimiter=' ', quotechar='|')
>>> for row in spamReader:
...     print ', '.join(row)
Spam, Spam, Spam, Spam, Spam, Baked Beans
Spam, Lovely Spam, Wonderful Spam
<
.. versionchanged:: 2.5
The parser is now stricter with respect to multi-line quoted fields. Previously,
if a line ended within a quoted field without a terminating newline character, a
newline would be inserted into the returned field. This behavior caused problems
when reading files which contained carriage return characters within fields.
The behavior was changed to return the field without inserting newlines. As a
consequence, if newlines embedded within fields are important, the input should
be split into lines in a manner which preserves the newline characters.
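The following sketch illustrates that consequence with an in-memory string (the sample text is made up). Splitting with the line endings kept preserves the newline embedded in the quoted field; splitting without them loses it:

```python
import csv

text = 'id,comment\n1,"first line\nsecond line"\n'

# Keeping the line endings lets the reader restore the embedded newline.
kept = list(csv.reader(text.splitlines(True)))

# Dropping them joins the two halves of the quoted field directly.
dropped = list(csv.reader(text.splitlines()))
```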
writer(csvfile[, dialect='excel'][, fmtparam])~
Return a writer object responsible for converting the user's data into delimited
strings on the given file-like object. {csvfile} can be any object with a
write method. If {csvfile} is a file object, it must be opened with the
'b' flag on platforms where that makes a difference. An optional {dialect}
parameter can be given which is used to define a set of parameters specific to a
particular CSV dialect. It may be an instance of a subclass of the
Dialect class or one of the strings returned by the
list_dialects function. The other optional {fmtparam} keyword arguments
can be given to override individual formatting parameters in the current
dialect. For full details about the dialect and formatting parameters, see
section csv-fmt-params. To make it
as easy as possible to interface with modules which implement the DB API, the
value None is written as the empty string. While this isn't a
reversible transformation, it makes it easier to dump SQL NULL data values to
CSV files without preprocessing the data returned from a ``cursor.fetch*`` call.
All other non-string data are stringified with str before being written.
A short usage example:: >
>>> import csv
>>> spamWriter = csv.writer(open('eggs.csv', 'wb'), delimiter=' ',
...                         quotechar='|', quoting=csv.QUOTE_MINIMAL)
>>> spamWriter.writerow(['Spam'] * 5 + ['Baked Beans'])
>>> spamWriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])
<
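The handling of None and of non-string values can be checked without touching the file system. LineCollector below is a hypothetical stand-in for a file, relying only on the fact that the writer needs an object with a write method:

```python
import csv

class LineCollector(object):
    """Hypothetical stand-in for a file: the writer only calls write()."""
    def __init__(self):
        self.lines = []
    def write(self, text):
        self.lines.append(text)

sink = LineCollector()
csv.writer(sink).writerow(['spam', None, 42, 1.5])

# None becomes an empty field; the numbers are converted to strings.
written = ''.join(sink.lines)    # 'spam,,42,1.5\r\n'
```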
register_dialect(name[, dialect][, fmtparam])~
Associate {dialect} with {name}. {name} must be a string or Unicode object. The
dialect can be specified either by passing a sub-class of Dialect, or
by {fmtparam} keyword arguments, or both, with keyword arguments overriding
parameters of the dialect. For full details about the dialect and formatting
parameters, see section csv-fmt-params.
unregister_dialect(name)~
Delete the dialect associated with {name} from the dialect registry. An
Error is raised if {name} is not a registered dialect name.
get_dialect(name)~
Return the dialect associated with {name}. An Error is raised if {name}
is not a registered dialect name.
.. versionchanged:: 2.5
This function now returns an immutable Dialect. Previously an
instance of the requested dialect was returned. Users could modify the
underlying class, changing the behavior of active readers and writers.
list_dialects()~
Return the names of all registered dialects.
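A short sketch of the registry functions working together; the dialect name 'pipes' is an arbitrary choice for the example:

```python
import csv

# Register a dialect purely from formatting keyword arguments.
csv.register_dialect('pipes', delimiter='|', quoting=csv.QUOTE_NONE)
registered = 'pipes' in csv.list_dialects()

# A registered name can be passed wherever a dialect is expected.
rows = list(csv.reader(['a|b|c'], 'pipes'))

# After unregistering, looking the name up raises csv.Error.
csv.unregister_dialect('pipes')
try:
    csv.get_dialect('pipes')
    lookup_failed = False
except csv.Error:
    lookup_failed = True
```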
field_size_limit([new_limit])~
Returns the current maximum field size allowed by the parser. If {new_limit} is
given, this becomes the new limit.
.. versionadded:: 2.5
The csv (|py2stdlib-csv|) module defines the following classes:
DictReader(csvfile[, fieldnames=None[, restkey=None[, restval=None[, dialect='excel'[, *args, **kwds]]]]])~
Create an object which operates like a regular reader but maps the information
read into a dict whose keys are given by the optional {fieldnames} parameter.
If the {fieldnames} parameter is omitted, the values in the first row of the
{csvfile} will be used as the fieldnames. If the row read has more fields
than the fieldnames sequence, the remaining data is added as a sequence
keyed by the value of {restkey}. If the row read has fewer fields than the
fieldnames sequence, the remaining keys take the value of the optional
{restval} parameter. Any other optional or keyword arguments are passed to
the underlying reader instance.
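The roles of {restkey} and {restval} can be seen with rows that are too long and too short. The sample lines and the key name 'rest' are invented for this sketch:

```python
import csv

lines = ['name,qty',               # first row supplies the fieldnames
         'spam,1,extra1,extra2',   # two fields too many
         'eggs']                   # one field too few

reader = csv.DictReader(lines, restkey='rest', restval='n/a')
rows = list(reader)

surplus = rows[0]['rest']    # extra fields collected under restkey
filler = rows[1]['qty']      # missing field filled with restval
```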
DictWriter(csvfile, fieldnames[, restval=''[, extrasaction='raise'[, dialect='excel'[, *args, **kwds]]]])~
Create an object which operates like a regular writer but maps dictionaries onto
output rows. The {fieldnames} parameter identifies the order in which values in
the dictionary passed to the writerow method are written to the
{csvfile}. The optional {restval} parameter specifies the value to be written
if the dictionary is missing a key in {fieldnames}. If the dictionary passed to
the writerow method contains a key not found in {fieldnames}, the
optional {extrasaction} parameter indicates what action to take. If it is set
to ``'raise'`` a ValueError is raised. If it is set to ``'ignore'``,
extra values in the dictionary are ignored. Any other optional or keyword
arguments are passed to the underlying writer instance.
Note that unlike the DictReader class, the {fieldnames} parameter of
the DictWriter is not optional. Since Python's dict objects
are not ordered, there is not enough information available to deduce the order
in which the row should be written to the {csvfile}.
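A sketch of the two {extrasaction} settings, again writing into a hypothetical in-memory collector instead of a real file:

```python
import csv

class LineCollector(object):
    """Hypothetical stand-in for a file: the writer only calls write()."""
    def __init__(self):
        self.lines = []
    def write(self, text):
        self.lines.append(text)

sink = LineCollector()

# 'ignore' drops unknown keys; restval fills in the missing 'qty'.
lenient = csv.DictWriter(sink, fieldnames=['name', 'qty'],
                         restval='0', extrasaction='ignore')
lenient.writerow({'name': 'spam', 'color': 'pink'})

# The default, 'raise', rejects a dictionary with an unknown key.
strict = csv.DictWriter(sink, fieldnames=['name', 'qty'])
try:
    strict.writerow({'name': 'eggs', 'qty': 2, 'color': 'white'})
    rejected = False
except ValueError:
    rejected = True

written = ''.join(sink.lines)
```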
Dialect~
The Dialect class is a container class relied on primarily for its
attributes, which are used to define the parameters for a specific
reader or writer instance.
excel()~
The excel class defines the usual properties of an Excel-generated CSV
file. It is registered with the dialect name ``'excel'``.
excel_tab()~
The excel_tab class defines the usual properties of an Excel-generated
TAB-delimited file. It is registered with the dialect name ``'excel-tab'``.
Sniffer()~
The Sniffer class is used to deduce the format of a CSV file.
The Sniffer class provides two methods:
sniff(sample[, delimiters=None])~
Analyze the given {sample} and return a Dialect subclass
reflecting the parameters found. If the optional {delimiters} parameter
is given, it is interpreted as a string containing possible valid
delimiter characters.
has_header(sample)~
Analyze the sample text (presumed to be in CSV format) and return
True if the first row appears to be a series of column headers.
An example for Sniffer use:: >
import csv
csvfile = open("example.csv", "rb")
dialect = csv.Sniffer().sniff(csvfile.read(1024))
csvfile.seek(0)    # rewind so the reader starts at the first row
reader = csv.reader(csvfile, dialect)
# ... process CSV file contents here ...
<
The csv (|py2stdlib-csv|) module defines the following constants:
QUOTE_ALL~
Instructs writer objects to quote all fields.
QUOTE_MINIMAL~
Instructs writer objects to only quote those fields which contain
special characters such as {delimiter}, {quotechar} or any of the characters in
{lineterminator}.
QUOTE_NONNUMERIC~
Instructs writer objects to quote all non-numeric fields.
Instructs the reader to convert all non-quoted fields to type {float}.
QUOTE_NONE~
Instructs writer objects to never quote fields. When the current
{delimiter} occurs in output data it is preceded by the current {escapechar}
character. If {escapechar} is not set, the writer will raise Error if
any characters that require escaping are encountered.
Instructs reader to perform no special processing of quote characters.
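To compare the constants side by side, the same row can be written under each setting. The render helper is hypothetical and simply captures what the writer emits:

```python
import csv

def render(row, **fmtparams):
    """Hypothetical helper: format one row and return the text written."""
    chunks = []

    class Sink(object):
        def write(self, text):
            chunks.append(text)

    csv.writer(Sink(), **fmtparams).writerow(row)
    return ''.join(chunks)

row = ['a b', 'c,d', 3]
minimal = render(row, quoting=csv.QUOTE_MINIMAL)         # a b,"c,d",3
everything = render(row, quoting=csv.QUOTE_ALL)          # "a b","c,d","3"
nonnumeric = render(row, quoting=csv.QUOTE_NONNUMERIC)   # "a b","c,d",3
escaped = render(row, quoting=csv.QUOTE_NONE, escapechar='\\')  # a b,c\,d,3

# Without an escapechar, QUOTE_NONE cannot represent the embedded comma.
try:
    render(row, quoting=csv.QUOTE_NONE)
    unescapable = False
except csv.Error:
    unescapable = True

# On input, QUOTE_NONNUMERIC converts unquoted fields to floats.
parsed = list(csv.reader(['"x",1.5'], quoting=csv.QUOTE_NONNUMERIC))
```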
The csv (|py2stdlib-csv|) module defines the following exception:
Error~
Raised by any of the functions when an error is detected.
Dialects and Formatting Parameters
----------------------------------
To make it easier to specify the format of input and output records, specific
formatting parameters are grouped together into dialects. A dialect is a
subclass of the Dialect class having a set of specific methods and a
single validate method. When creating reader or
writer objects, the programmer can specify a string or a subclass of
the Dialect class as the dialect parameter. In addition to, or instead
of, the {dialect} parameter, the programmer can also specify individual
formatting parameters, which have the same names as the attributes defined below
for the Dialect class.
Dialects support the following attributes:
Dialect.delimiter~
A one-character string used to separate fields. It defaults to ``','``.
Dialect.doublequote~
Controls how instances of {quotechar} appearing inside a field should
themselves be quoted. When True, the character is doubled. When
False, the {escapechar} is used as a prefix to the {quotechar}. It
defaults to True.
On output, if {doublequote} is False and no {escapechar} is set,
Error is raised if a {quotechar} is found in a field.
Dialect.escapechar~
A one-character string used by the writer to escape the {delimiter} if {quoting}
is set to QUOTE_NONE and the {quotechar} if {doublequote} is
False. On reading, the {escapechar} removes any special meaning from
the following character. It defaults to None, which disables escaping.
Dialect.lineterminator~
The string used to terminate lines produced by the writer. It defaults
to ``'\r\n'``.
.. note:: >
The reader is hard-coded to recognise either ``'\r'`` or ``'\n'`` as
end-of-line, and ignores {lineterminator}. This behavior may change in the
future.
<
Dialect.quotechar~
A one-character string used to quote fields containing special characters, such
as the {delimiter} or {quotechar}, or which contain new-line characters. It
defaults to ``'"'``.
Dialect.quoting~
Controls when quotes should be generated by the writer and recognised by the
reader. It can take on any of the QUOTE_* constants (see section
csv-contents) and defaults to QUOTE_MINIMAL.
Dialect.skipinitialspace~
When True, whitespace immediately following the {delimiter} is ignored.
The default is False.
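A few of these attributes are easiest to understand from the reading side. The sketch below passes them as individual formatting parameters; the input lines are invented for the example:

```python
import csv

# A keyword argument overrides one parameter of the chosen dialect.
semicolons = list(csv.reader(['a;b;c'], dialect='excel', delimiter=';'))

# skipinitialspace drops whitespace that directly follows a delimiter.
raw = list(csv.reader(['a, b, c']))
trimmed = list(csv.reader(['a, b, c'], skipinitialspace=True))

# doublequote (the default): a doubled quotechar inside a quoted field
# stands for one literal quote character.
doubled = list(csv.reader(['"say ""hi""",x']))

# escapechar removes the special meaning of the character after it.
escaped = list(csv.reader(['a\\,b,c'], escapechar='\\'))
```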
Reader Objects
--------------
Reader objects (DictReader instances and objects returned by the
reader function) have the following public methods:
csvreader.next()~
Return the next row of the reader's iterable object as a list, parsed according
to the current dialect.
Reader objects have the following public attributes:
csvreader.dialect~
A read-only description of the dialect in use by the parser.
csvreader.line_num~
The number of lines read from the source iterator. This is not the same as the
number of records returned, as records can span multiple lines.
.. versionadded:: 2.5
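The difference between lines read and records returned shows up as soon as a quoted field spans a line break. A minimal sketch with made-up input:

```python
import csv

lines = ['a,"multi\n',     # record 1 starts here ...
         'line",c\n',      # ... and ends here
         'x,y,z\n']        # record 2

reader = csv.reader(lines)
rows = list(reader)

records = len(rows)            # two records were returned
lines_read = reader.line_num   # but three source lines were consumed
```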
DictReader objects have the following public attribute:
csvreader.fieldnames~
If not passed as a parameter when creating the object, this attribute is
initialized upon first access or when the first record is read from the
file.
.. versionchanged:: 2.6
Writer Objects
--------------
Writer objects (DictWriter instances and objects returned by
the writer function) have the following public methods. A {row} must be
a sequence of strings or numbers for Writer objects and a dictionary
mapping fieldnames to strings or numbers (by passing them through str
first) for DictWriter objects. Note that complex numbers are written
out surrounded by parens. This may cause some problems for other programs which
read CSV files (assuming they support complex numbers at all).
csvwriter.writerow(row)~
Write the {row} parameter to the writer's file object, formatted according to
the current dialect.
csvwriter.writerows(rows)~
Write all the {rows} parameters (a list of {row} objects as described above) to
the writer's file object, formatted according to the current dialect.
Writer objects have the following public attribute:
csvwriter.dialect~
A read-only description of the dialect in use by the writer.
DictWriter objects have the following public method:
DictWriter.writeheader()~
Write a row with the field names (as specified in the constructor).
.. versionadded:: 2.7
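The writer methods can be combined as in the following sketch, which writes into a hypothetical in-memory collector so the output can be inspected directly:

```python
import csv

class LineCollector(object):
    """Hypothetical stand-in for a file: the writer only calls write()."""
    def __init__(self):
        self.lines = []
    def write(self, text):
        self.lines.append(text)

sink = LineCollector()

dict_writer = csv.DictWriter(sink, fieldnames=['name', 'qty'])
dict_writer.writeheader()                          # field names as first row
dict_writer.writerows([{'name': 'spam', 'qty': 2},
                       {'name': 'eggs', 'qty': 12}])

# A plain writer takes sequences; complex numbers come out in parentheses.
csv.writer(sink).writerow(['z', 1 + 2j])

written = ''.join(sink.lines)
```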
Examples
--------
The simplest example of reading a CSV file:: >
import csv
reader = csv.reader(open("some.csv", "rb"))
for row in reader:
    print row
<
Reading a file with an alternate format:: >
import csv
reader = csv.reader(open("passwd", "rb"), delimiter=':', quoting=csv.QUOTE_NONE)
for row in reader:
    print row
<
The corresponding simplest possible writing example is:: >
import csv
writer = csv.writer(open("some.csv", "wb"))
writer.writerows(someiterable)
<
Registering a new dialect:: >
import csv
csv.register_dialect('unixpwd', delimiter=':', quoting=csv.QUOTE_NONE)
reader = csv.reader(open("passwd", "rb"), 'unixpwd')
<
A slightly more advanced use of the reader --- catching and reporting errors:: >
import csv, sys
filename = "some.csv"
reader = csv.reader(open(filename, "rb"))
try:
    for row in reader:
        print row
except csv.Error as e:
    sys.exit('file %s, line %d: %s' % (filename, reader.line_num, e))
<
And while the module doesn't directly support parsing strings, it can easily be
done:: >
import csv
for row in csv.reader(['one,two,three']):
    print row
<
The csv (|py2stdlib-csv|) module doesn't directly support reading and writing Unicode, but
it is 8-bit-clean save for some problems with ASCII NUL characters. So you can
write functions or classes that handle the encoding and decoding for you as long
as you avoid encodings like UTF-16 that use NULs. UTF-8 is recommended.
unicode_csv_reader below is a generator that wraps csv.reader
to handle Unicode CSV data (a list of Unicode strings). utf_8_encoder
is a generator that encodes the Unicode strings as UTF-8, one string (or row) at
a time. The encoded strings are parsed by the CSV reader, and
unicode_csv_reader decodes the UTF-8-encoded cells back into Unicode:: >
import csv
def unicode_csv_reader(unicode_csv_data, dialect=csv.excel, **kwargs):
    # csv.py doesn't do Unicode; encode temporarily as UTF-8:
    csv_reader = csv.reader(utf_8_encoder(unicode_csv_data),
                            dialect=dialect, **kwargs)
    for row in csv_reader:
        # decode UTF-8 back to Unicode, cell by cell:
        yield [unicode(cell, 'utf-8') for cell in row]
def utf_8_encoder(unicode_csv_data):
    for line in unicode_csv_data:
        yield line.encode('utf-8')
<
For all other encodings the following UnicodeReader and
UnicodeWriter classes can be used. They take an additional {encoding}
parameter in their constructor and make sure that the data passes the real
reader or writer encoded as UTF-8:: >
import csv, codecs, cStringIO
class UTF8Recoder:
    """
    Iterator that reads an encoded stream and reencodes the input to UTF-8
    """
    def __init__(self, f, encoding):
        self.reader = codecs.getreader(encoding)(f)
    def __iter__(self):
        return self
    def next(self):
        return self.reader.next().encode("utf-8")
class UnicodeReader:
    """
    A CSV reader which will iterate over lines in the CSV file "f",
    which is encoded in the given encoding.
    """
    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
        f = UTF8Recoder(f, encoding)
        self.reader = csv.reader(f, dialect=dialect, **kwds)
    def next(self):
        row = self.reader.next()
        return [unicode(s, "utf-8") for s in row]
    def __iter__(self):
        return self
class UnicodeWriter:
    """
    A CSV writer which will write rows to CSV file "f",
    which is encoded in the given encoding.
    """
    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
        # Redirect output to a queue
        self.queue = cStringIO.StringIO()
        self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
        self.stream = f
        self.encoder = codecs.getincrementalencoder(encoding)()
    def writerow(self, row):
        self.writer.writerow([s.encode("utf-8") for s in row])
        # Fetch UTF-8 output from the queue ...
        data = self.queue.getvalue()
        data = data.decode("utf-8")
        # ... and reencode it into the target encoding
        data = self.encoder.encode(data)
        # write to the target stream
        self.stream.write(data)
        # empty queue
        self.queue.truncate(0)
    def writerows(self, rows):
        for row in rows:
            self.writerow(row)
<
==============================================================================
*py2stdlib-ctypes*
ctypes~
:synopsis: A foreign function library for Python.
.. versionadded:: 2.5
ctypes (|py2stdlib-ctypes|) is a foreign function library for Python. It provides C compatible
data types, and allows calling functions in DLLs or shared libraries. It can be
used to wrap these libraries in pure Python.
ctypes tutorial
---------------
Note: The code samples in this tutorial use doctest (|py2stdlib-doctest|) to make sure that
they actually work. Since some code samples behave differently under Linux,
Windows, or Mac OS X, they contain doctest directives in comments.
Note: Some code samples reference the ctypes c_int type. This type is
an alias for the c_long type on 32-bit systems. So, you should not be
confused if c_long is printed if you would expect c_int ---
they are actually the same type.
Loading dynamic link libraries
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ctypes (|py2stdlib-ctypes|) exports the {cdll}, and on Windows {windll} and {oledll}
objects, for loading dynamic link libraries.
You load libraries by accessing them as attributes of these objects. {cdll}
loads libraries which export functions using the standard ``cdecl`` calling
convention, while {windll} libraries call functions using the ``stdcall``
calling convention. {oledll} also uses the ``stdcall`` calling convention, and
assumes the functions return a Windows HRESULT error code. The error
code is used to automatically raise a WindowsError exception when the
function call fails.
Here are some examples for Windows. Note that ``msvcrt`` is the MS standard C
library containing most standard C functions, and uses the cdecl calling
convention:: >
>>> from ctypes import *
>>> print windll.kernel32 # doctest: +WINDOWS
<WinDLL 'kernel32', handle ... at ...>
>>> print cdll.msvcrt # doctest: +WINDOWS
<CDLL 'msvcrt', handle ... at ...>
>>> libc = cdll.msvcrt # doctest: +WINDOWS
>>>
<
Windows appends the usual ``.dll`` file suffix automatically.
On Linux, it is required to specify the filename {including} the extension to
load a library, so attribute access can not be used to load libraries. Either the
LoadLibrary method of the dll loaders should be used, or you should load
the library by creating an instance of CDLL by calling the constructor:: >
>>> cdll.LoadLibrary("libc.so.6") # doctest: +LINUX
<CDLL 'libc.so.6', handle ... at ...>
>>> libc = CDLL("libc.so.6") # doctest: +LINUX
>>> libc # doctest: +LINUX
<CDLL 'libc.so.6', handle ... at ...>
>>>
<
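Where a script has to run on more than one platform, the library name does not have to be hard-coded. The sketch below uses ctypes.util.find_library, which is not covered in this tutorial section, to look up the C runtime; load_c_library is a hypothetical helper name, and whether a library is found at all depends on the system:

```python
import ctypes
import ctypes.util

def load_c_library():
    """Return a CDLL for the C runtime, or None if it cannot be located."""
    name = ctypes.util.find_library("c")    # e.g. "libc.so.6" on Linux
    if name is None:
        return None
    return ctypes.CDLL(name)                # cdecl calling convention

libc = load_c_library()
if libc is not None:
    absolute = libc.abs(-5)    # call a C function through the handle
```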
Accessing functions from loaded dlls
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Functions are accessed as attributes of dll objects:: >
>>> from ctypes import *
>>> libc.printf
<_FuncPtr object at 0x...>
>>> print windll.kernel32.GetModuleHandleA # doctest: +WINDOWS
<_FuncPtr object at 0x...>
>>> print windll.kernel32.MyOwnFunction # doctest: +WINDOWS
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "ctypes.py", line 239, in __getattr__
func = _StdcallFuncPtr(name, self)
AttributeError: function 'MyOwnFunction' not found
>>>
<
Note that win32 system dlls like ``kernel32`` and ``user32`` often export ANSI
as well as UNICODE versions of a function. The UNICODE version is exported with
a ``W`` appended to the name, while the ANSI version is exported with an ``A``
appended to the name. The win32 ``GetModuleHandle`` function, which returns a
{module handle} for a given module name, has the following C prototype, and a
macro is used to expose one of them as ``GetModuleHandle`` depending on whether
UNICODE is defined or not:: >
/* ANSI version */
HMODULE GetModuleHandleA(LPCSTR lpModuleName);
/* UNICODE version */
HMODULE GetModuleHandleW(LPCWSTR lpModuleName);
<
{windll} does not try to select one of them by magic; you must access the
version you need by specifying ``GetModuleHandleA`` or ``GetModuleHandleW``
explicitly, and then call it with strings or unicode strings
respectively.
Sometimes, dlls export functions with names which aren't valid Python
identifiers, like ``"??2@YAPAXI@Z"``. In this case you have to use
getattr to retrieve the function:: >
>>> getattr(cdll.msvcrt, "??2@YAPAXI@Z") # doctest: +WINDOWS
<_FuncPtr object at 0x...>
>>>
<
On Windows, some dlls export functions not by name but by ordinal. These
functions can be accessed by indexing the dll object with the ordinal number:: >
>>> cdll.kernel32[1] # doctest: +WINDOWS
<_FuncPtr object at 0x...>
>>> cdll.kernel32[0] # doctest: +WINDOWS
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "ctypes.py", line 310, in __getitem__
func = _StdcallFuncPtr(name, self)
AttributeError: function ordinal 0 not found
>>>
<
Calling functions
^^^^^^^^^^^^^^^^^
You can call these functions like any other Python callable. This example uses
the ``time()`` function, which returns system time in seconds since the Unix
epoch, and the ``GetModuleHandleA()`` function, which returns a win32 module
handle.
This example calls both functions with a NULL pointer (``None`` should be used
as the NULL pointer):: >
>>> print libc.time(None) # doctest: +SKIP
1150640792
>>> print hex(windll.kernel32.GetModuleHandleA(None)) # doctest: +WINDOWS
0x1d000000
>>>
<
ctypes (|py2stdlib-ctypes|) tries to protect you from calling functions with the wrong number
of arguments or the wrong calling convention. Unfortunately this only works on
Windows. It does this by examining the stack after the function returns, so
although an error is raised the function {has} been called:: >
>>> windll.kernel32.GetModuleHandleA() # doctest: +WINDOWS
Traceback (most recent call last):
File "<stdin>", line 1, in ?
ValueError: Procedure probably called with not enough arguments (4 bytes missing)
>>> windll.kernel32.GetModuleHandleA(0, 0) # doctest: +WINDOWS
Traceback (most recent call last):
File "<stdin>", line 1, in ?
ValueError: Procedure probably called with too many arguments (4 bytes in excess)
>>>
<
The same exception is raised when you call an ``stdcall`` function with the
``cdecl`` calling convention, or vice versa:: >
>>> cdll.kernel32.GetModuleHandleA(None) # doctest: +WINDOWS
Traceback (most recent call last):
File "<stdin>", line 1, in ?
ValueError: Procedure probably called with not enough arguments (4 bytes missing)
>>>
>>> windll.msvcrt.printf("spam") # doctest: +WINDOWS
Traceback (most recent call last):
File "<stdin>", line 1, in ?
ValueError: Procedure probably called with too many arguments (4 bytes in excess)
>>>
<
To find out the correct calling convention you have to look into the C header
file or the documentation for the function you want to call.
On Windows, ctypes (|py2stdlib-ctypes|) uses win32 structured exception handling to prevent
crashes from general protection faults when functions are called with invalid
argument values:: >
>>> windll.kernel32.GetModuleHandleA(32) # doctest: +WINDOWS
Traceback (most recent call last):
File "<stdin>", line 1, in ?
WindowsError: exception: access violation reading 0x00000020
>>>
<
There are, however, enough ways to crash Python with ctypes (|py2stdlib-ctypes|), so you
should be careful anyway.
``None``, integers, longs, byte strings and unicode strings are the only native
Python objects that can directly be used as parameters in these function calls.
``None`` is passed as a C ``NULL`` pointer, byte strings and unicode strings are
passed as pointer to the memory block that contains their data (char *
or wchar_t *). Python integers and Python longs are passed as the
platform's default C int type; their value is masked to fit into the C
type.
Before we move on to calling functions with other parameter types, we have to learn
more about ctypes (|py2stdlib-ctypes|) data types.
Fundamental data types
^^^^^^^^^^^^^^^^^^^^^^
ctypes (|py2stdlib-ctypes|) defines a number of primitive C compatible data types:
+----------------------+----------------------------------------+----------------------------+
| ctypes type          | C type                                 | Python type                |
+======================+========================================+============================+
| c_char               | char                                   | 1-character string         |
+----------------------+----------------------------------------+----------------------------+
| c_wchar              | wchar_t                                | 1-character unicode string |
+----------------------+----------------------------------------+----------------------------+
| c_byte               | char                                   | int/long                   |
+----------------------+----------------------------------------+----------------------------+
| c_ubyte              | unsigned char                          | int/long                   |
+----------------------+----------------------------------------+----------------------------+
| c_short              | short                                  | int/long                   |
+----------------------+----------------------------------------+----------------------------+
| c_ushort             | unsigned short                         | int/long                   |
+----------------------+----------------------------------------+----------------------------+
| c_int                | int                                    | int/long                   |
+----------------------+----------------------------------------+----------------------------+
| c_uint               | unsigned int                           | int/long                   |
+----------------------+----------------------------------------+----------------------------+
| c_long               | long                                   | int/long                   |
+----------------------+----------------------------------------+----------------------------+
| c_ulong              | unsigned long                          | int/long                   |
+----------------------+----------------------------------------+----------------------------+
| c_longlong           | __int64 or long long                   | int/long                   |
+----------------------+----------------------------------------+----------------------------+
| c_ulonglong          | unsigned __int64 or                    | int/long                   |
|                      | unsigned long long                     |                            |
+----------------------+----------------------------------------+----------------------------+
| c_float              | float                                  | float                      |
+----------------------+----------------------------------------+----------------------------+
| c_double             | double                                 | float                      |
+----------------------+----------------------------------------+----------------------------+
| c_longdouble         | long double                            | float                      |
+----------------------+----------------------------------------+----------------------------+
| c_char_p             | char * (NUL terminated)                | string or ``None``         |
+----------------------+----------------------------------------+----------------------------+
| c_wchar_p            | wchar_t * (NUL terminated)             | unicode or ``None``        |
+----------------------+----------------------------------------+----------------------------+
| c_void_p             | void *                                 | int/long or ``None``       |
+----------------------+----------------------------------------+----------------------------+
All these types can be created by calling them with an optional initializer of
the correct type and value:: >
>>> c_int()
c_long(0)
>>> c_char_p("Hello, World")
c_char_p('Hello, World')
>>> c_ushort(-3)
c_ushort(65533)
>>>
<
Since these types are mutable, their value can also be changed afterwards:: >
>>> i = c_int(42)
>>> print i
c_long(42)
>>> print i.value
42
>>> i.value = -99
>>> print i.value
-99
>>>
<
Assigning a new value to instances of the pointer types c_char_p,
c_wchar_p, and c_void_p changes the {memory location} they
point to, {not the contents} of the memory block (of course not, because Python
strings are immutable):: >
>>> s = "Hello, World"
>>> c_s = c_char_p(s)
>>> print c_s
c_char_p('Hello, World')
>>> c_s.value = "Hi, there"
>>> print c_s
c_char_p('Hi, there')
>>> print s # first string is unchanged
Hello, World
>>>
<
You should be careful, however, not to pass them to functions expecting pointers
to mutable memory. If you need mutable memory blocks, ctypes has a
create_string_buffer function which creates these in various ways. The
current memory block contents can be accessed (or changed) with the ``raw``
property; if you want to access it as a NUL-terminated string, use the ``value``
property:: >
>>> from ctypes import *
>>> p = create_string_buffer(3) # create a 3 byte buffer, initialized to NUL bytes
>>> print sizeof(p), repr(p.raw)
3 '\x00\x00\x00'
>>> p = create_string_buffer("Hello") # create a buffer containing a NUL terminated string
>>> print sizeof(p), repr(p.raw)
6 'Hello\x00'
>>> print repr(p.value)
'Hello'
>>> p = create_string_buffer("Hello", 10) # create a 10 byte buffer
>>> print sizeof(p), repr(p.raw)
10 'Hello\x00\x00\x00\x00\x00'
>>> p.value = "Hi"
>>> print sizeof(p), repr(p.raw)
10 'Hi\x00lo\x00\x00\x00\x00\x00'
>>>
<
The create_string_buffer function replaces the c_buffer function
(which is still available as an alias), as well as the c_string function
from earlier ctypes releases. To create a mutable memory block containing
unicode characters of the C type wchar_t, use the
create_unicode_buffer function.
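The wide-character counterpart can be sketched as follows. This is a minimal illustration rather than part of the original tutorial; it is written so that it should run unchanged on Python 2.7 and Python 3, and the variable name ``buf`` is only illustrative.

```python
from ctypes import create_unicode_buffer, sizeof, c_wchar

# A buffer of 10 wchar_t items, pre-filled with "Hello" and NUL-padded.
buf = create_unicode_buffer(u"Hello", 10)

print(len(buf))                              # number of wchar_t items
print(sizeof(buf) == 10 * sizeof(c_wchar))   # sizeof() counts bytes, not characters
print(buf.value == u"Hello")                 # text up to the first NUL

# Unlike a Python unicode string, the buffer can be changed in place.
buf.value = u"Hi"
print(buf.value == u"Hi")
```

Note that ``len()`` reports the number of items while ``sizeof()`` reports bytes, and the size of ``wchar_t`` differs between platforms.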
Calling functions, continued
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Note that printf prints to the real standard output channel, {not} to
sys.stdout, so these examples will only work at the console prompt, not
from within {IDLE} or {PythonWin}:: >
>>> printf = libc.printf
>>> printf("Hello, %s\n", "World!")
Hello, World!
14
>>> printf("Hello, %S\n", u"World!")
Hello, World!
14
>>> printf("%d bottles of beer\n", 42)
42 bottles of beer
19
>>> printf("%f bottles of beer\n", 42.5)
Traceback (most recent call last):
File "<stdin>", line 1, in ?
ArgumentError: argument 2: exceptions.TypeError: Don't know how to convert parameter 2
>>>
<
As has been mentioned before, all Python types except integers, strings, and
unicode strings have to be wrapped in their corresponding ctypes (|py2stdlib-ctypes|) type, so
that they can be converted to the required C data type:: >
>>> printf("An int %d, a double %f\n", 1234, c_double(3.14))
An int 1234, a double 3.140000
31
>>>
<
Calling functions with your own custom data types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can also customize ctypes (|py2stdlib-ctypes|) argument conversion to allow instances of
your own classes to be used as function arguments. ctypes (|py2stdlib-ctypes|) looks for an
_as_parameter_ attribute and uses this as the function argument. Of
course, it must be an integer, string, or unicode string:: >
>>> class Bottles(object):
...     def __init__(self, number):
...         self._as_parameter_ = number
...
>>> bottles = Bottles(42)
>>> printf("%d bottles of beer\n", bottles)
42 bottles of beer
19
>>>
<
If you don't want to store the instance's data in the _as_parameter_
instance variable, you could define a property which makes the data
available.
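A property-based variant might look like the sketch below. It assumes a POSIX system, where ``CDLL(None)`` gives access to the symbols of the C library already loaded into the process; the ``Bottles`` class layout and the use of the C ``abs`` function are illustrative choices, not something prescribed by ctypes.

```python
import ctypes

# Assumption: POSIX. CDLL(None) exposes the C library symbols of the
# running process; on Windows a named DLL would be needed instead.
libc = ctypes.CDLL(None)

class Bottles(object):
    def __init__(self, number):
        self.number = number

    @property
    def _as_parameter_(self):
        # Evaluated each time ctypes converts the object to an argument.
        return self.number

bottles = Bottles(-42)
result = libc.abs(bottles)   # the C function receives the integer -42
print(result)
```

Because the property is evaluated at call time, later changes to ``bottles.number`` are picked up automatically.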
Specifying the required argument types (function prototypes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
It is possible to specify the required argument types of functions exported from
DLLs by setting the argtypes attribute.
argtypes must be a sequence of C data types (the ``printf`` function is
probably not a good example here, because it takes a variable number and
different types of parameters depending on the format string; on the other hand,
it is quite handy for experimenting with this feature):: >
>>> printf.argtypes = [c_char_p, c_char_p, c_int, c_double]
>>> printf("String '%s', Int %d, Double %f\n", "Hi", 10, 2.2)
String 'Hi', Int 10, Double 2.200000
37
>>>
<
Specifying a format protects against incompatible argument types (just as a
prototype for a C function), and tries to convert the arguments to valid types:: >
>>> printf("%d %d %d", 1, 2, 3)
Traceback (most recent call last):
File "<stdin>", line 1, in ?
ArgumentError: argument 2: exceptions.TypeError: wrong type
>>> printf("%s %d %f\n", "X", 2, 3)
X 2 3.000000
13
>>>
<
If you have defined your own classes which you pass to function calls, you have
to implement a from_param class method for them to be able to use them
in the argtypes sequence. The from_param class method receives
the Python object passed to the function call; it should do a type check or
whatever is needed to make sure this object is acceptable, and then return the
object itself, its _as_parameter_ attribute, or whatever you want to
pass as the C function argument in this case. Again, the result should be an
integer, string, unicode, a ctypes (|py2stdlib-ctypes|) instance, or an object with an
_as_parameter_ attribute.
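The from_param protocol described above can be sketched as follows. The example assumes a POSIX system (``CDLL(None)``) and reuses a hypothetical ``Bottles`` class together with the C ``abs`` function purely for illustration.

```python
import ctypes

# Assumption: POSIX. CDLL(None) exposes the C library of the process.
libc = ctypes.CDLL(None)

class Bottles(object):
    def __init__(self, number):
        self.number = number

    @classmethod
    def from_param(cls, obj):
        # Type check first, then return something ctypes can pass on.
        if not isinstance(obj, cls):
            raise TypeError("Bottles instance expected")
        return ctypes.c_int(obj.number)

c_abs = libc.abs
c_abs.argtypes = [Bottles]       # any object with a from_param method works here
c_abs.restype = ctypes.c_int

accepted = c_abs(Bottles(-7))
print(accepted)

try:
    c_abs(-7)                    # a plain int fails the type check
    rejected = False
except ctypes.ArgumentError:
    rejected = True
print(rejected)
```

An exception raised inside from_param surfaces to the caller as ``ctypes.ArgumentError``, which is why the second call is caught that way.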
Return types
^^^^^^^^^^^^
By default, functions are assumed to return the C int type. Other
return types can be specified by setting the restype attribute of the
function object.
Here is a more advanced example. It uses the ``strchr`` function, which expects
a string pointer and a char, and returns a pointer to a string:: >
>>> strchr = libc.strchr
>>> strchr("abcdef", ord("d")) # doctest: +SKIP
8059983
>>> strchr.restype = c_char_p # c_char_p is a pointer to a string
>>> strchr("abcdef", ord("d"))
'def'
>>> print strchr("abcdef", ord("x"))
None
>>>
<
If you want to avoid the ``ord("x")`` calls above, you can set the
argtypes attribute, and the second argument will be converted from a
single character Python string into a C char:: >
>>> strchr.restype = c_char_p
>>> strchr.argtypes = [c_char_p, c_char]
>>> strchr("abcdef", "d")
'def'
>>> strchr("abcdef", "def")
Traceback (most recent call last):
File "<stdin>", line 1, in ?
ArgumentError: argument 2: exceptions.TypeError: one character string expected
>>> print strchr("abcdef", "x")
None
>>> strchr("abcdef", "d")
'def'
>>>
<
You can also use a callable Python object (a function or a class for example) as
the restype attribute, if the foreign function returns an integer. The
callable will be called with the {integer} the C function returns, and the
result of this call will be used as the result of your function call. This is
useful to check for error return values and automatically raise an exception:: >
>>> GetModuleHandle = windll.kernel32.GetModuleHandleA # doctest: +WINDOWS
>>> def ValidHandle(value):
...     if value == 0:
...         raise WinError()
...     return value
...
>>>
>>> GetModuleHandle.restype = ValidHandle # doctest: +WINDOWS
>>> GetModuleHandle(None) # doctest: +WINDOWS
486539264
>>> GetModuleHandle("something silly") # doctest: +WINDOWS
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "<stdin>", line 3, in ValidHandle
WindowsError: [Errno 126] The specified module could not be found.
>>>
<
``WinError`` is a function which calls the Windows ``FormatMessage()`` API to
get the string representation of an error code, and {returns} an exception.
``WinError`` takes an optional error code parameter; if none is given, it calls
GetLastError to retrieve it.
Please note that a much more powerful error checking mechanism is available
through the errcheck attribute; see the reference manual for details.
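As a rough illustration of errcheck, the sketch below raises an exception when ``strchr`` finds nothing. It assumes a POSIX system (``CDLL(None)``); the checker name ``check_found`` and the choice of ``ValueError`` are illustrative. The callable assigned to errcheck is invoked with the converted result, the function object, and the tuple of arguments.

```python
import ctypes

# Assumption: POSIX. CDLL(None) exposes the C library of the process.
libc = ctypes.CDLL(None)

def check_found(result, func, arguments):
    # result has already been converted according to restype, so a
    # NULL char pointer arrives here as None.
    if result is None:
        raise ValueError("character not found")
    return result

strchr = libc.strchr
strchr.argtypes = [ctypes.c_char_p, ctypes.c_int]
strchr.restype = ctypes.c_char_p
strchr.errcheck = check_found

found = strchr(b"abcdef", ord("d"))
print(found == b"def")

try:
    strchr(b"abcdef", ord("x"))
    raised = False
except ValueError:
    raised = True
print(raised)
```

Unlike a callable restype, errcheck also receives the original arguments, which makes it easier to build informative error messages.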
Passing pointers (or: passing parameters by reference)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Sometimes a C API function expects a {pointer} to a data type as a parameter,
probably to write into the corresponding location, or if the data is too large
to be passed by value. This is also known as {passing parameters by reference}.
ctypes (|py2stdlib-ctypes|) exports the byref function which is used to pass
parameters by reference. The same effect can be achieved with the
pointer function, although pointer does a lot more work since it
constructs a real pointer object, so it is faster to use byref if you
don't need the pointer object in Python itself:: >
>>> i = c_int()
>>> f = c_float()
>>> s = create_string_buffer('\000' * 32)
>>> print i.value, f.value, repr(s.value)
0 0.0 ''
>>> libc.sscanf("1 3.14 Hello", "%d %f %s",
... byref(i), byref(f), s)
3
>>> print i.value, f.value, repr(s.value)
1 3.1400001049 'Hello'
>>>
<
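The same pattern works with declared argument types. The following sketch assumes a POSIX system (``CDLL(None)``) and uses the C ``strtol`` function, whose second parameter is an output pointer; the variable names are illustrative.

```python
import ctypes

# Assumption: POSIX. CDLL(None) exposes the C library of the process.
libc = ctypes.CDLL(None)

strtol = libc.strtol
strtol.argtypes = [ctypes.c_char_p,
                   ctypes.POINTER(ctypes.c_char_p),
                   ctypes.c_int]
strtol.restype = ctypes.c_long

text = b"123abc"            # kept alive because end will point into it
end = ctypes.c_char_p()     # receives the address of the first unparsed char

# byref() is enough here: the pointer is only needed for the call itself.
value = strtol(text, ctypes.byref(end), 10)
rest = end.value
print(value)
print(rest == b"abc")

# pointer() is accepted as well, at the cost of building a pointer object.
end2 = ctypes.c_char_p()
value2 = strtol(text, ctypes.pointer(end2), 10)
print(value2 == value)
```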
Structures and unions
^^^^^^^^^^^^^^^^^^^^^
Structures and unions must derive from the Structure and Union
base classes which are defined in the ctypes (|py2stdlib-ctypes|) module. Each subclass must
define a _fields_ attribute. _fields_ must be a list of
{2-tuples}, containing a {field name} and a {field type}.
The field type must be a ctypes (|py2stdlib-ctypes|) type like c_int, or any other
derived ctypes (|py2stdlib-ctypes|) type: structure, union, array, pointer.
Here is a simple example of a POINT structure, which contains two integers named
{x} and {y}, and also shows how to initialize a structure in the constructor:: >
>>> from ctypes import *
>>> class POINT(Structure):
...     _fields_ = [("x", c_int),
...                 ("y", c_int)]
...
>>> point = POINT(10, 20)
>>> print point.x, point.y
10 20
>>> point = POINT(y=5)
>>> print point.x, point.y
0 5
>>> POINT(1, 2, 3)
Traceback (most recent call last):
File "<stdin>", line 1, in ?
ValueError: too many initializers
>>>
<
You can, however, build much more complicated structures. A structure can itself
contain other structures by using a structure as a field type.
Here is a RECT structure which contains two POINTs named {upperleft} and
{lowerright}:: >
>>> class RECT(Structure):
...     _fields_ = [("upperleft", POINT),
...                 ("lowerright", POINT)]
...
>>> rc = RECT(point)
>>> print rc.upperleft.x, rc.upperleft.y
0 5
>>> print rc.lowerright.x, rc.lowerright.y
0 0
>>>
<
Nested structures can also be initialized in the constructor in several ways:: >
>>> r = RECT(POINT(1, 2), POINT(3, 4))
>>> r = RECT((1, 2), (3, 4))
<
Field descriptors can be retrieved from the {class}; they are useful
for debugging because they can provide useful information:: >
>>> print POINT.x
<Field type=c_long, ofs=0, size=4>
>>> print POINT.y
<Field type=c_long, ofs=4, size=4>
>>>
<
Structure/union alignment and byte order
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
By default, Structure and Union fields are aligned in the same way the C
compiler does it. It is possible to override this behavior by specifying a
_pack_ class attribute in the subclass definition. This must be set to a
positive integer and specifies the maximum alignment for the fields. This is
what ``#pragma pack(n)`` also does in MSVC.
ctypes (|py2stdlib-ctypes|) uses the native byte order for Structures and Unions. To build
structures with non-native byte order, you can use one of the
BigEndianStructure, LittleEndianStructure,
BigEndianUnion, and LittleEndianUnion base classes. These
classes cannot contain pointer fields.
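The effect of _pack_ and of the byte-order base classes can be seen in a small sketch. The class names and the helper ``raw_bytes`` are illustrative; the size of the unpacked structure depends on the platform's alignment rules, so only the packed size is stated in the comments.

```python
import ctypes

FIELDS = [("flag", ctypes.c_ubyte), ("value", ctypes.c_uint32)]

class Aligned(ctypes.Structure):
    _fields_ = FIELDS            # compiler-style padding after "flag"

class Packed(ctypes.Structure):
    _pack_ = 1                   # must be set before _fields_
    _fields_ = FIELDS            # no padding: 1 + 4 bytes

print(ctypes.sizeof(Aligned) >= ctypes.sizeof(Packed))
print(ctypes.sizeof(Packed))

class BigEndian(ctypes.BigEndianStructure):
    _fields_ = [("value", ctypes.c_uint32)]

class LittleEndian(ctypes.LittleEndianStructure):
    _fields_ = [("value", ctypes.c_uint32)]

def raw_bytes(obj):
    # Copy the memory block of a ctypes instance into a byte string.
    return ctypes.string_at(ctypes.addressof(obj), ctypes.sizeof(obj))

be = raw_bytes(BigEndian(0x01020304))
le = raw_bytes(LittleEndian(0x01020304))
print(be == b"\x01\x02\x03\x04")
print(le == b"\x04\x03\x02\x01")
```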
Bit fields in structures and unions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
It is possible to create structures and unions containing bit fields. Bit fields
are only possible for integer fields, the bit width is specified as the third
item in the _fields_ tuples:: >
>>> class Int(Structure):
...     _fields_ = [("first_16", c_int, 16),
...                 ("second_16", c_int, 16)]
...
>>> print Int.first_16
<Field type=c_long, ofs=0:0, bits=16>
>>> print Int.second_16
<Field type=c_long, ofs=0:16, bits=16>
>>>
<
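Bit fields are read and written like ordinary fields. The sketch below uses a hypothetical ``Flags`` structure with two 4-bit unsigned fields to show that both share a single storage unit.

```python
import ctypes

class Flags(ctypes.Structure):
    _fields_ = [("low", ctypes.c_uint, 4),
                ("high", ctypes.c_uint, 4)]

flags = Flags(low=0x5, high=0xA)
print(flags.low)
print(flags.high)

# Both fields live inside one unsigned int.
print(ctypes.sizeof(Flags) == ctypes.sizeof(ctypes.c_uint))

flags.high = 0x3        # changing one bit field leaves the other untouched
print(flags.low)
```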
Arrays
^^^^^^
Arrays are sequences, containing a fixed number of instances of the same type.
The recommended way to create array types is by multiplying a data type with a
positive integer:: >
TenPointsArrayType = POINT * 10
<
Here is an example of a somewhat artificial data type, a structure containing 4
POINTs among other stuff:: >
>>> from ctypes import *
>>> class POINT(Structure):
...     _fields_ = ("x", c_int), ("y", c_int)
...
>>> class MyStruct(Structure):
...     _fields_ = [("a", c_int),
...                 ("b", c_float),
...                 ("point_array", POINT * 4)]
>>>
>>> print len(MyStruct().point_array)
4
>>>
<
Instances are created in the usual way, by calling the class:: >
arr = TenPointsArrayType()
for pt in arr:
    print pt.x, pt.y
<
The above code prints a series of ``0 0`` lines, because the array contents are
initialized to zeros.
Initializers of the correct type can also be specified:: >
>>> from ctypes import *
>>> TenIntegers = c_int * 10
>>> ii = TenIntegers(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
>>> print ii
<c_long_Array_10 object at 0x...>
>>> for i in ii: print i,
...
1 2 3 4 5 6 7 8 9 10
>>>
<
Pointers
^^^^^^^^
Pointer instances are created by calling the pointer function on a
ctypes (|py2stdlib-ctypes|) type:: >
>>> from ctypes import *
>>> i = c_int(42)
>>> pi = pointer(i)
>>>
<
Pointer instances have a contents attribute which returns the object to
which the pointer points, the ``i`` object above:: >
>>> pi.contents
c_long(42)
>>>
<
Note that ctypes (|py2stdlib-ctypes|) does not have OOR (original object return), it constructs a
new, equivalent object each time you retrieve an attribute:: >
>>> pi.contents is i
False
>>> pi.contents is pi.contents
False
>>>
<
Assigning another c_int instance to the pointer's contents attribute
would cause the pointer to point to the memory location where this is stored:: >
>>> i = c_int(99)
>>> pi.contents = i
>>> pi.contents
c_long(99)
>>>
<
Pointer instances can also be indexed with integers:: >
>>> pi[0]
99
>>>
<
Assigning to an integer index changes the pointed-to value:: >
>>> print i
c_long(99)
>>> pi[0] = 22
>>> print i
c_long(22)
>>>
<
It is also possible to use indexes different from 0, but you must know what
you're doing, just as in C: You can access or change arbitrary memory locations.
Generally you only use this feature if you receive a pointer from a C function,
and you {know} that the pointer actually points to an array instead of a single
item.
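A minimal sketch of indexing beyond 0 follows. A ctypes array is cast to a pointer to stand in for a pointer returned by C code; the variable names are illustrative. Since a pointer carries no length, slices need an explicit stop index.

```python
import ctypes

values = (ctypes.c_int * 4)(10, 20, 30, 40)

# Stand-in for a pointer handed back by a C function that is known
# to point to an array of four ints.
ptr = ctypes.cast(values, ctypes.POINTER(ctypes.c_int))

third = ptr[2]
print(third)

ptr[3] = 99              # writes through to the underlying array
print(values[3])

first_four = ptr[:4]     # the stop index is required for pointers
print(first_four)
```

Nothing stops ``ptr[4]`` from being evaluated, so the caller has to track the valid length separately.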
Behind the scenes, the pointer function does more than simply create
pointer instances, it has to create pointer {types} first. This is done with
the POINTER function, which accepts any ctypes (|py2stdlib-ctypes|) type, and returns
a new type:: >
>>> PI = POINTER(c_int)
>>> PI
<class 'ctypes.LP_c_long'>
>>> PI(42)
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: expected c_long instead of int
>>> PI(c_int(42))
<ctypes.LP_c_long object at 0x...>
>>>
<
Calling the pointer type without an argument creates a ``NULL`` pointer.
``NULL`` pointers have a ``False`` boolean value:: >
>>> null_ptr = POINTER(c_int)()
>>> print bool(null_ptr)
False
>>>
<
ctypes (|py2stdlib-ctypes|) checks for ``NULL`` when dereferencing pointers (but dereferencing
invalid non-``NULL`` pointers would crash Python):: >
>>> null_ptr[0]
Traceback (most recent call last):
ValueError: NULL pointer access
>>>
>>> null_ptr[0] = 1234
Traceback (most recent call last):
ValueError: NULL pointer access
>>>
<
Type conversions
^^^^^^^^^^^^^^^^
Usually, ctypes does strict type checking. This means that if you have
``POINTER(c_int)`` in the argtypes list of a function or as the type of
a member field in a structure definition, only instances of exactly the same
type are accepted. There are some exceptions to this rule, where ctypes accepts
other objects. For example, you can pass compatible array instances instead of
pointer types. So, for ``POINTER(c_int)``, ctypes accepts an array of c_int:: >
>>> class Bar(Structure):
...     _fields_ = [("count", c_int), ("values", POINTER(c_int))]
...
>>> bar = Bar()
>>> bar.values = (c_int * 3)(1, 2, 3)
>>> bar.count = 3
>>> for i in range(bar.count):
...     print bar.values[i]
...
1
2
3
>>>
<
To set a POINTER type field to ``NULL``, you can assign ``None``:: >
>>> bar.values = None
>>>
<
Sometimes you have instances of incompatible types. In C, you can cast one type
into another type. ctypes (|py2stdlib-ctypes|) provides a cast function which can be
used in the same way. The ``Bar`` structure defined above accepts
``POINTER(c_int)`` pointers or c_int arrays for its ``values`` field,
but not instances of other types:: >
>>> bar.values = (c_byte * 4)()
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: incompatible types, c_byte_Array_4 instance instead of LP_c_long instance
>>>
<
For these cases, the cast function is handy.
The cast function can be used to cast a ctypes instance into a pointer
to a different ctypes data type. cast takes two parameters, a ctypes
object that is or can be converted to a pointer of some kind, and a ctypes
pointer type. It returns an instance of the second argument, which references
the same memory block as the first argument:: >
>>> a = (c_byte * 4)()
>>> cast(a, POINTER(c_int))
<ctypes.LP_c_long object at ...>
>>>
<
So, cast can be used to assign to the ``values`` field of the ``Bar``
structure:: >
>>> bar = Bar()
>>> bar.values = cast((c_byte * 4)(), POINTER(c_int))
>>> print bar.values[0]
0
>>>
<
Incomplete Types
^^^^^^^^^^^^^^^^
{Incomplete Types} are structures, unions or arrays whose members are not yet
specified. In C, they are specified by forward declarations, which are defined
later:: >
struct cell; /* forward declaration */
struct cell {
    char *name;
    struct cell *next;
};
<
The straightforward translation into ctypes code would be this, but it does not
work:: >
>>> class cell(Structure):
...     _fields_ = [("name", c_char_p),
...                 ("next", POINTER(cell))]
...
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "<stdin>", line 2, in cell
NameError: name 'cell' is not defined
>>>
<
because the new ``class cell`` is not available in the class statement itself.
In ctypes (|py2stdlib-ctypes|), we can define the ``cell`` class and set the _fields_
attribute later, after the class statement:: >
>>> from ctypes import *
>>> class cell(Structure):
...     pass
...
>>> cell._fields_ = [("name", c_char_p),
...                  ("next", POINTER(cell))]
>>>
<
Let's try it. We create two instances of ``cell``, and let them point to each
other, and finally follow the pointer chain a few times:: >
>>> c1 = cell()
>>> c1.name = "foo"
>>> c2 = cell()
>>> c2.name = "bar"
>>> c1.next = pointer(c2)
>>> c2.next = pointer(c1)
>>> p = c1
>>> for i in range(8):
...     print p.name,
...     p = p.next[0]
...
foo bar foo bar foo bar foo bar
>>>
<
Callback functions
^^^^^^^^^^^^^^^^^^
ctypes (|py2stdlib-ctypes|) allows creating C callable function pointers from Python callables.
These are sometimes called {callback functions}.
First, you must create a class for the callback function. The class knows the
calling convention, the return type, and the number and types of arguments this
function will receive.
The CFUNCTYPE factory function creates types for callback functions using the
normal cdecl calling convention, and, on Windows, the WINFUNCTYPE factory
function creates types for callback functions using the stdcall calling
convention.
Both of these factory functions are called with the result type as first
argument, and the callback function's expected argument types as the remaining
arguments.
I will present an example here which uses the standard C library's qsort
function, which is used to sort items with the help of a callback function.
qsort will be used to sort an array of integers:: >
>>> IntArray5 = c_int * 5
>>> ia = IntArray5(5, 1, 7, 33, 99)
>>> qsort = libc.qsort
>>> qsort.restype = None
>>>
<
qsort must be called with a pointer to the data to sort, the number of
items in the data array, the size of one item, and a pointer to the comparison
function, the callback. The callback will then be called with two pointers to
items, and it must return a negative integer if the first item is smaller than
the second, a zero if they are equal, and a positive integer otherwise.
So our callback function receives pointers to integers, and must return an
integer. First we create the ``type`` for the callback function:: >
>>> CMPFUNC = CFUNCTYPE(c_int, POINTER(c_int), POINTER(c_int))
>>>
<
For the first implementation of the callback function, we simply print the
arguments we get, and return 0 (incremental development ;-):: >
>>> def py_cmp_func(a, b):
...     print "py_cmp_func", a, b
...     return 0
...
>>>
<
Create the C callable callback:: >
>>> cmp_func = CMPFUNC(py_cmp_func)
>>>
<
And we're ready to go:: >
>>> qsort(ia, len(ia), sizeof(c_int), cmp_func) # doctest: +WINDOWS
py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>
py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>
py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>
py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>
py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>
py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>
py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>
py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>
py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>
py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>
>>>
<
We know how to access the contents of a pointer, so let's redefine our callback:: >
>>> def py_cmp_func(a, b):
...     print "py_cmp_func", a[0], b[0]
...     return 0
...
>>> cmp_func = CMPFUNC(py_cmp_func)
>>>
<
Here is what we get on Windows:: >
>>> qsort(ia, len(ia), sizeof(c_int), cmp_func) # doctest: +WINDOWS
py_cmp_func 7 1
py_cmp_func 33 1
py_cmp_func 99 1
py_cmp_func 5 1
py_cmp_func 7 5
py_cmp_func 33 5
py_cmp_func 99 5
py_cmp_func 7 99
py_cmp_func 33 99
py_cmp_func 7 33
>>>
<
It is funny to see that on Linux the sort function seems to work much more
efficiently; it is doing fewer comparisons:: >
>>> qsort(ia, len(ia), sizeof(c_int), cmp_func) # doctest: +LINUX
py_cmp_func 5 1
py_cmp_func 33 99
py_cmp_func 7 33
py_cmp_func 5 7
py_cmp_func 1 7
>>>
<
Ah, we're nearly done! The last step is to actually compare the two items and
return a useful result:: >
>>> def py_cmp_func(a, b):
...     print "py_cmp_func", a[0], b[0]
...     return a[0] - b[0]
...
>>>
<
Final run on Windows:: >
>>> qsort(ia, len(ia), sizeof(c_int), CMPFUNC(py_cmp_func)) # doctest: +WINDOWS
py_cmp_func 33 7
py_cmp_func 99 33
py_cmp_func 5 99
py_cmp_func 1 99
py_cmp_func 33 7
py_cmp_func 1 33
py_cmp_func 5 33
py_cmp_func 5 7
py_cmp_func 1 7
py_cmp_func 5 1
>>>
<
and on Linux:: >
>>> qsort(ia, len(ia), sizeof(c_int), CMPFUNC(py_cmp_func)) # doctest: +LINUX
py_cmp_func 5 1
py_cmp_func 33 99
py_cmp_func 7 33
py_cmp_func 1 7
py_cmp_func 5 7
>>>
<
It is quite interesting to see that the Windows qsort function needs
more comparisons than the Linux version!
As we can easily check, our array is sorted now:: >
>>> for i in ia: print i,
...
1 5 7 33 99
>>>
<
{Important note for callback functions:}
Make sure you keep references to CFUNCTYPE objects as long as they are used from
C code. ctypes (|py2stdlib-ctypes|) doesn't, and if you don't, they may be garbage collected,
crashing your program when a callback is made.
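One way to keep such a reference is to store the callback object on a longer-lived Python object. The sketch below assumes a POSIX system (``CDLL(None)``); the ``IntSorter`` class is hypothetical. qsort only uses the callback during the call, so the stored reference matters most for libraries that remember the function pointer and call it later.

```python
import ctypes

# Assumption: POSIX. CDLL(None) exposes the C library of the process.
libc = ctypes.CDLL(None)

CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))

def py_cmp_func(a, b):
    return a[0] - b[0]

class IntSorter(object):
    def __init__(self):
        # Stored on the instance, so the C callable callback lives at
        # least as long as the sorter itself.
        self._callback = CMPFUNC(py_cmp_func)
        self._qsort = libc.qsort
        self._qsort.argtypes = [ctypes.POINTER(ctypes.c_int),
                                ctypes.c_size_t, ctypes.c_size_t, CMPFUNC]
        self._qsort.restype = None

    def sort(self, array):
        self._qsort(array, len(array),
                    ctypes.sizeof(ctypes.c_int), self._callback)

ia = (ctypes.c_int * 5)(5, 1, 7, 33, 99)
sorter = IntSorter()
sorter.sort(ia)
result = list(ia)
print(result)
```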
Accessing values exported from dlls
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Some shared libraries not only export functions, they also export variables. An
example in the Python library itself is the ``Py_OptimizeFlag``, an integer set
to 0, 1, or 2, depending on the -O or -OO flag given on
startup.
ctypes (|py2stdlib-ctypes|) can access values like this with the in_dll class method of
the type. {pythonapi} is a predefined symbol giving access to the Python C
API:: >
>>> opt_flag = c_int.in_dll(pythonapi, "Py_OptimizeFlag")
>>> print opt_flag
c_long(0)
>>>
<
If the interpreter had been started with -O, the sample would
have printed ``c_long(1)``, or ``c_long(2)`` if -OO had been
specified.
An extended example which also demonstrates the use of pointers accesses the
``PyImport_FrozenModules`` pointer exported by Python.
Quoting the Python docs: *This pointer is initialized to point to an array of
"struct _frozen" records, terminated by one whose members are all NULL or zero.
When a frozen module is imported, it is searched in this table. Third-party code
could play tricks with this to provide a dynamically created collection of
frozen modules.*
So manipulating this pointer could even prove useful. To restrict the example
size, we show only how this table can be read with ctypes (|py2stdlib-ctypes|):: >
>>> from ctypes import *
>>>
>>> class struct_frozen(Structure):
...     _fields_ = [("name", c_char_p),
...                 ("code", POINTER(c_ubyte)),
...                 ("size", c_int)]
...
>>>
<
We have defined the ``struct _frozen`` data type, so we can get the pointer to
the table:: >
>>> FrozenTable = POINTER(struct_frozen)
>>> table = FrozenTable.in_dll(pythonapi, "PyImport_FrozenModules")
>>>
<
Since ``table`` is a ``pointer`` to the array of ``struct_frozen`` records, we
can iterate over it, but we just have to make sure that our loop terminates,
because pointers have no size. Sooner or later it would probably crash with an
access violation or whatever, so it's better to break out of the loop when we
hit the NULL entry:: >
>>> for item in table:
...     print item.name, item.size
...     if item.name is None:
...         break
...
__hello__ 104
__phello__ -104
__phello__.spam 104
None 0
>>>
<
The fact that standard Python has a frozen module and a frozen package
(indicated by the negative size member) is not well known; it is only used for
testing. Try it out with ``import __hello__`` for example.
Surprises
^^^^^^^^^
There are some edges in ctypes (|py2stdlib-ctypes|) where you may expect something other than
what actually happens.
Consider the following example:: >
>>> from ctypes import *
>>> class POINT(Structure):
...     _fields_ = ("x", c_int), ("y", c_int)
...
>>> class RECT(Structure):
...     _fields_ = ("a", POINT), ("b", POINT)
...
>>> p1 = POINT(1, 2)
>>> p2 = POINT(3, 4)
>>> rc = RECT(p1, p2)
>>> print rc.a.x, rc.a.y, rc.b.x, rc.b.y
1 2 3 4
>>> # now swap the two points
>>> rc.a, rc.b = rc.b, rc.a
>>> print rc.a.x, rc.a.y, rc.b.x, rc.b.y
3 4 3 4
>>>
<
Hm. We certainly expected the last statement to print ``3 4 1 2``. What
happened? Here are the steps of the ``rc.a, rc.b = rc.b, rc.a`` line above:: >
>>> temp0, temp1 = rc.b, rc.a
>>> rc.a = temp0
>>> rc.b = temp1
>>>
<
Note that ``temp0`` and ``temp1`` are objects still using the internal buffer of
the ``rc`` object above. So executing ``rc.a = temp0`` copies the buffer
contents of ``temp0`` into ``rc`` 's buffer. This, in turn, changes the
contents of ``temp1``. So, the last assignment ``rc.b = temp1``, doesn't have
the expected effect.
Keep in mind that retrieving sub-objects from Structures, Unions, and Arrays
doesn't {copy} the sub-object, instead it retrieves a wrapper object accessing
the root-object's underlying buffer.
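If an independent copy is wanted, it has to be made explicitly. One possible approach, sketched here, uses the from_buffer_copy class method to copy the bytes of a sub-object before overwriting it; the variable name ``saved_a`` is illustrative.

```python
import ctypes

class POINT(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]

class RECT(ctypes.Structure):
    _fields_ = [("a", POINT), ("b", POINT)]

rc = RECT(POINT(1, 2), POINT(3, 4))

# from_buffer_copy() creates a POINT with its own memory block, so it
# is not affected when rc's buffer is overwritten below.
saved_a = POINT.from_buffer_copy(rc.a)
rc.a = rc.b
rc.b = saved_a

coords = (rc.a.x, rc.a.y, rc.b.x, rc.b.y)
print(coords)
```

With the copy in place, the two points really are swapped.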
Another example that may behave differently from what one would expect is this:: >
>>> s = c_char_p()
>>> s.value = "abc def ghi"
>>> s.value
'abc def ghi'
>>> s.value is s.value
False
>>>
<
Why is it printing ``False``? ctypes instances are objects containing a memory
block plus some descriptors accessing the contents of the memory.
Storing a Python object in the memory block does not store the object itself,
instead the ``contents`` of the object are stored. Accessing the contents again
constructs a new Python object each time!
Variable-sized data types
^^^^^^^^^^^^^^^^^^^^^^^^^
ctypes (|py2stdlib-ctypes|) provides some support for variable-sized arrays and structures.
The resize function can be used to resize the memory buffer of an
existing ctypes object. The function takes the object as first argument, and
the requested size in bytes as the second argument. The memory block cannot be
made smaller than the natural memory block specified by the object's type; a
ValueError is raised if this is tried:: >
>>> short_array = (c_short * 4)()
>>> print sizeof(short_array)
8
>>> resize(short_array, 4)
Traceback (most recent call last):
...
ValueError: minimum size is 8
>>> resize(short_array, 32)
>>> sizeof(short_array)
32
>>> sizeof(type(short_array))
8
>>>
<
This is nice and fine, but how would one access the additional elements
contained in this array? Since the type still only knows about 4 elements, we
get errors accessing other elements:: >
>>> short_array[:]
[0, 0, 0, 0]
>>> short_array[7]
Traceback (most recent call last):
...
IndexError: invalid index
>>>
<
Another way to use variable-sized data types with ctypes (|py2stdlib-ctypes|) is to use the
dynamic nature of Python, and (re-)define the data type after the required size
is already known, on a case by case basis.
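Defining the type once the size is known could look like the following sketch. The factory function ``make_packet_type`` and the ``Packet`` layout are hypothetical and only illustrate the idea.

```python
import ctypes

def make_packet_type(payload_len):
    # Build a structure type whose trailing array has exactly the
    # length needed for one particular packet.
    class Packet(ctypes.Structure):
        _fields_ = [("length", ctypes.c_int),
                    ("payload", ctypes.c_ubyte * payload_len)]
    return Packet

Packet3 = make_packet_type(3)
pkt = Packet3()
pkt.length = 3
for index, byte in enumerate((1, 2, 3)):
    pkt.payload[index] = byte

payload = pkt.payload[:]
print(payload)
print(len(make_packet_type(16)().payload))
```

Each call to the factory creates a distinct type, so code that needs many instances of the same size may want to cache the returned classes.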
ctypes reference
----------------
Finding shared libraries
^^^^^^^^^^^^^^^^^^^^^^^^
When programming in a compiled language, shared libraries are accessed when
compiling/linking a program, and when the program is run.
The purpose of the find_library function is to locate a library in a way
similar to what the compiler does (on platforms with several versions of a
shared library the most recent should be loaded), while the ctypes library
loaders act like when a program is run, and call the runtime loader directly.
The ctypes.util module provides a function which can help to determine the
library to load.
find_library(name)~
:module: ctypes.util
Try to find a library and return a pathname. {name} is the library name without
any prefix like {lib}, suffix like ``.so``, ``.dylib`` or version number (this
is the form used for the posix linker option -l). If no library can
be found, returns ``None``.
The exact functionality is system dependent.
On Linux, find_library tries to run external programs
(``/sbin/ldconfig``, ``gcc``, and ``objdump``) to find the library file. It
returns the filename of the library file. Here are some examples:: >
>>> from ctypes.util import find_library
>>> find_library("m")
'libm.so.6'
>>> find_library("c")
'libc.so.6'
>>> find_library("bz2")
'libbz2.so.1.0'
>>>
<
On OS X, find_library tries several predefined naming schemes and paths
to locate the library, and returns a full pathname if successful:: >
>>> from ctypes.util import find_library
>>> find_library("c")
'/usr/lib/libc.dylib'
>>> find_library("m")
'/usr/lib/libm.dylib'
>>> find_library("bz2")
'/usr/lib/libbz2.dylib'
>>> find_library("AGL")
'/System/Library/Frameworks/AGL.framework/AGL'
>>>
<
On Windows, find_library searches along the system search path, and
returns the full pathname, but since there is no predefined naming scheme a call
like ``find_library("c")`` will fail and return ``None``.
If wrapping a shared library with ctypes (|py2stdlib-ctypes|), it {may} be better to determine
the shared library name at development time, and hardcode that into the wrapper
module instead of using find_library to locate the library at runtime.
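A wrapper module can combine both approaches: try find_library first and fall back to names fixed at development time. The helper below is a minimal sketch; `load_library` and its `fallbacks` parameter are hypothetical names, and the fallback file names in the usage comment are assumptions about a typical Linux system.

```python
from ctypes import CDLL
from ctypes.util import find_library


def load_library(short_name, fallbacks=()):
    """Return a CDLL for short_name, or None if nothing could be loaded."""
    candidates = []
    found = find_library(short_name)   # may be None, see above
    if found:
        candidates.append(found)
    candidates.extend(fallbacks)       # hard-coded names chosen by the author
    for candidate in candidates:
        try:
            return CDLL(candidate)
        except OSError:
            continue                   # try the next candidate
    return None


# Hypothetical usage: load_library("m", ["libm.so.6"])
missing = load_library("no_such_library_for_this_sketch")
```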
Loading shared libraries
^^^^^^^^^^^^^^^^^^^^^^^^
There are several ways to load shared libraries into the Python process. One
way is to instantiate one of the following classes:
CDLL(name, mode=DEFAULT_MODE, handle=None, use_errno=False, use_last_error=False)~
Instances of this class represent loaded shared libraries. Functions in these
libraries use the standard C calling convention, and are assumed to return
int.
OleDLL(name, mode=DEFAULT_MODE, handle=None, use_errno=False, use_last_error=False)~
Windows only: Instances of this class represent loaded shared libraries.
Functions in these libraries use the ``stdcall`` calling convention, and are
assumed to return the Windows-specific HRESULT code. HRESULT
values contain information specifying whether the function call failed or
succeeded, together with an additional error code. If the return value signals a
failure, a WindowsError is automatically raised.
WinDLL(name, mode=DEFAULT_MODE, handle=None, use_errno=False, use_last_error=False)~
Windows only: Instances of this class represent loaded shared libraries.
Functions in these libraries use the ``stdcall`` calling convention, and are
assumed to return int by default.
On Windows CE only the standard calling convention is used; for convenience,
WinDLL and OleDLL use the standard calling convention on this
platform.
The Python global interpreter lock is released before calling any
function exported by these libraries, and reacquired afterwards.
PyDLL(name, mode=DEFAULT_MODE, handle=None)~
Instances of this class behave like CDLL instances, except that the
Python GIL is {not} released during the function call, and after the function
execution the Python error flag is checked. If the error flag is set, a Python
exception is raised.
Thus, this is only useful to call Python C api functions directly.
All these classes can be instantiated by calling them with at least one
argument, the pathname of the shared library. If you have an existing handle to
an already loaded shared library, it can be passed as the ``handle`` named
parameter, otherwise the underlying platform's ``dlopen`` or ``LoadLibrary``
function is used to load the library into the process, and to get a handle to
it.
The {mode} parameter can be used to specify how the library is loaded. For
details, consult the dlopen(3) manpage. On Windows, {mode} is
ignored.
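The ``handle`` parameter can be tried out as follows. The sketch assumes a POSIX platform, where passing ``None`` as the name makes ``dlopen`` return a handle for the running process itself; the name "process-alias" is an arbitrary label used for illustration.

```python
from ctypes import CDLL

# Assumption: POSIX. CDLL(None) opens the current process, which normally
# exposes the C library's symbols.
process = CDLL(None)

# Reuse the existing handle instead of loading anything again. The name is
# only stored in _name; it is not looked up when a handle is given.
alias = CDLL("process-alias", handle=process._handle)

result = alias.abs(-5)   # int abs(int) from the C library
```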
The {use_errno} parameter, when set to True, enables a ctypes mechanism that
allows the system errno (|py2stdlib-errno|) error number to be accessed in a safe way.
ctypes (|py2stdlib-ctypes|) maintains a thread-local copy of the system's errno (|py2stdlib-errno|)
variable; if you call foreign functions created with ``use_errno=True`` then the
errno (|py2stdlib-errno|) value before the function call is swapped with the ctypes private
copy, and the same happens immediately after the function call.
The function ctypes.get_errno returns the value of the ctypes private
copy, and the function ctypes.set_errno changes the ctypes private copy
to a new value and returns the former value.
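A short sketch of the errno mechanism, assuming a POSIX system where ``CDLL(None)`` exposes the C library function ``access``; the path is made up and is expected not to exist.

```python
import errno
import os
from ctypes import CDLL, c_char_p, c_int, get_errno, set_errno

# Assumption: POSIX, with access() reachable through the process handle.
libc = CDLL(None, use_errno=True)

access = libc.access
access.argtypes = [c_char_p, c_int]
access.restype = c_int

set_errno(0)                                   # reset the ctypes private copy
status = access(b"/no/such/path/for/this/sketch", os.F_OK)
saved_errno = get_errno()                      # value captured after the call
```

Because the private copy is swapped in and out around the call, ``saved_errno`` is not disturbed by other C library calls the interpreter makes afterwards.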
The {use_last_error} parameter, when set to True, enables the same mechanism for
the Windows error code which is managed by the GetLastError and
SetLastError Windows API functions; ctypes.get_last_error and
ctypes.set_last_error are used to request and change the ctypes private
copy of the windows error code.
.. versionadded:: 2.6
The {use_last_error} and {use_errno} optional parameters were added.
RTLD_GLOBAL~
Flag to use as {mode} parameter. On platforms where this flag is not available,
it is defined as the integer zero.
RTLD_LOCAL~
Flag to use as {mode} parameter. On platforms where this is not available, it
is the same as {RTLD_GLOBAL}.
DEFAULT_MODE~
The default mode which is used to load shared libraries. On OSX 10.3, this is
{RTLD_GLOBAL}, otherwise it is the same as {RTLD_LOCAL}.
Instances of these classes have no public methods, however __getattr__
and __getitem__ have special behavior: functions exported by the shared
library can be accessed as attributes or by index. Please note that
__getattr__ caches its result, so accessing a function as an attribute
repeatedly returns the same object each time, whereas __getitem__ returns a
new object on each access.
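Both access styles can be sketched as follows, again assuming a POSIX system where ``CDLL(None)`` exposes the C library.

```python
from ctypes import CDLL

libc = CDLL(None)            # assumption: POSIX process handle

by_attribute = libc.abs      # resolved through __getattr__
by_index = libc["abs"]       # resolved through __getitem__

# Attribute access is cached on the library object.
same_object = libc.abs is by_attribute
```

Index access is mainly useful for exported names that are not valid Python identifiers.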
The following public attributes are available, their name starts with an
underscore to not clash with exported function names:
PyDLL._handle~
The system handle used to access the library.
PyDLL._name~
The name of the library passed in the constructor.
Shared libraries can also be loaded by using one of the prefabricated objects,
which are instances of the LibraryLoader class, either by calling the
LoadLibrary method, or by retrieving the library as attribute of the
loader instance.
LibraryLoader(dlltype)~
Class which loads shared libraries. {dlltype} should be one of the
CDLL, PyDLL, WinDLL, or OleDLL types.
__getattr__ has special behavior: it allows a shared library to be loaded by
accessing it as an attribute of a library loader instance. The result is cached,
so repeated attribute accesses return the same library each time.
LoadLibrary(name)~
Load a shared library into the process and return it. This method always
returns a new instance of the library.
These prefabricated library loaders are available:
cdll~
Creates CDLL instances.
windll~
Windows only: Creates WinDLL instances.
oledll~
Windows only: Creates OleDLL instances.
pydll~
Creates PyDLL instances.
For accessing the C Python api directly, a ready-to-use Python shared library
object is available:
pythonapi~
An instance of PyDLL that exposes Python C API functions as
attributes. Note that all these functions are assumed to return C
int, which is of course not always the truth, so you have to assign
the correct restype attribute to use these functions.
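As a sketch of why restype matters here: ``Py_GetVersion`` is declared in C as returning ``const char *``, so the default int result type would be wrong for it.

```python
import sys
from ctypes import c_char_p, pythonapi

get_version = pythonapi.Py_GetVersion
get_version.restype = c_char_p     # override the assumed C int result

version = get_version()            # a byte string such as b"2.7 (...)"
```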
Foreign functions
^^^^^^^^^^^^^^^^^
As explained in the previous section, foreign functions can be accessed as
attributes of loaded shared libraries. The function objects created in this way
by default accept any number of arguments, accept any ctypes data instances as
arguments, and return the default result type specified by the library loader.
They are instances of a private class:
_FuncPtr~
Base class for C callable foreign functions.
Instances of foreign functions are also C compatible data types; they
represent C function pointers.
This behavior can be customized by assigning to special attributes of the
foreign function object.
restype~
Assign a ctypes type to specify the result type of the foreign function.
Use ``None`` for void, a function not returning anything.
It is possible to assign a callable Python object that is not a ctypes
type, in this case the function is assumed to return a C int, and
the callable will be called with this integer, allowing to do further
processing or error checking. Using this is deprecated, for more flexible
post processing or error checking use a ctypes data type as
restype and assign a callable to the errcheck attribute.
argtypes~
Assign a tuple of ctypes types to specify the argument types that the
function accepts. Functions using the ``stdcall`` calling convention can
only be called with the same number of arguments as the length of this
tuple; functions using the C calling convention accept additional,
unspecified arguments as well.
When a foreign function is called, each actual argument is passed to the
from_param class method of the items in the argtypes
tuple; this method allows the actual argument to be adapted to an object that
the foreign function accepts. For example, a c_char_p item in
the argtypes tuple will convert a unicode string passed as
argument into a byte string using ctypes conversion rules.
New: It is now possible to put items in argtypes which are not ctypes
types, but each item must have a from_param method which returns a
value usable as argument (integer, string, ctypes instance). This allows
to define adapters that can adapt custom objects as function parameters.
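A minimal adapter might look like the sketch below. ``IntLike`` is a hypothetical class written for this illustration, and the example assumes a POSIX system where ``CDLL(None)`` exposes ``abs``.

```python
from ctypes import CDLL, c_int


class IntLike(object):
    """Hypothetical adapter: not a ctypes type, but usable in argtypes."""

    @classmethod
    def from_param(cls, value):
        # Return something ctypes can pass directly (here: an integer).
        return int(value)


libc = CDLL(None)              # assumption: POSIX process handle
c_abs = libc.abs
c_abs.argtypes = [IntLike]     # the adapter converts each actual argument
c_abs.restype = c_int

result = c_abs("-7")           # the string is adapted to -7 before the call
```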
errcheck~
Assign a Python function or another callable to this attribute. The
callable will be called with three or more arguments:
callable(result, func, arguments)~
{result} is what the foreign function returns, as specified by the
restype attribute.
{func} is the foreign function object itself, this allows to reuse the
same callable object to check or post process the results of several
functions.
{arguments} is a tuple containing the parameters originally passed to
the function call, this allows to specialize the behavior on the
arguments used.
The object that this function returns will be returned from the
foreign function call, but it can also check the result value
and raise an exception if the foreign function call failed.
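The errcheck protocol can be sketched on a POSIX function that reports failure by returning -1. ``check_minus_one`` is a hypothetical helper name, and the example assumes ``CDLL(None)`` exposes ``access``.

```python
import errno
import os
from ctypes import CDLL, c_char_p, c_int, get_errno


def check_minus_one(result, func, arguments):
    """Raise OSError when the C function signals failure with -1."""
    if result == -1:
        code = get_errno()
        raise OSError(code, os.strerror(code))
    return result              # becomes the value of the foreign call


libc = CDLL(None, use_errno=True)   # assumption: POSIX process handle
access = libc.access
access.argtypes = [c_char_p, c_int]
access.restype = c_int
access.errcheck = check_minus_one

ok_status = access(b"/", os.F_OK)   # the root directory exists
try:
    access(b"/no/such/path/for/this/sketch", os.F_OK)
    failure_errno = None
except OSError as exc:
    failure_errno = exc.errno
```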
ArgumentError()~
This exception is raised when a foreign function call cannot convert one of the
passed arguments.
Function prototypes
^^^^^^^^^^^^^^^^^^^
Foreign functions can also be created by instantiating function prototypes.
Function prototypes are similar to function prototypes in C; they describe a
function (return type, argument types, calling convention) without defining an
implementation. The factory functions must be called with the desired result
type and the argument types of the function.
CFUNCTYPE(restype, *argtypes, use_errno=False, use_last_error=False)~
The returned function prototype creates functions that use the standard C
calling convention. The function will release the GIL during the call. If
{use_errno} is set to True, the ctypes private copy of the system
errno (|py2stdlib-errno|) variable is exchanged with the real errno (|py2stdlib-errno|) value before
and after the call; {use_last_error} does the same for the Windows error
code.
.. versionchanged:: 2.6
The optional {use_errno} and {use_last_error} parameters were added.
WINFUNCTYPE(restype, *argtypes, use_errno=False, use_last_error=False)~
Windows only: The returned function prototype creates functions that use the
``stdcall`` calling convention, except on Windows CE where
WINFUNCTYPE is the same as CFUNCTYPE. The function will
release the GIL during the call. {use_errno} and {use_last_error} have the
same meaning as above.
PYFUNCTYPE(restype, *argtypes)~
The returned function prototype creates functions that use the Python calling
convention. The function will {not} release the GIL during the call.
Function prototypes created by these factory functions can be instantiated in
different ways, depending on the type and number of the parameters in the call:
prototype(address)~
Returns a foreign function at the specified address which must be an integer.
prototype(callable)~
Create a C callable function (a callback function) from a Python {callable}.
prototype(func_spec[, paramflags])~
Returns a foreign function exported by a shared library. {func_spec} must be a
2-tuple ``(name_or_ordinal, library)``. The first item is the name of the
exported function as string, or the ordinal of the exported function as small
integer. The second item is the shared library instance.
prototype(vtbl_index, name[, paramflags[, iid]])~
Returns a foreign function that will call a COM method. {vtbl_index} is the
index into the virtual function table, a small non-negative integer. {name} is
the name of the COM method. {iid} is an optional pointer to the interface
identifier which is used in extended error reporting.
COM methods use a special calling convention: They require a pointer to the COM
interface as first argument, in addition to those parameters that are specified
in the argtypes tuple.
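Of the instantiation forms listed above, the callable form is easy to try on any platform with a C library. The sketch below passes a Python comparison function to ``qsort``; it assumes a POSIX system where ``CDLL(None)`` exposes ``qsort``, and ``compare_ints`` is a hypothetical name.

```python
from ctypes import CDLL, CFUNCTYPE, POINTER, c_int, c_size_t, c_void_p, sizeof

libc = CDLL(None)                      # assumption: POSIX process handle

# Prototype for: int (*compare)(const int *, const int *)
COMPARE = CFUNCTYPE(c_int, POINTER(c_int), POINTER(c_int))


def compare_ints(left, right):
    """Return a negative, zero, or positive int, as qsort expects."""
    return (left[0] > right[0]) - (left[0] < right[0])


# Keep a reference to the callback object for as long as C code may call it.
compare_callback = COMPARE(compare_ints)

qsort = libc.qsort
qsort.argtypes = [c_void_p, c_size_t, c_size_t, COMPARE]
qsort.restype = None

values = (c_int * 5)(5, 1, 7, 33, 99)
qsort(values, len(values), sizeof(c_int), compare_callback)
```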
The optional {paramflags} parameter creates foreign function wrappers with much
more functionality than the features described above.
{paramflags} must be a tuple of the same length as argtypes.
Each item in this tuple contains further information about a parameter, it must
be a tuple containing one, two, or three items.
The first item is an integer containing a combination of direction
flags for the parameter:
1
Specifies an input parameter to the function.
2
Output parameter. The foreign function fills in a value.
4
Input parameter which defaults to the integer zero.
The optional second item is the parameter name as string. If this is specified,
the foreign function can be called with named parameters.
The optional third item is the default value for this parameter.
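The three items of a {paramflags} entry can be sketched on a portable function before looking at the Windows examples. This assumes a POSIX system where ``CDLL(None)`` exposes ``abs``; the parameter name ``number`` and its default are arbitrary choices for the illustration.

```python
from ctypes import CDLL, CFUNCTYPE, c_int

libc = CDLL(None)                       # assumption: POSIX process handle

prototype = CFUNCTYPE(c_int, c_int)     # int abs(int)
# One entry per argument: (direction flag, name, default value)
paramflags = ((1, "number", -9),)
c_abs = prototype(("abs", libc), paramflags)

positional = c_abs(-2)          # ordinary call
named = c_abs(number=-4)        # the name comes from paramflags
defaulted = c_abs()             # falls back to the default, -9
```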
This example demonstrates how to wrap the Windows ``MessageBoxA`` function so
that it supports default parameters and named arguments. The C declaration from
the windows header file is this:: >
WINUSERAPI int WINAPI
MessageBoxA(
    HWND hWnd,
    LPCSTR lpText,
    LPCSTR lpCaption,
    UINT uType);
<
Here is the wrapping with ctypes (|py2stdlib-ctypes|):: >
>>> from ctypes import c_int, WINFUNCTYPE, windll
>>> from ctypes.wintypes import HWND, LPCSTR, UINT
>>> prototype = WINFUNCTYPE(c_int, HWND, LPCSTR, LPCSTR, UINT)
>>> paramflags = (1, "hwnd", 0), (1, "text", "Hi"), (1, "caption", None), (1, "flags", 0)
>>> MessageBox = prototype(("MessageBoxA", windll.user32), paramflags)
>>>
<
The MessageBox foreign function can now be called in these ways:: >
>>> MessageBox()
>>> MessageBox(text="Spam, spam, spam")
>>> MessageBox(flags=2, text="foo bar")
>>>
<
A second example demonstrates output parameters. The win32 ``GetWindowRect``
function retrieves the dimensions of a specified window by copying them into
a ``RECT`` structure that the caller has to supply. Here is the C declaration:: >
WINUSERAPI BOOL WINAPI
GetWindowRect(
HWND hWnd,
LPRECT lpRect);
<
Here is the wrapping with ctypes (|py2stdlib-ctypes|):: >
>>> from ctypes import POINTER, WINFUNCTYPE, windll, WinError
>>> from ctypes.wintypes import BOOL, HWND, RECT
>>> prototype = WINFUNCTYPE(BOOL, HWND, POINTER(RECT))
>>> paramflags = (1, "hwnd"), (2, "lprect")
>>> GetWindowRect = prototype(("GetWindowRect", windll.user32), paramflags)
>>>
<
Functions with output parameters will automatically return the output parameter
value if there is a single one, or a tuple containing the output parameter
values when there is more than one, so the GetWindowRect function now returns a
RECT instance when called.
Output parameters can be combined with the errcheck protocol to do
further output processing and error checking. The win32 ``GetWindowRect`` api
function returns a ``BOOL`` to signal success or failure, so this function could
do the error checking, and raise an exception when the api call failed:: >
>>> def errcheck(result, func, args):
...     if not result:
...         raise WinError()
...     return args
...
>>> GetWindowRect.errcheck = errcheck
>>>
<
If the errcheck function returns the argument tuple it receives
unchanged, ctypes (|py2stdlib-ctypes|) continues the normal processing it does on the output
parameters. If you want to return a tuple of window coordinates instead of a
``RECT`` instance, you can retrieve the fields in the function and return them
instead; the normal processing will then no longer take place:: >
>>> def errcheck(result, func, args):
...     if not result:
...         raise WinError()
...     rc = args[1]
...     return rc.left, rc.top, rc.bottom, rc.right
...
>>> GetWindowRect.errcheck = errcheck
>>>
<
Utility functions
^^^^^^^^^^^^^^^^^
addressof(obj)~
Returns the address of the memory buffer as integer. {obj} must be an
instance of a ctypes type.
alignment(obj_or_type)~
Returns the alignment requirements of a ctypes type. {obj_or_type} must be a
ctypes type or instance.
byref(obj[, offset])~
Returns a light-weight pointer to {obj}, which must be an instance of a
ctypes type. {offset} defaults to zero, and must be an integer that will be
added to the internal pointer value.
``byref(obj, offset)`` corresponds to this C code:: >
(((char *)&obj) + offset)
<
The returned object can only be used as a foreign function call
parameter. It behaves similarly to ``pointer(obj)``, but the
construction is a lot faster.
.. versionadded:: 2.6
The {offset} optional argument was added.
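A typical use of byref is passing an output argument. The sketch below calls the C function ``strtol``, which stores a pointer to the first unparsed character through its second argument; it assumes a POSIX system where ``CDLL(None)`` exposes ``strtol``.

```python
from ctypes import CDLL, POINTER, byref, c_char_p, c_int, c_long

libc = CDLL(None)               # assumption: POSIX process handle
strtol = libc.strtol
strtol.argtypes = [c_char_p, POINTER(c_char_p), c_int]
strtol.restype = c_long

text = b"123abc"                # keep this object alive: `rest` points into it
rest = c_char_p()               # will receive the end-of-number pointer
number = strtol(text, byref(rest), 10)
```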
cast(obj, type)~
This function is similar to the cast operator in C. It returns a new
instance of {type} which points to the same memory block as {obj}. {type}
must be a pointer type, and {obj} must be an object that can be interpreted
as a pointer.
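A small sketch of cast, reinterpreting an array as a typed pointer and then as an untyped address:

```python
from ctypes import POINTER, addressof, c_int, c_void_p, cast

numbers = (c_int * 3)(10, 20, 30)

int_ptr = cast(numbers, POINTER(c_int))   # same memory, pointer view
raw_ptr = cast(int_ptr, c_void_p)         # same address as a void pointer

int_ptr[1] = 21                           # writes through to `numbers`
```

No data is copied by cast, so the array must stay alive for as long as the pointers are used.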
create_string_buffer(init_or_size[, size])~
This function creates a mutable character buffer. The returned object is a
ctypes array of c_char.
{init_or_size} must be an integer which specifies the size of the array, or a
string which will be used to initialize the array items.
If a string is specified as first argument, the buffer is made one item larger
than the length of the string so that the last element in the array is a NUL
termination character. An integer can be passed as second argument to specify
the size of the array if the length of the string should not be used.
If the first parameter is a unicode string, it is converted into an 8-bit string
according to ctypes conversion rules.
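The sizing rules can be sketched as follows:

```python
from ctypes import create_string_buffer, sizeof

zeroed = create_string_buffer(3)            # three NUL bytes
greeting = create_string_buffer(b"Hello")   # five characters plus a NUL
padded = create_string_buffer(b"Hi", 10)    # explicit size of ten bytes

padded.value = b"Howdy"                     # the buffer is mutable
```

The ``raw`` attribute exposes the whole buffer, while ``value`` stops at the first NUL byte.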
create_unicode_buffer(init_or_size[, size])~
This function creates a mutable unicode character buffer. The returned object is
a ctypes array of c_wchar.
{init_or_size} must be an integer which specifies the size of the array, or a
unicode string which will be used to initialize the array items.
If a unicode string is specified as first argument, the buffer is made one item
larger than the length of the string so that the last element in the array is a
NUL termination character. An integer can be passed as second argument to
specify the size of the array if the length of the string should not
be used.
If the first parameter is an 8-bit string, it is converted into a unicode string
according to ctypes conversion rules.
DllCanUnloadNow()~
Windows only: This function is a hook that makes it possible to implement
in-process COM servers with ctypes. It is called from the DllCanUnloadNow
function that the ``_ctypes`` extension dll exports.
DllGetClassObject()~
Windows only: This function is a hook that makes it possible to implement
in-process COM servers with ctypes. It is called from the DllGetClassObject
function that the ``_ctypes`` extension dll exports.
find_library(name)~
:module: ctypes.util
Try to find a library and return a pathname. {name} is the library name
without any prefix like ``lib``, suffix like ``.so``, ``.dylib`` or version
number (this is the form used for the posix linker option -l). If
no library can be found, returns ``None``.
The exact functionality is system dependent.
.. versionchanged:: 2.6
Windows only: ``find_library("m")`` or ``find_library("c")`` return the
result of a call to ``find_msvcrt()``.
find_msvcrt()~
:module: ctypes.util
Windows only: return the filename of the VC runtime library used by Python,
and by the extension modules. If the name of the library cannot be
determined, ``None`` is returned.
If you need to free memory, for example, allocated by an extension module
with a call to the ``free(void *)``, it is important that you use the
function in the same library that allocated the memory.
.. versionadded:: 2.6
FormatError([code])~
Windows only: Returns a textual description of the error code {code}. If no
error code is specified, the last error code is used by calling the Windows
api function GetLastError.
GetLastError()~
Windows only: Returns the last error code set by Windows in the calling thread.
This function calls the Windows `GetLastError()` function directly,
it does not return the ctypes-private copy of the error code.
get_errno()~
Returns the current value of the ctypes-private copy of the system
errno (|py2stdlib-errno|) variable in the calling thread.
.. versionadded:: 2.6
get_last_error()~
Windows only: returns the current value of the ctypes-private copy of the system
LastError variable in the calling thread.
.. versionadded:: 2.6
memmove(dst, src, count)~
Same as the standard C memmove library function: copies {count} bytes from
{src} to {dst}. {dst} and {src} must be integers or ctypes instances that can
be converted to pointers.
memset(dst, c, count)~
Same as the standard C memset library function: fills the memory block at
address {dst} with {count} bytes of value {c}. {dst} must be an integer
specifying an address, or a ctypes instance.
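Both functions can be sketched on a small buffer; note that an integer address and a ctypes instance are both accepted as {dst}.

```python
from ctypes import addressof, create_string_buffer, memmove, memset

buf = create_string_buffer(8)

memset(buf, ord("x"), 4)                   # fill the first four bytes
memmove(addressof(buf) + 4, b"abcd", 4)    # copy into the last four bytes
```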
POINTER(type)~
This factory function creates and returns a new ctypes pointer type. Pointer
types are cached and reused internally, so calling this function repeatedly is
cheap. {type} must be a ctypes type.
pointer(obj)~
This function creates a new pointer instance, pointing to {obj}. The returned
object is of the type ``POINTER(type(obj))``.
Note: If you just want to pass a pointer to an object to a foreign function
call, you should use ``byref(obj)`` which is much faster.
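The relationship between POINTER and pointer can be sketched as:

```python
from ctypes import POINTER, c_int, pointer

IntPtr = POINTER(c_int)        # a pointer type
number = c_int(42)
ptr = pointer(number)          # a pointer instance of that type

ptr.contents.value = 43        # writes through to `number`
```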
resize(obj, size)~
This function resizes the internal memory buffer of {obj}, which must be an
instance of a ctypes type. It is not possible to make the buffer smaller
than the native size of the object's type, as given by ``sizeof(type(obj))``,
but it is possible to enlarge the buffer.
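A short sketch of both the permitted and the rejected direction:

```python
from ctypes import c_short, resize, sizeof

shorts = (c_short * 4)()
native_size = sizeof(type(shorts))     # size of the type itself

resize(shorts, 32)                     # enlarging the instance is allowed

try:
    resize(shorts, native_size - 1)    # shrinking below the native size
    shrink_error = None
except ValueError as exc:
    shrink_error = exc
```

Only the instance's buffer grows; ``len(shorts)`` and the type's own size are unchanged.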
set_conversion_mode(encoding, errors)~
This function sets the rules that ctypes objects use when converting between
8-bit strings and unicode strings. {encoding} must be a string specifying an
encoding, like ``'utf-8'`` or ``'mbcs'``, {errors} must be a string
specifying the error handling on encoding/decoding errors. Examples of
possible values are ``"strict"``, ``"replace"``, or ``"ignore"``.
set_conversion_mode returns a 2-tuple containing the previous
conversion rules. On windows, the initial conversion rules are ``('mbcs',
'ignore')``, on other systems ``('ascii', 'strict')``.
set_errno(value)~
Set the current value of the ctypes-private copy of the system errno (|py2stdlib-errno|)
variable in the calling thread to {value} and return the previous value.
.. versionadded:: 2.6
set_last_error(value)~
Windows only: set the current value of the ctypes-private copy of the system
LastError variable in the calling thread to {value} and return the
previous value.
.. versionadded:: 2.6
sizeof(obj_or_type)~
Returns the size in bytes of a ctypes type or instance memory buffer. Does the
same as the C ``sizeof()`` function.
string_at(address[, size])~
This function returns the string starting at memory address {address}. If {size}
is specified, it is used as size, otherwise the string is assumed to be
zero-terminated.
WinError(code=None, descr=None)~
Windows only: this function is probably the worst-named thing in ctypes. It
creates an instance of WindowsError. If {code} is not specified,
``GetLastError`` is called to determine the error code. If ``descr`` is not
specified, FormatError is called to get a textual description of the
error.
wstring_at(address[, size])~
This function returns the wide character string starting at memory address
{address} as unicode string. If {size} is specified, it is used as the
number of characters of the string, otherwise the string is assumed to be
zero-terminated.
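The effect of the optional {size} argument on string_at and wstring_at can be sketched as:

```python
from ctypes import (addressof, create_string_buffer, create_unicode_buffer,
                    string_at, wstring_at)

data = create_string_buffer(10)
data.raw = b"spam\x00eggs\x00"           # a NUL byte in the middle
wide = create_unicode_buffer(u"spam")

first = string_at(addressof(data))       # stops at the first NUL
whole = string_at(addressof(data), 9)    # exactly nine bytes
wide_first = wstring_at(addressof(wide))
wide_part = wstring_at(addressof(wide), 2)
```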
Data types
^^^^^^^^^^
_CData~
This non-public class is the common base class of all ctypes data types.
Among other things, all ctypes type instances contain a memory block that
holds C compatible data; the address of the memory block is returned by the
addressof helper function. Another instance variable is exposed as
_objects; this contains other Python objects that need to be kept
alive in case the memory block contains pointers.
Common methods of ctypes data types, these are all class methods (to be
exact, they are methods of the metaclass):
_CData.from_buffer(source[, offset])~
This method returns a ctypes instance that shares the buffer of the
{source} object. The {source} object must support the writeable buffer
interface. The optional {offset} parameter specifies an offset into the
source buffer in bytes; the default is zero. If the source buffer is not
large enough a ValueError is raised.
.. versionadded:: 2.6
_CData.from_buffer_copy(source[, offset])~
This method creates a ctypes instance, copying the buffer from the
{source} object buffer which must be readable. The optional {offset}
parameter specifies an offset into the source buffer in bytes; the default
is zero. If the source buffer is not large enough a ValueError is
raised.
.. versionadded:: 2.6
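The difference between sharing and copying a buffer can be sketched with a ``bytearray`` as the source object:

```python
from ctypes import c_ubyte

source = bytearray(b"\x01\x02\x03\x04")
FourBytes = c_ubyte * 4

shared = FourBytes.from_buffer(source)        # shares memory with `source`
copied = FourBytes.from_buffer_copy(source)   # independent snapshot

shared[0] = 0xFF       # visible through `source`
source[1] = 0xEE       # visible through `shared`, not through `copied`

try:
    (c_ubyte * 8).from_buffer(source)         # source holds only four bytes
    too_small = None
except ValueError as exc:
    too_small = exc
```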
from_address(address)~
This method returns a ctypes type instance using the memory specified by
{address} which must be an integer.
from_param(obj)~
This method adapts {obj} to a ctypes type. It is called with the actual
object used in a foreign function call when the type is present in the
foreign function's argtypes tuple; it must return an object that
can be used as a function call parameter.
All ctypes data types have a default implementation of this classmethod
that normally returns {obj} if that is an instance of the type. Some
types accept other objects as well.
in_dll(library, name)~
This method returns a ctypes type instance exported by a shared
library. {name} is the name of the symbol that exports the data, {library}
is the loaded shared library.
Common instance variables of ctypes data types:
_b_base_~
Sometimes ctypes data instances do not own the memory block they contain,
instead they share part of the memory block of a base object. The
_b_base_ read-only member is the root ctypes object that owns the
memory block.
_b_needsfree_~
This read-only variable is true when the ctypes data instance has
allocated the memory block itself, false otherwise.
_objects~
This member is either ``None`` or a dictionary containing Python objects
that need to be kept alive so that the memory block contents are kept
valid. This object is only exposed for debugging; never modify the
contents of this dictionary.
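Memory sharing between a structure and one of its fields can be observed through these variables. ``Point`` and ``Rect`` are hypothetical types defined only for this sketch.

```python
from ctypes import Structure, c_int


class Point(Structure):
    _fields_ = [("x", c_int), ("y", c_int)]


class Rect(Structure):
    _fields_ = [("upper_left", Point), ("lower_right", Point)]


rect = Rect()
corner = rect.lower_right      # a Point that lives inside rect's memory
corner.x = 5                   # so this write is visible through rect
```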
Fundamental data types
^^^^^^^^^^^^^^^^^^^^^^
_SimpleCData~
This non-public class is the base class of all fundamental ctypes data
types. It is mentioned here because it contains the common attributes of the
fundamental ctypes data types. _SimpleCData is a subclass of
_CData, so it inherits its methods and attributes.
.. versionchanged:: 2.6
ctypes data types that are not and do not contain pointers can now be
pickled.
Instances have a single attribute:
value~
This attribute contains the actual value of the instance. For integer and
pointer types, it is an integer, for character types, it is a single
character string, for character pointer types it is a Python string or
unicode string.
When the ``value`` attribute is retrieved from a ctypes instance, usually
a new object is returned each time. ctypes (|py2stdlib-ctypes|) does {not} implement
original object return, always a new object is constructed. The same is
true for all other ctypes object instances.
Fundamental data types, when returned as foreign function call results, or, for
example, by retrieving structure field members or array items, are transparently
converted to native Python types. In other words, if a foreign function has a
restype of c_char_p, you will always receive a Python string,
{not} a c_char_p instance.
Subclasses of fundamental data types do {not} inherit this behavior. So, if a
foreign function's restype is a subclass of c_void_p, you will
receive an instance of this subclass from the function call. Of course, you can
get the value of the pointer by accessing the ``value`` attribute.
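The same distinction shows up when structure fields are read. ``MyInt`` and ``Sample`` are hypothetical names used only in this sketch.

```python
from ctypes import Structure, c_int


class MyInt(c_int):
    """A subclass of a fundamental type; reads are not auto-converted."""


class Sample(Structure):
    _fields_ = [("plain", c_int), ("wrapped", MyInt)]


sample = Sample(1, MyInt(2))

plain_value = sample.plain         # converted to a native Python integer
wrapped_value = sample.wrapped     # stays a MyInt instance
```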
These are the fundamental ctypes data types:
c_byte~
Represents the C signed char datatype, and interprets the value as
small integer. The constructor accepts an optional integer initializer; no
overflow checking is done.
c_char~
Represents the C char datatype, and interprets the value as a single
character. The constructor accepts an optional string initializer, the
length of the string must be exactly one character.
c_char_p~
Represents the C char * datatype when it points to a zero-terminated
string. For a general character pointer that may also point to binary data,
``POINTER(c_char)`` must be used. The constructor accepts an integer
address, or a string.
c_double~
Represents the C double datatype. The constructor accepts an
optional float initializer.
c_longdouble~
Represents the C long double datatype. The constructor accepts an
optional float initializer. On platforms where ``sizeof(long double) ==
sizeof(double)`` it is an alias to c_double.
.. versionadded:: 2.6
c_float~
Represents the C float datatype. The constructor accepts an
optional float initializer.
c_int~
Represents the C signed int datatype. The constructor accepts an
optional integer initializer; no overflow checking is done. On platforms
where ``sizeof(int) == sizeof(long)`` it is an alias to c_long.
c_int8~
Represents the C 8-bit signed int datatype. Usually an alias for
c_byte.
c_int16~
Represents the C 16-bit signed int datatype. Usually an alias for
c_short.
c_int32~
Represents the C 32-bit signed int datatype. Usually an alias for
c_int.
c_int64~
Represents the C 64-bit signed int datatype. Usually an alias for
c_longlong.
c_long~
Represents the C signed long datatype. The constructor accepts an
optional integer initializer; no overflow checking is done.
c_longlong~
Represents the C signed long long datatype. The constructor accepts
an optional integer initializer; no overflow checking is done.
c_short~
Represents the C signed short datatype. The constructor accepts an
optional integer initializer; no overflow checking is done.
c_size_t~
Represents the C size_t datatype.
c_ssize_t~
Represents the C ssize_t datatype.
.. versionadded:: 2.7
c_ubyte~
Represents the C unsigned char datatype, it interprets the value as
small integer. The constructor accepts an optional integer initializer; no
overflow checking is done.
c_uint~
Represents the C unsigned int datatype. The constructor accepts an
optional integer initializer; no overflow checking is done. On platforms
where ``sizeof(int) == sizeof(long)`` it is an alias for c_ulong.
c_uint8~
Represents the C 8-bit unsigned int datatype. Usually an alias for
c_ubyte.
c_uint16~
Represents the C 16-bit unsigned int datatype. Usually an alias for
c_ushort.
c_uint32~
Represents the C 32-bit unsigned int datatype. Usually an alias for
c_uint.
c_uint64~
Represents the C 64-bit unsigned int datatype. Usually an alias for
c_ulonglong.
c_ulong~
Represents the C unsigned long datatype. The constructor accepts an
optional integer initializer; no overflow checking is done.
c_ulonglong~
Represents the C unsigned long long datatype. The constructor
accepts an optional integer initializer; no overflow checking is done.
c_ushort~
Represents the C unsigned short datatype. The constructor accepts
an optional integer initializer; no overflow checking is done.
c_void_p~
Represents the C void * type. The value is represented as integer.
The constructor accepts an optional integer initializer.
c_wchar~
Represents the C wchar_t datatype, and interprets the value as a
single character unicode string. The constructor accepts an optional string
initializer, the length of the string must be exactly one character.
c_wchar_p~
Represents the C wchar_t * datatype, which must be a pointer to a
zero-terminated wide character string. The constructor accepts an integer
address, or a string.
c_bool~
Represents the C bool datatype (more accurately, _Bool from
C99). Its value can be True or False, and the constructor accepts any object
that has a truth value.
.. versionadded:: 2.6
HRESULT~
Windows only: Represents a HRESULT value, which contains success or
error information for a function or method call.
py_object~
Represents the C PyObject * datatype. Calling this without an
argument creates a ``NULL`` PyObject * pointer.
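Several of the integer constructors above note that no overflow checking is done. In practice out-of-range initializers wrap around silently, as this sketch shows:

```python
from ctypes import c_bool, c_byte, c_ubyte, c_uint, sizeof

wrapped_signed = c_byte(200).value      # does not fit a signed char
wrapped_unsigned = c_ubyte(-1).value    # wraps to the largest unsigned char
largest_uint = c_uint(-1).value         # wraps to the largest unsigned int

# c_bool accepts any object that has a truth value.
truthy = c_bool("non-empty").value
falsy = c_bool([]).value
```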
The ctypes.wintypes module provides quite a few other Windows-specific
data types, for example HWND, WPARAM, or DWORD. Some
useful structures like MSG or RECT are also defined.
Structured data types
^^^^^^^^^^^^^^^^^^^^^
Union(*args, **kw)~
Abstract base class for unions in native byte order.
BigEndianStructure(*args, **kw)~
Abstract base class for structures in {big endian} byte order.
LittleEndianStructure(*args, **kw)~
Abstract base class for structures in {little endian} byte order.
Structures with non-native byte order cannot contain pointer type fields, or any
other data types containing pointer type fields.
Structure({args, }*kw)~
Abstract base class for structures in {native} byte order.
Concrete structure and union types must be created by subclassing one of these
types, and at least define a _fields_ class variable. ctypes (|py2stdlib-ctypes|) will
create descriptors which allow reading and writing the fields by direct
attribute accesses. These are the class variables that such types can define:
_fields_~
A sequence defining the structure fields. The items must be 2-tuples or
3-tuples. The first item is the name of the field, the second item
specifies the type of the field; it can be any ctypes data type.
For integer type fields like c_int, a third optional item can be
given. It must be a small positive integer defining the bit width of the
field.
Field names must be unique within one structure or union. This is not
checked; only one field can be accessed when names are repeated.
It is possible to define the _fields_ class variable {after} the
class statement that defines the Structure subclass; this allows creating
data types that directly or indirectly reference themselves:: >
    class List(Structure):
        pass
    List._fields_ = [("pnext", POINTER(List)),
                     ...
                    ]
<
The _fields_ class variable must, however, be defined before the
type is first used (an instance is created, ``sizeof()`` is called on it,
and so on). Later assignments to the _fields_ class variable will
raise an AttributeError.
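A complete, runnable variant of the self-referencing pattern might look as
follows. The Node class and its "value" field are hypothetical names
introduced for this sketch; only the technique of assigning _fields_ after the
class statement comes from the text above.

```python
from ctypes import POINTER, Structure, c_int, pointer

class Node(Structure):
    pass  # fields are attached below, once the name Node exists

Node._fields_ = [("value", c_int),
                 ("pnext", POINTER(Node))]

# Build a two-element list: first -> second -> NULL.
second = Node(2)
first = Node(1, pointer(second))

# Walk the list; a NULL pointer is false in a boolean context.
values = []
node = first
while True:
    values.append(node.value)
    if not node.pnext:
        break
    node = node.pnext.contents

# Instances exist now, so _fields_ can no longer be replaced.
try:
    Node._fields_ = [("value", c_int)]
    reassigned = True
except AttributeError:
    reassigned = False
```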
Structure and union subclass constructors accept both positional and named
arguments. Positional arguments are used to initialize the fields in the
same order as they appear in the _fields_ definition, named
arguments are used to initialize the fields with the corresponding name.
It is possible to define sub-subclasses of structure types; they inherit
the fields of the base class plus the _fields_ defined in the
sub-subclass, if any.
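The following sketch combines a bit-field definition with positional and named
initialization. Header and its field names are made up for the example;
unsigned bit fields are used so that the stored values read back unchanged.

```python
from ctypes import Structure, c_int, c_uint

class Header(Structure):
    _fields_ = [("version", c_uint, 4),   # bit field, 4 bits wide
                ("flags", c_uint, 4),     # bit field, 4 bits wide
                ("length", c_int)]        # ordinary integer field

# Positional arguments follow the order of _fields_;
# named arguments select the field by name.
h = Header(2, 9, length=128)
fields = (h.version, h.flags, h.length)

# Fields that are not mentioned start out as zero.
partial = Header(flags=3)
partial_fields = (partial.version, partial.flags, partial.length)
```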
_pack_~
An optional small integer that allows overriding the alignment of
structure fields in the instance. _pack_ must already be defined
when _fields_ is assigned, otherwise it will have no effect.
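The effect of _pack_ is easiest to see by comparing sizes. In this sketch the
class names Natural and Packed are hypothetical; the exact size of the unpacked
structure depends on the platform's alignment rules, so only the packed size
is stated precisely.

```python
from ctypes import Structure, c_char, c_int, sizeof

class Natural(Structure):
    _fields_ = [("tag", c_char),
                ("count", c_int)]

class Packed(Structure):
    _pack_ = 1                      # defined before _fields_ is assigned
    _fields_ = [("tag", c_char),
                ("count", c_int)]

natural_size = sizeof(Natural)      # usually includes padding after tag
packed_size = sizeof(Packed)        # tag and count are stored back to back
count_offset = Packed.count.offset  # byte offset of count inside Packed
```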
_anonymous_~
An optional sequence that lists the names of unnamed (anonymous) fields.
_anonymous_ must be already defined when _fields_ is
assigned, otherwise it will have no effect.
The fields listed in this variable must be structure or union type fields.
ctypes (|py2stdlib-ctypes|) will create descriptors in the structure type that allow
accessing the nested fields directly, without the need to create the
structure or union field.
Here is an example type (Windows):: >
    class _U(Union):
        _fields_ = [("lptdesc", POINTER(TYPEDESC)),
                    ("lpadesc", POINTER(ARRAYDESC)),
                    ("hreftype", HREFTYPE)]
    class TYPEDESC(Structure):
        _anonymous_ = ("u",)
        _fields_ = [("u", _U),
                    ("vt", VARTYPE)]
<
The ``TYPEDESC`` structure describes a COM data type; the ``vt`` field
specifies which one of the union fields is valid. Since the ``u`` field
is defined as an anonymous field, it is now possible to access the members
directly off the TYPEDESC instance. ``td.lptdesc`` and ``td.u.lptdesc``
are equivalent, but the former is faster since it does not need to create
a temporary union instance:: >
td = TYPEDESC()
td.vt = VT_PTR
td.lptdesc = POINTER(some_type)
td.u.lptdesc = POINTER(some_type)
<
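Since the example above depends on Windows COM types, here is a smaller
platform-independent sketch of the same mechanism. The names _Value, Tagged,
as_int and as_float are invented for the illustration.

```python
from ctypes import Structure, Union, c_float, c_int

class _Value(Union):
    _fields_ = [("as_int", c_int),
                ("as_float", c_float)]

class Tagged(Structure):
    _anonymous_ = ("u",)            # defined before _fields_ is assigned
    _fields_ = [("tag", c_int),
                ("u", _Value)]

t = Tagged()
t.as_int = 42                       # direct access to the nested union member
via_union = t.u.as_int              # same storage, reached through the union
```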
It is possible to define sub-subclasses of structures; they inherit the
fields of the base class. If the subclass definition has a separate
_fields_ variable, the fields specified in this are appended to the
fields of the base class.
Structure and union constructors accept both positional and keyword
arguments. Positional arguments are used to initialize member fields in the
same order as they appear in _fields_. Keyword arguments in the
constructor are interpreted as attribute assignments, so they will initialize
the fields in _fields_ that have the same name, or create new attributes for
names not present in _fields_.
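A short sketch of both rules, field inheritance and keyword arguments, under
the assumption of two hypothetical classes Point and Point3D:

```python
from ctypes import Structure, c_int

class Point(Structure):
    _fields_ = [("x", c_int),
                ("y", c_int)]

class Point3D(Point):
    _fields_ = [("z", c_int)]       # appended after the inherited x and y

# Positional arguments cover the base class fields first, then z.
p = Point3D(1, 2, 3)
coords = (p.x, p.y, p.z)

# "label" is not a field, so it becomes a plain instance attribute.
q = Point3D(4, 5, z=6, label="corner")
```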
Arrays and pointers
^^^^^^^^^^^^^^^^^^^
Not yet written - please see the sections ctypes-pointers and
ctypes-arrays in the tutorial.
==============================================================================
*py2stdlib-curses.ascii*
curses.ascii~
:synopsis: Constants and set-membership functions for ASCII characters.
.. versionadded:: 1.6
The curses.ascii (|py2stdlib-curses.ascii|) module supplies name constants for ASCII characters and
functions to test membership in various ASCII character classes. The constants
supplied are names for control characters as follows:
+--------------+----------------------------------------------+
| Name | Meaning |
+==============+==============================================+
| NUL | |
+--------------+----------------------------------------------+
| SOH | Start of heading, console interrupt |
+--------------+----------------------------------------------+
| STX | Start of text |
+--------------+----------------------------------------------+
| ETX | End of text |
+--------------+----------------------------------------------+
| EOT | End of transmission |
+--------------+----------------------------------------------+
| ENQ | Enquiry, goes with ACK flow control |
+--------------+----------------------------------------------+
| ACK | Acknowledgement |
+--------------+----------------------------------------------+
| BEL | Bell |
+--------------+----------------------------------------------+
| BS | Backspace |
+--------------+----------------------------------------------+
| TAB | Tab |
+--------------+----------------------------------------------+
| HT | Alias for TAB: "Horizontal tab" |
+--------------+----------------------------------------------+
| LF | Line feed |
+--------------+----------------------------------------------+
| NL | Alias for LF: "New line" |
+--------------+----------------------------------------------+
| VT | Vertical tab |
+--------------+----------------------------------------------+
| FF | Form feed |
+--------------+----------------------------------------------+
| CR | Carriage return |
+--------------+----------------------------------------------+
| SO | Shift-out, begin alternate character set |
+--------------+----------------------------------------------+
| SI | Shift-in, resume default character set |
+--------------+----------------------------------------------+
| DLE | Data-link escape |
+--------------+----------------------------------------------+
| DC1 | XON, for flow control |
+--------------+----------------------------------------------+
| DC2 | Device control 2, block-mode flow control |
+--------------+----------------------------------------------+
| DC3 | XOFF, for flow control |
+--------------+----------------------------------------------+
| DC4 | Device control 4 |
+--------------+----------------------------------------------+
| NAK | Negative acknowledgement |
+--------------+----------------------------------------------+
| SYN | Synchronous idle |
+--------------+----------------------------------------------+
| ETB | End transmission block |
+--------------+----------------------------------------------+
| CAN | Cancel |
+--------------+----------------------------------------------+
| EM | End of medium |
+--------------+----------------------------------------------+
| SUB | Substitute |
+--------------+----------------------------------------------+
| ESC | Escape |
+--------------+----------------------------------------------+
| FS | File separator |
+--------------+----------------------------------------------+
| GS | Group separator |
+--------------+----------------------------------------------+
| RS | Record separator, block-mode terminator |
+--------------+----------------------------------------------+
| US | Unit separator |
+--------------+----------------------------------------------+
| SP | Space |
+--------------+----------------------------------------------+
| DEL | Delete |
+--------------+----------------------------------------------+
Note that many of these have little practical significance in modern usage. The
mnemonics derive from teleprinter conventions that predate digital computers.
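The constants in the table are plain integers holding the character codes, and
the aliases share a value. A quick check (the variable names are illustrative
only):

```python
import curses.ascii

escape = curses.ascii.ESC           # integer character code, not a string
tab_aliases = (curses.ascii.TAB, curses.ascii.HT)
newline_aliases = (curses.ascii.LF, curses.ascii.NL)

# chr converts a code back into a one-character string.
bell = chr(curses.ascii.BEL)
```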
The module supplies the following functions, patterned on those in the standard
C library:
isalnum(c)~
Checks for an ASCII alphanumeric character; it is equivalent to ``isalpha(c) or
isdigit(c)``.
isalpha(c)~
Checks for an ASCII alphabetic character; it is equivalent to ``isupper(c) or
islower(c)``.
isascii(c)~
Checks for a character value that fits in the 7-bit ASCII set.
isblank(c)~
Checks for an ASCII whitespace character.
iscntrl(c)~
Checks for an ASCII control character (in the range 0x00 to 0x1f).
isdigit(c)~
Checks for an ASCII decimal digit, ``'0'`` through ``'9'``. This is equivalent
to ``c in string.digits``.
isgraph(c)~
Checks for any printable ASCII character except space.
islower(c)~
Checks for an ASCII lower-case character.
isprint(c)~
Checks for any ASCII printable character including space.
ispunct(c)~
Checks for any printable ASCII character which is not a space or an alphanumeric
character.
isspace(c)~
Checks for ASCII white-space characters: space, line feed, carriage return, form
feed, horizontal tab, vertical tab.
isupper(c)~
Checks for an ASCII uppercase letter.
isxdigit(c)~
Checks for an ASCII hexadecimal digit. This is equivalent to ``c in
string.hexdigits``.
isctrl(c)~
Checks for an ASCII control character (ordinal values 0 to 31).
ismeta(c)~
Checks for a non-ASCII character (ordinal values 0x80 and above).
These functions accept either integers or strings; when the argument is a
string, it is first converted using the built-in function ord.
Note that all these functions check ordinal bit values derived from the first
character of the string you pass in; they do not actually know anything about
the host machine's character encoding. For functions that know about the
character encoding (and handle internationalization properly) see the
string (|py2stdlib-string|) module.
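As a sketch of how the membership tests behave with string and integer
arguments, the helper below collects the names of the tests that succeed for a
given character. The function classify and the selection of tests are
assumptions made for this example, not part of the module.

```python
import curses.ascii as ca

# A selection of the membership tests described above.
TESTS = ("isalpha", "isdigit", "isxdigit", "islower",
         "isupper", "ispunct", "isspace")

def classify(c):
    """Return the names of the tests in TESTS that accept c."""
    return [name for name in TESTS if getattr(ca, name)(c)]

letter = classify("a")              # one-character string
digit = classify("7")
bang = classify(ord("!"))           # integer code works as well
space = classify(" ")
```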
The following three functions take either a single-character string or integer
byte value; they return a value of the same type.
ascii(c)~
Return the ASCII value corresponding to the low 7 bits of {c}.
ctrl(c)~
Return the control character corresponding to the given character (the character
bit value is bitwise-anded with 0x1f).
alt(c)~
Return the 8-bit character corresponding to the given ASCII character (the
character bit value is bitwise-ored with 0x80).
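The bit operations can be illustrated as follows; note how the return type
follows the argument type. The variable names are chosen for the example only.

```python
import curses.ascii as ca

ctrl_c = ca.ctrl("c")               # string in, string out
ctrl_c_code = ca.ctrl(ord("c"))     # integer in, integer out: 0x63 & 0x1f
stripped = ca.ascii(0xC1)           # low 7 bits of 0xC1
meta_a = ca.alt(ord("a"))           # 0x61 | 0x80
```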
The following function takes either a single-character string or integer value;
it returns a string.
unctrl(c)~
Return a string representation of the ASCII character {c}. If {c} is printable,
this string is the character itself. If the character is a control character
(0x00-0x1f) the string consists of a caret (``'^'``) followed by the
corresponding uppercase letter. If the character is an ASCII delete (0x7f) the
string is ``'^?'``. If the character has its meta bit (0x80) set, the meta bit
is stripped, the preceding rules applied, and ``'!'`` prepended to the result.
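A few sample calls covering each rule above (the variable names are
illustrative):

```python
import curses.ascii as ca

printable = ca.unctrl("a")          # printable: the character itself
control = ca.unctrl("\x03")         # control character: caret notation
delete = ca.unctrl(0x7f)            # ASCII delete
meta_control = ca.unctrl(0x83)      # meta bit set on a control character
meta_printable = ca.unctrl(0xE1)    # meta bit set on "a"
```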
controlnames~
A 33-element string array that contains the ASCII mnemonics for the thirty-two
ASCII control characters from 0 (NUL) to 0x1f (US), in order, plus the mnemonic
``SP`` for the space character.
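Because the array is ordered by character code, the code itself can be used as
an index. A minimal sketch:

```python
import curses.ascii as ca

count = len(ca.controlnames)
first = ca.controlnames[0]              # mnemonic for code 0
escape = ca.controlnames[ca.ESC]        # index with a constant from the table
last = ca.controlnames[ca.SP]           # the extra entry for space
```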
==============================================================================
*py2stdlib-curses.panel*
curses.panel~
:synopsis: A panel stack extension that adds depth to curses windows.
Panels are windows with the added feature of depth, so they can be stacked on
top of each other, and only the visible portions of each window will be
displayed. Panels can be added, moved up or down in the stack, and removed.
Functions
---------
The module curses.panel (|py2stdlib-curses.panel|) defines the following functions:
bottom_panel()~
Returns the bottom panel in the panel stack.
new_panel(win)~
Returns a panel object, associating it with the given window {win}. Be aware
that you need to keep the returned panel object referenced explicitly. If you
don't, the panel object is garbage collected and removed from the panel stack.
top_panel()~
Returns the top panel in the panel stack.
update_panels()~
Updates the virtual screen after changes in the panel stack. This does not call
curses.doupdate, so you'll have to do this yourself.
Panel Objects
-------------
Panel objects, as returned by new_panel above, are windows with a
stacking order. There's always a window associated with a panel which determines
the content, while the panel methods are responsible for the window's depth in
the panel stack.
Panel objects have the following methods:
Panel.above()~
Returns the panel above the current panel.
Panel.below()~
Returns the panel below the current panel.
Panel.bottom()~
Push the panel to the bottom of the stack.
Panel.hidden()~
Returns true if the panel is hidden (not visible), false otherwise.
Panel.hide()~
Hide the panel. This does not delete the object, it just makes the window on
screen invisible.
Panel.move(y, x)~
Move the panel to the screen coordinates ``(y, x)``.
Panel.replace(win)~
Change the window associated with the panel to the window {win}.
Panel.set_userptr(obj)~
Set the panel's user pointer to {obj}. This is used to associate an arbitrary
piece of data with the panel, and can be any Python object.
Panel.show()~
Display the panel (which might have been hidden).
Panel.top()~
Push panel to the top of the stack.
Panel.userptr()~
Returns the user pointer for the panel. This might be any Python object.
Panel.window()~
Returns the window object associated with the panel.
==============================================================================
*py2stdlib-curses*
curses~
:synopsis: An interface to the curses library, providing portable terminal
handling.
:platform: Unix
.. versionchanged:: 1.6
Added support for the ``ncurses`` library and converted to a package.
The curses (|py2stdlib-curses|) module provides an interface to the curses library, the
de-facto standard for portable advanced terminal handling.
While curses is most widely used in the Unix environment, versions are available
for DOS, OS/2, and possibly other systems as well. This extension module is
designed to match the API of ncurses, an open-source curses library hosted on
Linux and the BSD variants of Unix.
.. note::
Since version 5.4, the ncurses library decides how to interpret non-ASCII data
using the ``nl_langinfo`` function. That means that you have to call
locale.setlocale in the application and encode Unicode strings
using one of the system's available encodings. This example uses the
system's default encoding:: >
import locale
locale.setlocale(locale.LC_ALL, '')
code = locale.getpreferredencoding()
<
Then use {code} as the encoding for str.encode calls.
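A slightly fuller sketch of the same steps, including the encode call. The
sample text, the "replace" error handler and the fallback on locale.Error are
choices made for this example, not requirements of curses.

```python
import locale

try:
    locale.setlocale(locale.LC_ALL, '')
except locale.Error:
    # The environment names a locale this system does not provide;
    # the default "C" locale stays in effect.
    pass
code = locale.getpreferredencoding()

text = u"caf\u00e9"
# "replace" substitutes characters the encoding cannot represent.
encoded = text.encode(code, "replace")
```

The resulting byte string is what would then be passed to window methods such
as addstr.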
.. seealso::
Module curses.ascii (|py2stdlib-curses.ascii|)
Utilities for working with ASCII characters, regardless of your locale settings.
Module curses.panel (|py2stdlib-curses.panel|)
A panel stack extension that adds depth to curses windows.
Module curses.textpad (|py2stdlib-curses.textpad|)
Editable text widget for curses supporting Emacs-like bindings.
Module curses.wrapper (|py2stdlib-curses.wrapper|)
Convenience function to ensure proper terminal setup and resetting on
application entry and exit.
curses-howto
Tutorial material on using curses with Python, by Andrew Kuchling and Eric
Raymond.
The Demo/curses/ directory in the Python source distribution contains
some example programs using the curses bindings provided by this module.
Functions
---------
The module curses (|py2stdlib-curses|) defines the following exception:
error~
Exception raised when a curses library function returns an error.
.. note::
Whenever {x} or {y} arguments to a function or a method are optional, they
default to the current cursor location. Whenever {attr} is optional, it defaults
to A_NORMAL.
The module curses (|py2stdlib-curses|) defines the following functions:
baudrate()~
Returns the output speed of the terminal in bits per second. On software
terminal emulators it will have a fixed high value. Included for historical
reasons; in former times, it was used to write output loops for time delays and
occasionally to change interfaces depending on the line speed.
beep()~
Emit a short attention sound.
can_change_color()~
Returns true or false, depending on whether the programmer can change the colors
displayed by the terminal.
cbreak()~
Enter cbreak mode. In cbreak mode (sometimes called "rare" mode) normal tty
line buffering is turned off and characters are available to be read one by one.
However, unlike raw mode, special characters (interrupt, quit, suspend, and flow
control) retain their effects on the tty driver and calling program. Calling
first raw then cbreak leaves the terminal in cbreak mode.
color_content(color_number)~
Returns the intensity of the red, green, and blue (RGB) components in the color
{color_number}, which must be between ``0`` and COLORS. A 3-tuple is
returned, containing the R,G,B values for the given color, which will be between
``0`` (no component) and ``1000`` (maximum amount of component).
color_pair(color_number)~
Returns the attribute value for displaying text in the specified color. This
attribute value can be combined with A_STANDOUT, A_REVERSE,
and the other A_* attributes. pair_number is the counterpart
to this function.
curs_set(visibility)~
Sets the cursor state. {visibility} can be set to 0, 1, or 2, for invisible,
normal, or very visible. If the terminal supports the visibility requested, the
previous cursor state is returned; otherwise, an exception is raised. On many
terminals, the "visible" mode is an underline cursor and the "very visible" mode
is a block cursor.
def_prog_mode()~
Saves the current terminal mode as the "program" mode, the mode when the running
program is using curses. (Its counterpart is the "shell" mode, for when the
program is not in curses.) Subsequent calls to reset_prog_mode will
restore this mode.
def_shell_mode()~
Saves the current terminal mode as the "shell" mode, the mode when the running
program is not using curses. (Its counterpart is the "program" mode, when the
program is using curses capabilities.) Subsequent calls to
reset_shell_mode will restore this mode.
delay_output(ms)~
Inserts an {ms} millisecond pause in output.
doupdate()~
Update the physical screen. The curses library keeps two data structures, one
representing the current physical screen contents and a virtual screen
representing the desired next state. The doupdate function updates the
physical screen to match the virtual screen.
The virtual screen may be updated by a noutrefresh call after write
operations such as addstr have been performed on a window. The normal
refresh call is simply noutrefresh followed by doupdate;
if you have to update multiple windows, you can speed performance and perhaps
reduce screen flicker by issuing noutrefresh calls on all windows,
followed by a single doupdate.
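The batching pattern described above can be sketched as a small helper. The
name refresh_all and its doupdate parameter are hypothetical; the parameter
exists only so that the helper can be exercised without a terminal, and by
default it is curses.doupdate.

```python
import curses

def refresh_all(windows, doupdate=curses.doupdate):
    """Stage every window on the virtual screen, then repaint once."""
    for win in windows:
        win.noutrefresh()           # updates the virtual screen only
    doupdate()                      # one update of the physical screen
```

In a running curses program this would be called as
``refresh_all([header, body, status])`` in place of three separate refresh
calls, where the three names stand for window objects created by the program.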
echo()~
Enter echo mode. In echo mode, each character input is echoed to the screen as
it is entered.
endwin()~
De-initialize the library, and return terminal to normal status.
erasechar()~
Returns the user's current erase character. Under Unix operating systems this
is a property of the controlling tty of the curses program, and is not set by
the curses library itself.
filter()~
The filter routine, if used, must be called before initscr is
called. The effect is that, during those calls, LINES is set to 1; the
capabilities clear, cup, cud, cud1, cuu1, cuu, vpa are disabled; and the home
string is set to the value of cr. The effect is that the cursor is confined to
the current line, and so are screen updates. This may be used for enabling
character-at-a-time line editing without touching the rest of the screen.
flash()~
Flash the screen. That is, change it to reverse-video and then change it back
in a short interval. Some people prefer such a 'visible bell' to the audible
attention signal produced by beep.
flushinp()~
Flush all input buffers. This throws away any typeahead that has been typed
by the user and has not yet been processed by the program.
getmouse()~
After getch returns KEY_MOUSE to signal a mouse event, this
method should be called to retrieve the queued mouse event, represented as a
5-tuple ``(id, x, y, z, bstate)``. {id} is an ID value used to distinguish
multiple devices, and {x}, {y}, {z} are the event's coordinates. ({z} is
currently unused.) {bstate} is an integer value whose bits will be set to
indicate the type of event, and will be the bitwise OR of one or more of the
following constants, where {n} is the button number from 1 to 4:
BUTTONn_PRESSED, BUTTONn_RELEASED, BUTTONn_CLICKED,
BUTTONn_DOUBLE_CLICKED, BUTTONn_TRIPLE_CLICKED,
BUTTON_SHIFT, BUTTON_CTRL, BUTTON_ALT.
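Decoding {bstate} amounts to testing it against each constant. The helper
below is a sketch under the assumption that the curses build exposes the mouse
constants; describe_bstate and EVENT_NAMES are hypothetical names, and
constants that are missing from the module are simply skipped.

```python
import curses

# Names of the constants listed above, for buttons 1 to 4.
EVENT_NAMES = ["BUTTON%d_%s" % (n, kind)
               for n in range(1, 5)
               for kind in ("PRESSED", "RELEASED", "CLICKED",
                            "DOUBLE_CLICKED", "TRIPLE_CLICKED")]
EVENT_NAMES += ["BUTTON_SHIFT", "BUTTON_CTRL", "BUTTON_ALT"]

def describe_bstate(bstate):
    """Return the names of all event constants whose bits are set."""
    return [name for name in EVENT_NAMES
            if bstate & getattr(curses, name, 0)]
```

A typical use would be ``describe_bstate(curses.getmouse()[4])`` after getch
has returned KEY_MOUSE.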
getsyx()~
Returns the current coordinates of the virtual screen cursor in y and x. If
leaveok is currently true, then -1,-1 is returned.
getwin(file)~
Reads window related data stored in the file by an earlier putwin call.
The routine then creates and initializes a new window using that data, returning
the new window object.
has_colors()~
Returns true if the terminal can display colors; otherwise, it returns false.
has_ic()~
Returns true if the terminal has insert- and delete-character capabilities.
This function is included for historical reasons only, as all modern software
terminal emulators have such capabilities.
has_il()~
Returns true if the terminal has insert- and delete-line capabilities, or can
simulate them using scrolling regions. This function is included for
historical reasons only, as all modern software terminal emulators have such
capabilities.
has_key(ch)~
Takes a key value {ch}, and returns true if the current terminal type recognizes
a key with that value.
halfdelay(tenths)~
Used for half-delay mode, which is similar to cbreak mode in that characters
typed by the user are immediately available to the program. However, after
blocking for {tenths} tenths of seconds, an exception is raised if nothing has
been typed. The value of {tenths} must be a number between 1 and 255. Use
nocbreak to leave half-delay mode.
init_color(color_number, r, g, b)~
Changes the definition of a color, taking the number of the color to be changed
followed by three RGB values (for the amounts of red, green, and blue
components). The value of {color_number} must be between ``0`` and
COLORS. Each of {r}, {g}, {b}, must be a value between ``0`` and
``1000``. When init_color is used, all occurrences of that color on the
screen immediately change to the new definition. This function is a no-op on
most terminals; it is active only if can_change_color returns ``1``.
init_pair(pair_number, fg, bg)~
Changes the definition of a color-pair. It takes three arguments: the number of
the color-pair to be changed, the foreground color number, and the background
color number. The value of {pair_number} must be between ``1`` and
``COLOR_PAIRS - 1`` (the ``0`` color pair is wired to white on black and cannot
be changed). The value of {fg} and {bg} arguments must be between ``0`` and
COLORS. If the color-pair was previously initialized, the screen is
refreshed and all occurrences of that color-pair are changed to the new
definition.
initscr()~
Initialize the library. Returns a WindowObject which represents the
whole screen.
.. note:: >
If there is an error opening the terminal, the underlying curses library may
cause the interpreter to exit.
<
isendwin()~
Returns true if endwin has been called (that is, the curses library has
been deinitialized).
keyname(k)~
Return the name of the key numbered {k}. The name of a key generating a
printable ASCII character is the key's character. The name of a control-key
combination
is a two-character string consisting of a caret followed by the corresponding
printable ASCII character. The name of an alt-key combination (128-255) is a
string consisting of the prefix 'M-' followed by the name of the corresponding
ASCII character.
killchar()~
Returns the user's current line kill character. Under Unix operating systems
this is a property of the controlling tty of the curses program, and is not set
by the curses library itself.
longname()~
Returns a string containing the terminfo long name field describing the current
terminal. The maximum length of a verbose description is 128 characters. It is
defined only after the call to initscr.
meta(yes)~
If {yes} is 1, allow 8-bit characters to be input. If {yes} is 0, allow only
7-bit chars.
mouseinterval(interval)~
Sets the maximum time in milliseconds that can elapse between press and release
events in order for them to be recognized as a click, and returns the previous
interval value. The default value is 200 msec, or one fifth of a second.
mousemask(mousemask)~
Sets the mouse events to be reported, and returns a tuple ``(availmask,
oldmask)``. {availmask} indicates which of the specified mouse events can be
reported; on complete failure it returns 0. {oldmask} is the previous value of
the given window's mouse event mask. If this function is never called, no mouse
events are ever reported.
napms(ms)~
Sleep for {ms} milliseconds.
newpad(nlines, ncols)~
Creates and returns a pointer to a new pad data structure with the given number
of lines and columns. A pad is returned as a window object.
A pad is like a window, except that it is not restricted by the screen size, and
is not necessarily associated with a particular part of the screen. Pads can be
used when a large window is needed, and only a part of the window will be on the
screen at one time. Automatic refreshes of pads (such as from scrolling or
echoing of input) do not occur. The refresh and noutrefresh
methods of a pad require 6 arguments to specify the part of the pad to be
displayed and the location on the screen to be used for the display. The
arguments are pminrow, pmincol, sminrow, smincol, smaxrow, smaxcol; the p
arguments refer to the upper left corner of the pad region to be displayed and
the s arguments define a clipping box on the screen within which the pad region
is to be displayed.
newwin([nlines, ncols,] begin_y, begin_x)~
Return a new window, whose left-upper corner is at ``(begin_y, begin_x)``, and
whose height/width is {nlines}/{ncols}.
By default, the window will extend from the specified position to the lower
right corner of the screen.
nl()~
Enter newline mode. This mode translates the return key into newline on input,
and translates newline into return and line-feed on output. Newline mode is
initially on.
nocbreak()~
Leave cbreak mode. Return to normal "cooked" mode with line buffering.
noecho()~
Leave echo mode. Echoing of input characters is turned off.
nonl()~
Leave newline mode. Disable translation of return into newline on input, and
disable low-level translation of newline into newline/return on output (but this
does not change the behavior of ``addch('\n')``, which always does the
equivalent of return and line feed on the virtual screen). With translation
off, curses can sometimes speed up vertical motion a little; also, it will be
able to detect the return key on input.
noqiflush()~
When the noqiflush routine is used, normal flush of input and output queues
associated with the INTR, QUIT and SUSP characters will not be done. You may
want to call noqiflush in a signal handler if you want output to
continue as though the interrupt had not occurred, after the handler exits.
noraw()~
Leave raw mode. Return to normal "cooked" mode with line buffering.
pair_content(pair_number)~
Returns a tuple ``(fg, bg)`` containing the colors for the requested color pair.
The value of {pair_number} must be between ``1`` and ``COLOR_PAIRS - 1``.
pair_number(attr)~
Returns the number of the color-pair set by the attribute value {attr}.
color_pair is the counterpart to this function.
putp(string)~
Equivalent to ``tputs(str, 1, putchar)``; emits the value of a specified
terminfo capability for the current terminal. Note that the output of putp
always goes to standard output.
qiflush( [flag] )~
If {flag} is false, the effect is the same as calling noqiflush. If
{flag} is true, or no argument is provided, the queues will be flushed when
these control characters are read.
raw()~
Enter raw mode. In raw mode, normal line buffering and processing of
interrupt, quit, suspend, and flow control keys are turned off; characters are
presented to curses input functions one by one.
reset_prog_mode()~
Restores the terminal to "program" mode, as previously saved by
def_prog_mode.
reset_shell_mode()~
Restores the terminal to "shell" mode, as previously saved by
def_shell_mode.
setsyx(y, x)~
Sets the virtual screen cursor to {y}, {x}. If {y} and {x} are both -1, then
leaveok is set.
setupterm([termstr, fd])~
Initializes the terminal. {termstr} is a string giving the terminal name; if
omitted, the value of the TERM environment variable will be used. {fd} is the
file descriptor to which any initialization sequences will be sent; if not
supplied, the file descriptor for ``sys.stdout`` will be used.
start_color()~
Must be called if the programmer wants to use colors, and before any other color
manipulation routine is called. It is good practice to call this routine right
after initscr.
start_color initializes eight basic colors (black, red, green, yellow,
blue, magenta, cyan, and white), and two global variables in the curses (|py2stdlib-curses|)
module, COLORS and COLOR_PAIRS, containing the maximum number
of colors and color-pairs the terminal can support. It also restores the colors
on the terminal to the values they had when the terminal was just turned on.
termattrs()~
Returns a logical OR of all video attributes supported by the terminal. This
information is useful when a curses program needs complete control over the
appearance of the screen.
termname()~
Returns the value of the environment variable TERM, truncated to 14 characters.
tigetflag(capname)~
Returns the value of the Boolean capability corresponding to the terminfo
capability name {capname}. The value ``-1`` is returned if {capname} is not a
Boolean capability, or ``0`` if it is canceled or absent from the terminal
description.
tigetnum(capname)~
Returns the value of the numeric capability corresponding to the terminfo
capability name {capname}. The value ``-2`` is returned if {capname} is not a
numeric capability, or ``-1`` if it is canceled or absent from the terminal
description.
tigetstr(capname)~
Returns the value of the string capability corresponding to the terminfo
capability name {capname}. ``None`` is returned if {capname} is not a string
capability, or is canceled or absent from the terminal description.
tparm(str[,...])~
Instantiates the string {str} with the supplied parameters, where {str} should
be a parameterized string obtained from the terminfo database. E.g.
``tparm(tigetstr("cup"), 5, 3)`` could result in ``'\033[6;4H'``, the exact
result depending on terminal type.
typeahead(fd)~
Specifies that the file descriptor {fd} be used for typeahead checking. If {fd}
is ``-1``, then no typeahead checking is done.
The curses library does "line-breakout optimization" by looking for typeahead
periodically while updating the screen. If input is found, and it is coming
from a tty, the current update is postponed until refresh or doupdate is called
again, allowing faster response to commands typed in advance. This function
allows specifying a different file descriptor for typeahead checking.
unctrl(ch)~
Returns a string which is a printable representation of the character {ch}.
Control characters are displayed as a caret followed by the character, for
example as ``^C``. Printing characters are left as they are.
ungetch(ch)~
Push {ch} so the next getch will return it.
.. note:: >
Only one {ch} can be pushed before getch is called.
<
ungetmouse(id, x, y, z, bstate)~
Push a KEY_MOUSE event onto the input queue, associating the given
state data with it.
use_env(flag)~
If used, this function should be called before initscr or newterm are
called. When {flag} is false, the values of lines and columns specified in the
terminfo database will be used, even if environment variables LINES
and COLUMNS (used by default) are set, or if curses is running in a
window (in which case default behavior would be to use the window size if
LINES and COLUMNS are not set).
use_default_colors()~
Allow use of default values for colors on terminals supporting this feature. Use
this to support transparency in your application. The default color is assigned
to the color number -1. After calling this function, ``init_pair(x,
curses.COLOR_RED, -1)`` initializes, for instance, color pair {x} to a red
foreground color on the default background.
Window Objects
--------------
Window objects, as returned by initscr and newwin above, have
the following methods:
window.addch([y, x,] ch[, attr])~
.. note:: >
A {character} means a C character (an ASCII code), rather than a Python
character (a string of length 1). (This note is true whenever the
documentation mentions a character.) The built-in ord is handy for
converting strings to codes.
<
Paint character {ch} at ``(y, x)`` with attributes {attr}, overwriting any
character previously painted at that location. By default, the character
position and attributes are the current settings for the window object.
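As a minimal sketch of the note above, the helper below converts a
one-character string to its integer code with ord and ORs in an attribute.
The name ``cell_value`` is hypothetical and not part of curses.

```python
import curses

def cell_value(char, attr=0):
    """Return the integer curses expects for a one-character string.

    cell_value is an illustrative helper, not part of the curses module.
    """
    if len(char) != 1:
        raise ValueError("expected a string of length 1")
    # ord converts the Python character to its C character code;
    # attributes live in the higher bits, so they can be OR'ed in.
    return ord(char) | attr
```

Inside a running curses program this could be used as
``win.addch(0, 0, cell_value('x', curses.A_BOLD))``.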
window.addnstr([y, x,] str, n[, attr])~
Paint at most {n} characters of the string {str} at ``(y, x)`` with attributes
{attr}, overwriting anything previously on the display.
window.addstr([y, x,] str[, attr])~
Paint the string {str} at ``(y, x)`` with attributes {attr}, overwriting
anything previously on the display.
window.attroff(attr)~
Remove attribute {attr} from the "background" set applied to all writes to the
current window.
window.attron(attr)~
Add attribute {attr} to the "background" set applied to all writes to the
current window.
window.attrset(attr)~
Set the "background" set of attributes to {attr}. This set is initially 0 (no
attributes).
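One possible way to combine attron and attroff is to switch an attribute on
for a single write and restore the previous state afterwards. This is only a
sketch: ``write_with_attr`` is a hypothetical helper, and {win} is assumed to
be any window object.

```python
def write_with_attr(win, text, attr):
    """Write text with attr switched on, then switch it off again.

    write_with_attr is a hypothetical helper; win is any curses window.
    """
    win.attron(attr)
    try:
        win.addstr(text)
    finally:
        # Always remove the attribute again, even if addstr raises
        # (for example when writing past the edge of the window).
        win.attroff(attr)
```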
window.bkgd(ch[, attr])~
Sets the background property of the window to the character {ch}, with
attributes {attr}. The change is then applied to every character position in
that window:
* The attribute of every character in the window is changed to the new
background attribute.
* Wherever the former background character appears, it is changed to the new
background character.
window.bkgdset(ch[, attr])~
Sets the window's background. A window's background consists of a character and
any combination of attributes. The attribute part of the background is combined
(OR'ed) with all non-blank characters that are written into the window. Both
the character and attribute parts of the background are combined with the blank
characters. The background becomes a property of the character and moves with
the character through any scrolling and insert/delete line/character operations.
window.border([ls[, rs[, ts[, bs[, tl[, tr[, bl[, br]]]]]]]])~
Draw a border around the edges of the window. Each parameter specifies the
character to use for a specific part of the border; see the table below for more
details. The characters can be specified as integers or as one-character
strings.
.. note:: >
A ``0`` value for any parameter will cause the default character to be used for
that parameter. Keyword parameters can {not} be used. The defaults are listed
in this table:
<
+-----------+---------------------+-----------------------+
| Parameter | Description | Default value |
+===========+=====================+=======================+
| {ls} | Left side | ACS_VLINE |
+-----------+---------------------+-----------------------+
| {rs} | Right side | ACS_VLINE |
+-----------+---------------------+-----------------------+
| {ts} | Top | ACS_HLINE |
+-----------+---------------------+-----------------------+
| {bs} | Bottom | ACS_HLINE |
+-----------+---------------------+-----------------------+
| {tl} | Upper-left corner | ACS_ULCORNER |
+-----------+---------------------+-----------------------+
| {tr} | Upper-right corner | ACS_URCORNER |
+-----------+---------------------+-----------------------+
| {bl} | Bottom-left corner | ACS_LLCORNER |
+-----------+---------------------+-----------------------+
| {br} | Bottom-right corner | ACS_LRCORNER |
+-----------+---------------------+-----------------------+
window.box([vertch, horch])~
Similar to border, but both {ls} and {rs} are {vertch} and both {ts} and
{bs} are {horch}. The default corner characters are always used by this function.
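A small sketch of how box might be used to frame a window and label it.
``draw_frame`` is a hypothetical helper, and the choice of column 2 for the
title is an assumption made only to keep the corner characters visible.

```python
def draw_frame(win, title):
    """Draw the default border and put a title on the top edge.

    draw_frame is a hypothetical helper; win is any curses window.
    """
    # With no arguments, box uses the default line characters.
    win.box()
    height, width = win.getmaxyx()
    # Keep the corners intact: start at column 2 and clip the title.
    room = max(0, width - 4)
    if room:
        win.addnstr(0, 2, title, room)
```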
window.chgat([y, x, ] [num,] attr)~
Sets the attributes of {num} characters at the current cursor position, or at
position ``(y, x)`` if supplied. If no value of {num} is given or {num} = -1,
the attribute will be set on all the characters to the end of the line. This
function does not move the cursor. The changed line will be touched using the
touchline method so that the contents will be redisplayed by the next
window refresh.
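For instance, a whole row could be highlighted by passing ``-1`` as {num}.
The helper name ``highlight_row`` is hypothetical; this is a sketch, not a
prescribed usage.

```python
def highlight_row(win, row, attr):
    """Apply attr to every cell of row, from column 0 to the line end.

    highlight_row is a hypothetical helper; win is any curses window.
    """
    # num = -1 means "to the end of the line". chgat does not move
    # the cursor, so later writes continue where they left off.
    win.chgat(row, 0, -1, attr)
```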
window.clear()~
Like erase, but also causes the whole window to be repainted upon next
call to refresh.
window.clearok(yes)~
If {yes} is 1, the next call to refresh will clear the window
completely.
window.clrtobot()~
Erase from cursor to the end of the window: all lines below the cursor are
deleted, and then the equivalent of clrtoeol is performed.
window.clrtoeol()~
Erase from cursor to the end of the line.
window.cursyncup()~
Updates the current cursor position of all the ancestors of the window to
reflect the current cursor position of the window.
window.delch([y, x])~
Delete any character at ``(y, x)``.
window.deleteln()~
Delete the line under the cursor. All following lines are moved up by 1 line.
window.derwin([nlines, ncols,] begin_y, begin_x)~
An abbreviation for "derive window", derwin is the same as calling
subwin, except that {begin_y} and {begin_x} are relative to the origin
of the window, rather than relative to the entire screen. Returns a window
object for the derived window.
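Because the coordinates are relative to the window itself, derwin is
convenient for carving out the area inside a border. The following sketch
assumes a one-cell margin; ``inner_window`` is a hypothetical name.

```python
def inner_window(win):
    """Return a derived window covering win minus a one-cell margin.

    inner_window is a hypothetical helper; win is any curses window.
    """
    height, width = win.getmaxyx()
    if height < 3 or width < 3:
        raise ValueError("window too small to hold an inner window")
    # (1, 1) is relative to the origin of win, not to the screen.
    return win.derwin(height - 2, width - 2, 1, 1)
```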
window.echochar(ch[, attr])~
Add character {ch} with attribute {attr}, and immediately call refresh
on the window.
window.enclose(y, x)~
Tests whether the given pair of screen-relative character-cell coordinates are
enclosed by the given window, returning true or false. It is useful for
determining what subset of the screen windows enclose the location of a mouse
event.
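A sketch of that use: given a list of windows and a screen position, collect
the windows that contain it. ``windows_at`` is a hypothetical helper. Note
that getmouse reports {x} before {y}, while enclose expects ``(y, x)``.

```python
def windows_at(windows, y, x):
    """Return the windows whose area contains screen cell (y, x).

    windows_at is a hypothetical helper; windows is any iterable of
    curses window objects.
    """
    return [win for win in windows if win.enclose(y, x)]
```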
window.erase()~
Clear the window.
window.getbegyx()~
Return a tuple ``(y, x)`` of co-ordinates of upper-left corner.
window.getch([y, x])~
Get a character. Note that the integer returned does {not} have to be in ASCII
range: function keys, keypad keys and so on return numbers higher than 256. In
no-delay mode, -1 is returned if there is no input, else getch waits
until a key is pressed.
window.getkey([y, x])~
Get a character, returning a string instead of an integer, as getch
does. Function keys, keypad keys and so on return a multibyte string containing
the key name. In no-delay mode, an exception is raised if there is no input.
window.getmaxyx()~
Return a tuple ``(y, x)`` of the height and width of the window.
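The returned width is commonly used to position text. A minimal sketch,
assuming a hypothetical helper named ``add_centered``:

```python
def add_centered(win, row, text):
    """Write text horizontally centered on the given row of win.

    add_centered is a hypothetical helper; win is any curses window.
    """
    height, width = win.getmaxyx()
    # Integer division keeps the column an int; never go left of column 0.
    col = max(0, (width - len(text)) // 2)
    # addnstr clips the text to the cells remaining on the row.
    win.addnstr(row, col, text, width - col)
```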
window.getparyx()~
Returns the beginning coordinates of this window relative to its parent window
as a tuple ``(y, x)``. Returns ``(-1, -1)`` if this window has no
parent.
window.getstr([y, x])~
Read a string from the user, with primitive line editing capacity.
window.getyx()~
Return a tuple ``(y, x)`` of current cursor position relative to the window's
upper-left corner.
window.hline([y, x,] ch, n)~
Display a horizontal line starting at ``(y, x)`` with length {n} consisting of
the character {ch}.
window.idcok(flag)~
If {flag} is false, curses no longer considers using the hardware insert/delete
character feature of the terminal; if {flag} is true, use of character insertion
and deletion is enabled. When curses is first initialized, use of character
insert/delete is enabled by default.
window.idlok(yes)~
If called with {yes} equal to 1, curses (|py2stdlib-curses|) will try to use hardware line
editing facilities. Otherwise, line insertion/deletion is disabled.
window.immedok(flag)~
If {flag} is true, any change in the window image automatically causes the
window to be refreshed; you no longer have to call refresh yourself.
However, it may degrade performance considerably, due to repeated calls to
wrefresh. This option is disabled by default.
window.inch([y, x])~
Return the character at the given position in the window. The bottom 8 bits are
the character proper, and upper bits are the attributes.
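Following the description above, the value can be taken apart with bit
masks. This sketch uses the literal mask ``0xFF`` for the bottom 8 bits;
``split_cell`` is a hypothetical name.

```python
def split_cell(value):
    """Split an inch() result into (character, attributes).

    split_cell is a hypothetical helper, not part of curses.
    """
    char = chr(value & 0xFF)   # bottom 8 bits: the character proper
    attrs = value & ~0xFF      # remaining upper bits: the attributes
    return char, attrs
```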
window.insch([y, x,] ch[, attr])~
Paint character {ch} at ``(y, x)`` with attributes {attr}, moving the line from
position {x} right by one character.
window.insdelln(nlines)~
Inserts {nlines} lines into the specified window above the current line. The
{nlines} bottom lines are lost. For negative {nlines}, delete {nlines} lines
starting with the one under the cursor, and move the remaining lines up. The
bottom {nlines} lines are cleared. The current cursor position remains the
same.
window.insertln()~
Insert a blank line under the cursor. All following lines are moved down by 1
line.
window.insnstr([y, x,] str, n [, attr])~
Insert a character string (as many characters as will fit on the line) before
the character under the cursor, up to {n} characters. If {n} is zero or
negative, the entire string is inserted. All characters to the right of the
cursor are shifted right, with the rightmost characters on the line being lost.
The cursor position does not change (after moving to {y}, {x}, if specified).
window.insstr([y, x, ] str [, attr])~
Insert a character string (as many characters as will fit on the line) before
the character under the cursor. All characters to the right of the cursor are
shifted right, with the rightmost characters on the line being lost. The cursor
position does not change (after moving to {y}, {x}, if specified).
window.instr([y, x] [, n])~
Returns a string of characters, extracted from the window starting at the
current cursor position, or at {y}, {x} if specified. Attributes are stripped
from the characters. If {n} is specified, instr returns a string
at most {n} characters long (exclusive of the trailing NUL).
window.is_linetouched(line)~
Returns true if the specified line was modified since the last call to
refresh; otherwise returns false. Raises a curses.error
exception if {line} is not valid for the given window.
window.is_wintouched()~
Returns true if the specified window was modified since the last call to
refresh; otherwise returns false.
window.keypad(yes)~
If {yes} is 1, escape sequences generated by some keys (keypad, function keys)
will be interpreted by curses (|py2stdlib-curses|). If {yes} is 0, escape sequences will be
left as is in the input stream.
window.leaveok(yes)~
If {yes} is 1, cursor is left where it is on update, instead of being at "cursor
position." This reduces cursor movement where possible. If possible the cursor
will be made invisible.
If {yes} is 0, cursor will always be at "cursor position" after an update.
window.move(new_y, new_x)~
Move cursor to ``(new_y, new_x)``.
window.mvderwin(y, x)~
Moves the window inside its parent window. The screen-relative parameters of
the window are not changed. This routine is used to display different parts of
the parent window at the same physical position on the screen.
window.mvwin(new_y, new_x)~
Move the window so its upper-left corner is at ``(new_y, new_x)``.
window.nodelay(yes)~
If {yes} is ``1``, getch will be non-blocking.
window.notimeout(yes)~
If {yes} is ``1``, escape sequences will not be timed out.
If {yes} is ``0``, after a few milliseconds, an escape sequence will not be
interpreted, and will be left in the input stream as is.
window.noutrefresh()~
Mark for refresh but wait. This function updates the data structure
representing the desired state of the window, but does not force an update of
the physical screen. To accomplish that, call doupdate.
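When several windows change at once, a common pattern is to stage each of
them with noutrefresh and then call doupdate a single time. A sketch under
that assumption follows; ``refresh_all`` and its {doupdate} parameter are
hypothetical and exist only to make the helper easy to test.

```python
import curses

def refresh_all(windows, doupdate=None):
    """Stage every window, then update the physical screen once.

    refresh_all is a hypothetical helper. doupdate is injectable for
    testing and defaults to curses.doupdate.
    """
    for win in windows:
        win.noutrefresh()      # update the desired-state data only
    if doupdate is None:
        doupdate = curses.doupdate
    doupdate()                 # one physical update for all windows
```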
window.overlay(destwin[, sminrow, smincol, dminrow, dmincol, dmaxrow, dmaxcol])~
Overlay the window on top of {destwin}. The windows need not be the same size,
in which case only the overlapping region is copied. This copy is
non-destructive, which means that the current background character does not
overwrite the old contents of {destwin}.
To get fine-grained control over the copied region, the second form of
overlay can be used. {sminrow} and {smincol} are the upper-left
coordinates of the source window, and the other variables mark a rectangle in
the destination window.
window.overwrite(destwin[, sminrow, smincol, dminrow, dmincol, dmaxrow, dmaxcol])~
Overwrite the window on top of {destwin}. The windows need not be the same size,
in which case only the overlapping region is copied. This copy is destructive,
which means that the current background character overwrites the old contents of
{destwin}.
To get fine-grained control over the copied region, the second form of
overwrite can be used. {sminrow} and {smincol} are the upper-left
coordinates of the source window, and the other variables mark a rectangle in the
destination window.
window.putwin(file)~
Writes all data associated with the window into the provided file object. This
information can be later retrieved using the getwin function.
window.redrawln(beg, num)~
Indicates that the {num} screen lines, starting at line {beg}, are corrupted and
should be completely redrawn on the next refresh call.
window.redrawwin()~
Touches the entire window, causing it to be completely redrawn on the next
refresh call.
window.refresh([pminrow, pmincol, sminrow, smincol, smaxrow, smaxcol])~
Update the display immediately (sync actual screen with previous
drawing/deleting methods).
The 6 optional arguments can only be specified when the window is a pad created
with newpad. The additional parameters are needed to indicate what part
of the pad and screen are involved. {pminrow} and {pmincol} specify the upper
left-hand corner of the rectangle to be displayed in the pad. {sminrow},
{smincol}, {smaxrow}, and {smaxcol} specify the edges of the rectangle to be
displayed on the screen. The lower right-hand corner of the rectangle to be
displayed in the pad is calculated from the screen coordinates, since the
rectangles must be the same size. Both rectangles must be entirely contained
within their respective structures. Negative values of {pminrow}, {pmincol},
{sminrow}, or {smincol} are treated as if they were zero.
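As an illustration of the six-argument form, the sketch below shows a region
of a pad on a screen area that starts at the top-left corner. The helper
name ``show_pad`` and its parameters are hypothetical; {smaxrow} and
{smaxcol} are the last row and column used, hence the ``- 1``.

```python
def show_pad(pad, pad_row, pad_col, screen_rows, screen_cols):
    """Show the pad region starting at (pad_row, pad_col).

    show_pad is a hypothetical helper. The region is displayed in a
    screen rectangle of screen_rows x screen_cols anchored at (0, 0).
    """
    pad.refresh(pad_row, pad_col,        # upper-left corner in the pad
                0, 0,                    # upper-left corner on the screen
                screen_rows - 1,         # last screen row used
                screen_cols - 1)         # last screen column used
```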
window.scroll([lines=1])~
Scroll the screen or scrolling region upward by {lines} lines.
window.scrollok(flag)~
Controls what happens when the cursor of a window is moved off the edge of the
window or scrolling region, either as a result of a newline action on the bottom
line, or typing the last character of the last line. If {flag} is false, the
cursor is left on the bottom line. If {flag} is true, the window is scrolled up
one line. Note that in order to get the physical scrolling effect on the
terminal, it is also necessary to call idlok.
window.setscrreg(top, bottom)~
Set the scrolling region from line {top} to line {bottom}. All scrolling actions
will take place in this region.
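Putting scrollok, idlok and setscrreg together, a scrolling area inside a
window might be set up as follows. This is a sketch; the helper name
``make_scrolling_region`` is hypothetical.

```python
def make_scrolling_region(win, top, bottom):
    """Let lines top..bottom of win scroll when output runs off the end.

    make_scrolling_region is a hypothetical helper; win is any curses
    window.
    """
    win.scrollok(True)           # scroll instead of stopping at the bottom
    win.idlok(True)              # allow hardware line insert/delete
    win.setscrreg(top, bottom)   # restrict scrolling to this line range
```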
window.standend()~
Turn off the standout attribute. On some terminals this has the side effect of
turning off all attributes.
window.standout()~
Turn on attribute {A_STANDOUT}.
window.subpad([nlines, ncols,] begin_y, begin_x)~
Return a sub-window, whose upper-left corner is at ``(begin_y, begin_x)``, and
whose width/height is {ncols}/{nlines}.
window.subwin([nlines, ncols,] begin_y, begin_x)~
Return a sub-window, whose upper-left corner is at ``(begin_y, begin_x)``, and
whose width/height is {ncols}/{nlines}.
By default, the sub-window will extend from the specified position to the lower
right corner of the window.
window.syncdown()~
Touches each location in the window that has been touched in any of its ancestor
windows. This routine is called by refresh, so it should almost never
be necessary to call it manually.
window.syncok(flag)~
If called with {flag} set to true, then syncup is called automatically
whenever there is a change in the window.
window.syncup()~
Touches all locations in ancestors of the window that have been changed in the
window.
window.timeout(delay)~
Sets blocking or non-blocking read behavior for the window. If {delay} is
negative, blocking read is used (which will wait indefinitely for input). If
{delay} is zero, then non-blocking read is used, and -1 will be returned by
getch if no input is waiting. If {delay} is positive, then
getch will block for {delay} milliseconds, and return -1 if there is
still no input at the end of that time.
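A minimal sketch of a timed read built on timeout and getch. The helper name
``poll_key`` is hypothetical; it maps the ``-1`` result to ``None`` so callers
can tell "no input" apart from a key code.

```python
def poll_key(win, delay_ms):
    """Wait up to delay_ms milliseconds for a key; None means no input.

    poll_key is a hypothetical helper; win is any curses window.
    """
    win.timeout(delay_ms)
    key = win.getch()
    # getch returns -1 when the delay expires without input.
    return None if key == -1 else key
```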
window.touchline(start, count[, changed])~
Pretend {count} lines have been changed, starting with line {start}. If
{changed} is supplied, it specifies whether the affected lines are marked as
having been changed ({changed}=1) or unchanged ({changed}=0).
window.touchwin()~
Pretend the whole window has been changed, for purposes of drawing
optimizations.
window.untouchwin()~
Marks all lines in the window as unchanged since the last call to
refresh.
window.vline([y, x,] ch, n)~
Display a vertical line starting at ``(y, x)`` with length {n} consisting of the
character {ch}.
Constants
---------
The curses (|py2stdlib-curses|) module defines the following data members:
ERR~
Some curses routines that return an integer, such as getch, return
ERR upon failure.
OK~
Some curses routines that return an integer, such as napms, return
OK upon success.
version~
A string representing the current version of the module. Also available as
__version__.
Several constants are available to specify character cell attributes:
+------------------+-------------------------------+
| Attribute | Meaning |
+==================+===============================+
| ``A_ALTCHARSET`` | Alternate character set mode. |
+------------------+-------------------------------+
| ``A_BLINK`` | Blink mode. |
+------------------+-------------------------------+
| ``A_BOLD`` | Bold mode. |
+------------------+-------------------------------+
| ``A_DIM`` | Dim mode. |
+------------------+-------------------------------+
| ``A_NORMAL`` | Normal attribute. |
+------------------+-------------------------------+
| ``A_STANDOUT`` | Standout mode. |
+------------------+-------------------------------+
| ``A_UNDERLINE`` | Underline mode. |
+------------------+-------------------------------+
Keys are referred to by integer constants with names starting with ``KEY_``.
The exact keycaps available are system dependent.
+-------------------+--------------------------------------------+
| Key constant | Key |
+===================+============================================+
| ``KEY_MIN`` | Minimum key value |
+-------------------+--------------------------------------------+
| ``KEY_BREAK`` | Break key (unreliable) |
+-------------------+--------------------------------------------+
| ``KEY_DOWN`` | Down-arrow |
+-------------------+--------------------------------------------+
| ``KEY_UP`` | Up-arrow |
+-------------------+--------------------------------------------+
| ``KEY_LEFT`` | Left-arrow |
+-------------------+--------------------------------------------+
| ``KEY_RIGHT`` | Right-arrow |
+-------------------+--------------------------------------------+
| ``KEY_HOME`` | Home key (upward+left arrow) |
+-------------------+--------------------------------------------+
| ``KEY_BACKSPACE`` | Backspace (unreliable) |
+-------------------+--------------------------------------------+
| ``KEY_F0`` | Function keys. Up to 64 function keys are |
| | supported. |
+-------------------+--------------------------------------------+
| ``KEY_Fn`` | Value of function key {n} |
+-------------------+--------------------------------------------+
| ``KEY_DL`` | Delete line |
+-------------------+--------------------------------------------+
| ``KEY_IL`` | Insert line |
+-------------------+--------------------------------------------+
| ``KEY_DC`` | Delete character |
+-------------------+--------------------------------------------+
| ``KEY_IC`` | Insert char or enter insert mode |
+-------------------+--------------------------------------------+
| ``KEY_EIC`` | Exit insert char mode |
+-------------------+--------------------------------------------+
| ``KEY_CLEAR`` | Clear screen |
+-------------------+--------------------------------------------+
| ``KEY_EOS`` | Clear to end of screen |
+-------------------+--------------------------------------------+
| ``KEY_EOL`` | Clear to end of line |
+-------------------+--------------------------------------------+
| ``KEY_SF`` | Scroll 1 line forward |
+-------------------+--------------------------------------------+
| ``KEY_SR`` | Scroll 1 line backward (reverse) |
+-------------------+--------------------------------------------+
| ``KEY_NPAGE`` | Next page |
+-------------------+--------------------------------------------+
| ``KEY_PPAGE`` | Previous page |
+-------------------+--------------------------------------------+
| ``KEY_STAB`` | Set tab |
+-------------------+--------------------------------------------+
| ``KEY_CTAB`` | Clear tab |
+-------------------+--------------------------------------------+
| ``KEY_CATAB`` | Clear all tabs |
+-------------------+--------------------------------------------+
| ``KEY_ENTER`` | Enter or send (unreliable) |
+-------------------+--------------------------------------------+
| ``KEY_SRESET`` | Soft (partial) reset (unreliable) |
+-------------------+--------------------------------------------+
| ``KEY_RESET`` | Reset or hard reset (unreliable) |
+-------------------+--------------------------------------------+
| ``KEY_PRINT`` | Print |
+-------------------+--------------------------------------------+
| ``KEY_LL`` | Home down or bottom (lower left) |
+-------------------+--------------------------------------------+
| ``KEY_A1`` | Upper left of keypad |
+-------------------+--------------------------------------------+
| ``KEY_A3`` | Upper right of keypad |
+-------------------+--------------------------------------------+
| ``KEY_B2`` | Center of keypad |
+-------------------+--------------------------------------------+
| ``KEY_C1`` | Lower left of keypad |
+-------------------+--------------------------------------------+
| ``KEY_C3`` | Lower right of keypad |
+-------------------+--------------------------------------------+
| ``KEY_BTAB`` | Back tab |
+-------------------+--------------------------------------------+
| ``KEY_BEG`` | Beg (beginning) |
+-------------------+--------------------------------------------+
| ``KEY_CANCEL`` | Cancel |
+-------------------+--------------------------------------------+
| ``KEY_CLOSE`` | Close |
+-------------------+--------------------------------------------+
| ``KEY_COMMAND`` | Cmd (command) |
+-------------------+--------------------------------------------+
| ``KEY_COPY`` | Copy |
+-------------------+--------------------------------------------+
| ``KEY_CREATE`` | Create |
+-------------------+--------------------------------------------+
| ``KEY_END`` | End |
+-------------------+--------------------------------------------+
| ``KEY_EXIT`` | Exit |
+-------------------+--------------------------------------------+
| ``KEY_FIND`` | Find |
+-------------------+--------------------------------------------+
| ``KEY_HELP`` | Help |
+-------------------+--------------------------------------------+
| ``KEY_MARK`` | Mark |
+-------------------+--------------------------------------------+
| ``KEY_MESSAGE`` | Message |
+-------------------+--------------------------------------------+
| ``KEY_MOVE`` | Move |
+-------------------+--------------------------------------------+
| ``KEY_NEXT`` | Next |
+-------------------+--------------------------------------------+
| ``KEY_OPEN`` | Open |
+-------------------+--------------------------------------------+
| ``KEY_OPTIONS`` | Options |
+-------------------+--------------------------------------------+
| ``KEY_PREVIOUS`` | Prev (previous) |
+-------------------+--------------------------------------------+
| ``KEY_REDO`` | Redo |
+-------------------+--------------------------------------------+
| ``KEY_REFERENCE`` | Ref (reference) |
+-------------------+--------------------------------------------+
| ``KEY_REFRESH`` | Refresh |
+-------------------+--------------------------------------------+
| ``KEY_REPLACE`` | Replace |
+-------------------+--------------------------------------------+
| ``KEY_RESTART`` | Restart |
+-------------------+--------------------------------------------+
| ``KEY_RESUME`` | Resume |
+-------------------+--------------------------------------------+
| ``KEY_SAVE`` | Save |
+-------------------+--------------------------------------------+
| ``KEY_SBEG`` | Shifted Beg (beginning) |
+-------------------+--------------------------------------------+
| ``KEY_SCANCEL`` | Shifted Cancel |
+-------------------+--------------------------------------------+
| ``KEY_SCOMMAND`` | Shifted Command |
+-------------------+--------------------------------------------+
| ``KEY_SCOPY`` | Shifted Copy |
+-------------------+--------------------------------------------+
| ``KEY_SCREATE`` | Shifted Create |
+-------------------+--------------------------------------------+
| ``KEY_SDC`` | Shifted Delete char |
+-------------------+--------------------------------------------+
| ``KEY_SDL`` | Shifted Delete line |
+-------------------+--------------------------------------------+
| ``KEY_SELECT`` | Select |
+-------------------+--------------------------------------------+
| ``KEY_SEND`` | Shifted End |
+-------------------+--------------------------------------------+
| ``KEY_SEOL`` | Shifted Clear line |
+-------------------+--------------------------------------------+
| ``KEY_SEXIT`` | Shifted Exit |
+-------------------+--------------------------------------------+
| ``KEY_SFIND`` | Shifted Find |
+-------------------+--------------------------------------------+
| ``KEY_SHELP`` | Shifted Help |
+-------------------+--------------------------------------------+
| ``KEY_SHOME`` | Shifted Home |
+-------------------+--------------------------------------------+
| ``KEY_SIC`` | Shifted Insert char |
+-------------------+--------------------------------------------+
| ``KEY_SLEFT`` | Shifted Left arrow |
+-------------------+--------------------------------------------+
| ``KEY_SMESSAGE`` | Shifted Message |
+-------------------+--------------------------------------------+
| ``KEY_SMOVE`` | Shifted Move |
+-------------------+--------------------------------------------+
| ``KEY_SNEXT`` | Shifted Next |
+-------------------+--------------------------------------------+
| ``KEY_SOPTIONS`` | Shifted Options |
+-------------------+--------------------------------------------+
| ``KEY_SPREVIOUS`` | Shifted Prev |
+-------------------+--------------------------------------------+
| ``KEY_SPRINT`` | Shifted Print |
+-------------------+--------------------------------------------+
| ``KEY_SREDO`` | Shifted Redo |
+-------------------+--------------------------------------------+
| ``KEY_SREPLACE`` | Shifted Replace |
+-------------------+--------------------------------------------+
| ``KEY_SRIGHT`` | Shifted Right arrow |
+-------------------+--------------------------------------------+
| ``KEY_SRSUME`` | Shifted Resume |
+-------------------+--------------------------------------------+
| ``KEY_SSAVE`` | Shifted Save |
+-------------------+--------------------------------------------+
| ``KEY_SSUSPEND`` | Shifted Suspend |
+-------------------+--------------------------------------------+
| ``KEY_SUNDO`` | Shifted Undo |
+-------------------+--------------------------------------------+
| ``KEY_SUSPEND`` | Suspend |
+-------------------+--------------------------------------------+
| ``KEY_UNDO`` | Undo |
+-------------------+--------------------------------------------+
| ``KEY_MOUSE`` | Mouse event has occurred |
+-------------------+--------------------------------------------+
| ``KEY_RESIZE`` | Terminal resize event |
+-------------------+--------------------------------------------+
| ``KEY_MAX`` | Maximum key value |
+-------------------+--------------------------------------------+
On VT100s and their software emulations, such as X terminal emulators, there are
normally at least four function keys (KEY_F1, KEY_F2,
KEY_F3, KEY_F4) available, and the arrow keys mapped to
KEY_UP, KEY_DOWN, KEY_LEFT and KEY_RIGHT in
the obvious way. If your machine has a PC keyboard, it is safe to expect arrow
keys and twelve function keys (older PC keyboards may have only ten function
keys); also, the following keypad mappings are standard:
+------------------+-----------+
| Keycap | Constant |
+==================+===========+
| Insert | KEY_IC |
+------------------+-----------+
| Delete | KEY_DC |
+------------------+-----------+
| Home | KEY_HOME |
+------------------+-----------+
| End | KEY_END |
+------------------+-----------+
| Page Up | KEY_PPAGE |
+------------------+-----------+
| Page Down | KEY_NPAGE |
+------------------+-----------+
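Key constants can be compared directly against the result of getch once
keypad has been enabled for the window. The lookup table below is only a
sketch: ``KEY_NAMES`` and ``describe_key`` are hypothetical names, and the
table covers just a few of the keys listed above.

```python
import curses

# KEY_NAMES is an illustrative table, not part of the curses module.
KEY_NAMES = {
    curses.KEY_IC: "Insert",
    curses.KEY_DC: "Delete",
    curses.KEY_HOME: "Home",
    curses.KEY_END: "End",
}

def describe_key(code):
    """Return a readable name for an integer returned by getch."""
    if code in KEY_NAMES:
        return KEY_NAMES[code]
    if 32 <= code < 127:
        return chr(code)         # printable ASCII character
    return "key %d" % code       # anything else: show the number
```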
The following table lists characters from the alternate character set. These are
inherited from the VT100 terminal, and will generally be available on software
emulations such as X terminals. When there is no graphic available, curses
falls back on a crude printable ASCII approximation.
.. note:: >
These are available only after initscr has been called.
<
+------------------+------------------------------------------+
| ACS code | Meaning |
+==================+==========================================+
| ``ACS_BBSS`` | alternate name for upper right corner |
+------------------+------------------------------------------+
| ``ACS_BLOCK`` | solid square block |
+------------------+------------------------------------------+
| ``ACS_BOARD`` | board of squares |
+------------------+------------------------------------------+
| ``ACS_BSBS`` | alternate name for horizontal line |
+------------------+------------------------------------------+
| ``ACS_BSSB`` | alternate name for upper left corner |
+------------------+------------------------------------------+
| ``ACS_BSSS`` | alternate name for top tee |
+------------------+------------------------------------------+
| ``ACS_BTEE`` | bottom tee |
+------------------+------------------------------------------+
| ``ACS_BULLET`` | bullet |
+------------------+------------------------------------------+
| ``ACS_CKBOARD`` | checker board (stipple) |
+------------------+------------------------------------------+
| ``ACS_DARROW`` | arrow pointing down |
+------------------+------------------------------------------+
| ``ACS_DEGREE`` | degree symbol |
+------------------+------------------------------------------+
| ``ACS_DIAMOND`` | diamond |
+------------------+------------------------------------------+
| ``ACS_GEQUAL`` | greater-than-or-equal-to |
+------------------+------------------------------------------+
| ``ACS_HLINE`` | horizontal line |
+------------------+------------------------------------------+
| ``ACS_LANTERN`` | lantern symbol |
+------------------+------------------------------------------+
| ``ACS_LARROW`` | left arrow |
+------------------+------------------------------------------+
| ``ACS_LEQUAL`` | less-than-or-equal-to |
+------------------+------------------------------------------+
| ``ACS_LLCORNER`` | lower left-hand corner |
+------------------+------------------------------------------+
| ``ACS_LRCORNER`` | lower right-hand corner |
+------------------+------------------------------------------+
| ``ACS_LTEE`` | left tee |
+------------------+------------------------------------------+
| ``ACS_NEQUAL`` | not-equal sign |
+------------------+------------------------------------------+
| ``ACS_PI`` | letter pi |
+------------------+------------------------------------------+
| ``ACS_PLMINUS`` | plus-or-minus sign |
+------------------+------------------------------------------+
| ``ACS_PLUS`` | big plus sign |
+------------------+------------------------------------------+
| ``ACS_RARROW`` | right arrow |
+------------------+------------------------------------------+
| ``ACS_RTEE`` | right tee |
+------------------+------------------------------------------+
| ``ACS_S1`` | scan line 1 |
+------------------+------------------------------------------+
| ``ACS_S3`` | scan line 3 |
+------------------+------------------------------------------+
| ``ACS_S7`` | scan line 7 |
+------------------+------------------------------------------+
| ``ACS_S9`` | scan line 9 |
+------------------+------------------------------------------+
| ``ACS_SBBS`` | alternate name for lower right corner |
+------------------+------------------------------------------+
| ``ACS_SBSB`` | alternate name for vertical line |
+------------------+------------------------------------------+
| ``ACS_SBSS`` | alternate name for right tee |
+------------------+------------------------------------------+
| ``ACS_SSBB`` | alternate name for lower left corner |
+------------------+------------------------------------------+
| ``ACS_SSBS`` | alternate name for bottom tee |
+------------------+------------------------------------------+
| ``ACS_SSSB`` | alternate name for left tee |
+------------------+------------------------------------------+
| ``ACS_SSSS`` | alternate name for crossover or big plus |
+------------------+------------------------------------------+
| ``ACS_STERLING`` | pound sterling |
+------------------+------------------------------------------+
| ``ACS_TTEE`` | top tee |
+------------------+------------------------------------------+
| ``ACS_UARROW`` | up arrow |
+------------------+------------------------------------------+
| ``ACS_ULCORNER`` | upper left corner |
+------------------+------------------------------------------+
| ``ACS_URCORNER`` | upper right corner |
+------------------+------------------------------------------+
| ``ACS_VLINE`` | vertical line |
+------------------+------------------------------------------+
The following table lists the predefined colors:
+-------------------+----------------------------+
| Constant | Color |
+===================+============================+
| ``COLOR_BLACK`` | Black |
+-------------------+----------------------------+
| ``COLOR_BLUE`` | Blue |
+-------------------+----------------------------+
| ``COLOR_CYAN`` | Cyan (light greenish blue) |
+-------------------+----------------------------+
| ``COLOR_GREEN`` | Green |
+-------------------+----------------------------+
| ``COLOR_MAGENTA`` | Magenta (purplish red) |
+-------------------+----------------------------+
| ``COLOR_RED`` | Red |
+-------------------+----------------------------+
| ``COLOR_WHITE`` | White |
+-------------------+----------------------------+
| ``COLOR_YELLOW`` | Yellow |
+-------------------+----------------------------+
curses.textpad (|py2stdlib-curses.textpad|) --- Text input widget for curses programs
===============================================================
==============================================================================
*py2stdlib-curses.textpad*
curses.textpad~
:synopsis: Emacs-like input editing in a curses window.
.. versionadded:: 1.6
The curses.textpad (|py2stdlib-curses.textpad|) module provides a Textbox class that handles
elementary text editing in a curses window, supporting a set of keybindings
resembling those of Emacs (thus, also of Netscape Navigator, BBEdit 6.x,
FrameMaker, and many other programs). The module also provides a
rectangle-drawing function useful for framing text boxes or for other purposes.
The module curses.textpad (|py2stdlib-curses.textpad|) defines the following function:
rectangle(win, uly, ulx, lry, lrx)~
Draw a rectangle. The first argument must be a window object; the remaining
arguments are coordinates relative to that window. The second and third
arguments are the y and x coordinates of the upper left hand corner of the
rectangle to be drawn; the fourth and fifth arguments are the y and x
coordinates of the lower right hand corner. The rectangle will be drawn using
VT100/IBM PC forms characters on terminals that make this possible (including
xterm and most other software terminal emulators). Otherwise it will be drawn
with ASCII dashes, vertical bars, and plus signs.
Textbox objects
---------------
You can instantiate a Textbox object as follows:
Textbox(win)~
Return a textbox widget object. The {win} argument should be a curses
WindowObject in which the textbox is to be contained. The edit cursor
of the textbox is initially located at the upper left hand corner of the
containing window, with coordinates ``(0, 0)``. The instance's
stripspaces flag is initially on.
Textbox objects have the following methods:
edit([validator])~
This is the entry point you will normally use. It accepts editing
keystrokes until one of the termination keystrokes is entered. If
{validator} is supplied, it must be a function. It will be called for
each keystroke entered with the keystroke as a parameter; command dispatch
is done on the result. This method returns the window contents as a
string; whether blanks in the window are included is affected by the
stripspaces member.
do_command(ch)~
Process a single command keystroke. Here are the supported special
keystrokes:
+------------------+-------------------------------------------+
| Keystroke | Action |
+==================+===========================================+
| Control-A | Go to left edge of window. |
+------------------+-------------------------------------------+
| Control-B | Cursor left, wrapping to previous line if |
| | appropriate. |
+------------------+-------------------------------------------+
| Control-D | Delete character under cursor. |
+------------------+-------------------------------------------+
| Control-E | Go to right edge (stripspaces off) or end |
| | of line (stripspaces on). |
+------------------+-------------------------------------------+
| Control-F | Cursor right, wrapping to next line when |
| | appropriate. |
+------------------+-------------------------------------------+
| Control-G | Terminate, returning the window contents. |
+------------------+-------------------------------------------+
| Control-H | Delete character backward. |
+------------------+-------------------------------------------+
| Control-J | Terminate if the window is 1 line, |
| | otherwise insert newline. |
+------------------+-------------------------------------------+
| Control-K | If line is blank, delete it, otherwise |
| | clear to end of line. |
+------------------+-------------------------------------------+
| Control-L | Refresh screen. |
+------------------+-------------------------------------------+
| Control-N | Cursor down; move down one line. |
+------------------+-------------------------------------------+
| Control-O | Insert a blank line at cursor location. |
+------------------+-------------------------------------------+
| Control-P | Cursor up; move up one line. |
+------------------+-------------------------------------------+
Move operations do nothing if the cursor is at an edge where the movement
is not possible. The following synonyms are supported where possible:
+------------------------+------------------+
| Constant | Keystroke |
+========================+==================+
| KEY_LEFT | Control-B |
+------------------------+------------------+
| KEY_RIGHT | Control-F |
+------------------------+------------------+
| KEY_UP | Control-P |
+------------------------+------------------+
| KEY_DOWN | Control-N |
+------------------------+------------------+
| KEY_BACKSPACE | Control-H |
+------------------------+------------------+
All other keystrokes are treated as a command to insert the given
character and move right (with line wrapping).
gather()~
This method returns the window contents as a string; whether blanks in the
window are included is affected by the stripspaces member.
stripspaces~
This data member is a flag which controls the interpretation of blanks in
the window. When it is on, trailing blanks on each line are ignored; any
cursor motion that would land the cursor on a trailing blank goes to the
end of that line instead, and trailing blanks are stripped when the window
contents are gathered.
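As a rough illustration of how rectangle, Textbox and a {validator} fit
together, the sketch below frames a three-line edit window and ends editing
when Enter is pressed. The names enter_terminates and prompt, the window
geometry and the prompt text are assumptions made for this example; they are
not part of the module.

```python
import curses
import curses.ascii
from curses.textpad import Textbox, rectangle


def enter_terminates(ch):
    """Validator (hypothetical name): map Enter to Control-G.

    Control-G is the keystroke that terminates Textbox.edit(), so the
    editor stops on Enter instead of inserting a newline.
    """
    if ch == curses.ascii.NL:
        return curses.ascii.BEL      # BEL is the code for Control-G
    return ch


def prompt(stdscr):
    """Draw a framed three-line text box and return what the user typed."""
    stdscr.addstr(0, 0, "Type a message, then press Enter:")
    # Frame from (1, 0) to (5, 42); the edit window sits inside the frame.
    rectangle(stdscr, 1, 0, 5, 42)
    stdscr.refresh()
    editwin = curses.newwin(3, 40, 2, 1)
    box = Textbox(editwin)
    # edit() returns the window contents; stripspaces is on by default.
    return box.edit(enter_terminates)
```

Calling ``curses.wrapper(prompt)`` from a real terminal of at least 7 rows by
44 columns should return the typed text. The validator is an ordinary function,
so it can be checked without a terminal.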
curses.wrapper (|py2stdlib-curses.wrapper|) --- Terminal handler for curses programs
==============================================================
==============================================================================
*py2stdlib-curses.wrapper*
curses.wrapper~
:synopsis: Terminal configuration wrapper for curses programs.
.. versionadded:: 1.6
This module supplies one function, wrapper, which runs another function
that should contain the rest of your curses-using application. If the application
raises an exception, wrapper will restore the terminal to a sane state
before re-raising the exception and generating a traceback.
wrapper(func, ...)~
Wrapper function that initializes curses and calls another function, {func},
restoring normal keyboard/screen behavior on error. The callable object {func}
is then passed the main window 'stdscr' as its first argument, followed by any
other arguments passed to wrapper.
Before calling the hook function, wrapper turns on cbreak mode, turns
off echo, enables the terminal keypad, and initializes colors if the terminal
has color support. On exit (whether normally or by exception) it restores
cooked mode, turns on echo, and disables the terminal keypad.
==============================================================================
*py2stdlib-cpickle*
cPickle~
:synopsis: Faster version of pickle, but not subclassable.
.. index:: module: pickle
The cPickle (|py2stdlib-cpickle|) module supports serialization and de-serialization of Python
objects, providing an interface and functionality nearly identical to the
pickle (|py2stdlib-pickle|) module. There are several differences, the most important being
performance and subclassability.
First, cPickle (|py2stdlib-cpickle|) can be up to 1000 times faster than pickle (|py2stdlib-pickle|) because
the former is implemented in C. Second, in the cPickle (|py2stdlib-cpickle|) module the
callables Pickler and Unpickler are functions, not classes.
This means that you cannot use them to derive custom pickling and unpickling
subclasses. Most applications have no need for this functionality and should
benefit from the greatly improved performance of the cPickle (|py2stdlib-cpickle|) module.
The pickle data streams produced by pickle (|py2stdlib-pickle|) and cPickle (|py2stdlib-cpickle|) are
identical, so it is possible to use pickle (|py2stdlib-pickle|) and cPickle (|py2stdlib-cpickle|)
interchangeably with existing pickles. [#]_
There are additional minor differences in API between cPickle (|py2stdlib-cpickle|) and
pickle (|py2stdlib-pickle|); for most applications, however, they are interchangeable. More
documentation is provided in the pickle (|py2stdlib-pickle|) module documentation, which
includes a list of the documented differences.
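Because the two modules are largely interchangeable, a common pattern is to
prefer cPickle and fall back to pickle when it is unavailable. The sketch below
assumes nothing beyond the dumps and loads functions; the sample record is
made up for illustration.

```python
try:
    import cPickle as pickle     # C implementation, where available
except ImportError:
    import pickle                # pure-Python module with the same interface

# Hypothetical sample data, used only to show a round trip.
record = {"name": "spam", "sizes": [1, 2, 3]}

data = pickle.dumps(record, pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(data)

print(restored == record)        # the copy compares equal to the original
```

Code written this way never subclasses Pickler or Unpickler, so it does not
depend on the one feature that cPickle lacks.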
.. rubric:: Footnotes
.. [#] Don't confuse this with the marshal (|py2stdlib-marshal|) module
.. [#] In the pickle (|py2stdlib-pickle|) module these callables are classes, which you could
subclass to customize the behavior. However, in the cPickle (|py2stdlib-cpickle|) module these
callables are factory functions and so cannot be subclassed. One common reason
to subclass is to control what objects can actually be unpickled. See section
pickle-sub for more details.
.. [#] {Warning}: this is intended for pickling multiple objects without intervening
modifications to the objects or their parts. If you modify an object and then
pickle it again using the same Pickler instance, the object is not
pickled again --- a reference to it is pickled and the Unpickler will
return the old value, not the modified one. There are two problems here: (1)
detecting changes, and (2) marshalling a minimal set of changes. Garbage
Collection may also become a problem here.
.. [#] The exception raised will likely be an ImportError or an
AttributeError but it could be something else.
.. [#] These methods can also be used to implement copying class instances.
.. [#] This protocol is also used by the shallow and deep copying operations defined in
the copy (|py2stdlib-copy|) module.
.. [#] The actual mechanism for associating these user defined functions is slightly
different for pickle (|py2stdlib-pickle|) and cPickle (|py2stdlib-cpickle|). The description given here
works the same for both implementations. Users of the pickle (|py2stdlib-pickle|) module
could also use subclassing to effect the same results, overriding the
persistent_id and persistent_load methods in the derived
classes.
.. [#] We'll leave you with the image of Guido and Jim sitting around sniffing pickles
in their living rooms.
.. [#] A word of caution: the mechanisms described here use internal attributes and
methods, which are subject to change in future versions of Python. We intend to
someday provide a common interface for controlling this behavior, which will
work in either pickle (|py2stdlib-pickle|) or cPickle (|py2stdlib-cpickle|).
.. [#] Since the pickle data format is actually a tiny stack-oriented programming
language, and some freedom is taken in the encodings of certain objects, it is
possible that the two modules produce different data streams for the same input
objects. However it is guaranteed that they will always be able to read each
other's data streams.
==============================================================================
*py2stdlib-cprofile*
cProfile~
:synopsis: Python profiler
The primary entry point for the profiler is the global function
profile.run (resp. cProfile.run). It is typically used to create
any profile information. The reports are formatted and printed using methods of
the class pstats.Stats. The following is a description of all of these
standard entry points and functions. For a more in-depth view of some of the
code, consider reading the later section on Profiler Extensions, which includes
discussion of how to derive "better" profilers from the classes presented, or
reading the source code for these modules.
run(command[, filename])~
This function takes a single argument that can be passed to the
exec statement, and an optional file name. In all cases this
routine attempts to exec its first argument, and gather profiling
statistics from the execution. If no file name is present, then this function
automatically prints a simple profiling report, sorted by the standard name
string (file/line/function-name) that is presented in each line. The
following is a typical output from such a call:: >
2706 function calls (2004 primitive calls) in 4.504 CPU seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
2 0.006 0.003 0.953 0.477 pobject.py:75(save_objects)
43/3 0.533 0.012 0.749 0.250 pobject.py:99(evaluate)
...
<
The first line indicates that 2706 calls were monitored. Of those calls, 2004
were primitive. We define primitive to mean that the call was not
induced via recursion. The next line: ``Ordered by: standard name``, indicates
that the text string in the far right column was used to sort the output. The
column headings include:
ncalls
for the number of calls,
tottime
for the total time spent in the given function (excluding time spent in calls
to sub-functions),
percall
is the quotient of ``tottime`` divided by ``ncalls``
cumtime
is the total time spent in this and all subfunctions (from invocation till
exit). This figure is accurate {even} for recursive functions.
percall
is the quotient of ``cumtime`` divided by primitive calls
filename:lineno(function)
provides the respective data of each function
When there are two numbers in the first column (for example, ``43/3``), then the
latter is the number of primitive calls, and the former is the actual number of
calls. Note that when the function does not recurse, these two values are the
same, and only the single figure is printed.
runctx(command, globals, locals[, filename])~
This function is similar to run, with added arguments to supply the
globals and locals dictionaries for the {command} string.
Analysis of the profiler data is done using the Stats class.
.. note::
The Stats class is defined in the pstats (|py2stdlib-pstats|) module.
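A minimal sketch of the workflow described above: profile a statement with
runctx, write the results to a file, and load them with pstats.Stats. The
function fib, the temporary file and the namespace dictionary are assumptions
made for this example.

```python
import cProfile
import os
import pstats
import tempfile


def fib(n):
    """Deliberately recursive, so total and primitive call counts differ."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


# Reserve a temporary file name for the profile data.
fd, stats_path = tempfile.mkstemp(suffix=".prof")
os.close(fd)

namespace = {}                       # receives names assigned by the command
try:
    # Passing a file name suppresses the printed report and saves the data.
    cProfile.runctx("result = fib(10)", {"fib": fib}, namespace, stats_path)
    stats = pstats.Stats(stats_path)
finally:
    os.remove(stats_path)

# Show the three most expensive entries by cumulative time.
stats.strip_dirs().sort_stats("cumulative").print_stats(3)
```

In the printed report the ncalls column for fib should show two numbers
separated by a slash, because every call but the first is recursive.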
==============================================================================
*py2stdlib-cstringio*
cStringIO~
:synopsis: Faster version of StringIO, but not subclassable.
The module cStringIO (|py2stdlib-cstringio|) provides an interface similar to that of the
StringIO (|py2stdlib-stringio|) module. Heavy use of StringIO.StringIO objects can be
made more efficient by using the function StringIO (|py2stdlib-stringio|) from this module
instead.
StringIO([s])~
Return a StringIO-like stream for reading or writing.
Since this is a factory function which returns objects of built-in types,
there's no way to build your own version using subclassing. It's not
possible to set attributes on it. Use the original StringIO (|py2stdlib-stringio|) module in
those cases.
Unlike the StringIO (|py2stdlib-stringio|) module, this module is not able to accept Unicode
strings that cannot be encoded as plain ASCII strings. Calling
StringIO (|py2stdlib-stringio|) with a Unicode string parameter populates the object with
the buffer representation of the Unicode string instead of encoding the
string.
Another difference from the StringIO (|py2stdlib-stringio|) module is that calling
StringIO (|py2stdlib-stringio|) with a string parameter creates a read-only object. Unlike an
object created without a string parameter, it does not have write methods.
These objects are not generally visible. They turn up in tracebacks as
StringI and StringO.
The following data objects are provided as well:
InputType~
The type object of the objects created by calling StringIO (|py2stdlib-stringio|) with a string
parameter.
OutputType~
The type object of the objects returned by calling StringIO (|py2stdlib-stringio|) with no
parameters.
There is a C API to the module as well; refer to the module source for more
information.
Example usage:: >
import cStringIO
output = cStringIO.StringIO()
output.write('First line.\n')
print >>output, 'Second line.'
# Retrieve file contents -- this will be
# 'First line.\nSecond line.\n'
contents = output.getvalue()
# Close object and discard memory buffer --
# .getvalue() will now raise an exception.
output.close()
==============================================================================
*py2stdlib-cfmfile*
cfmfile~
:platform: Mac
:synopsis: Code Fragment Resource module.
:deprecated:
cfmfile (|py2stdlib-cfmfile|) is a module that understands Code Fragments and the accompanying
"cfrg" resources. It can parse them and merge them, and is used by
BuildApplication to combine all plugin modules to a single executable.
.. deprecated:: 2.4
==============================================================================
*py2stdlib-datetime*
datetime~
:synopsis: Basic date and time types.
.. XXX what order should the types be discussed in?
.. versionadded:: 2.3
The datetime (|py2stdlib-datetime|) module supplies classes for manipulating dates and times in
both simple and complex ways. While date and time arithmetic is supported, the
focus of the implementation is on efficient member extraction for output
formatting and manipulation. For related
functionality, see also the time (|py2stdlib-time|) and calendar (|py2stdlib-calendar|) modules.
There are two kinds of date and time objects: "naive" and "aware". This
distinction refers to whether the object has any notion of time zone, daylight
saving time, or other kind of algorithmic or political time adjustment. Whether
a naive datetime (|py2stdlib-datetime|) object represents Coordinated Universal Time (UTC),
local time, or time in some other timezone is purely up to the program, just
like it's up to the program whether a particular number represents metres,
miles, or mass. Naive datetime (|py2stdlib-datetime|) objects are easy to understand and to
work with, at the cost of ignoring some aspects of reality.
For applications requiring more, datetime (|py2stdlib-datetime|) and time (|py2stdlib-time|) objects
have an optional time zone information member, tzinfo, that can contain
an instance of a subclass of the abstract tzinfo class. These
tzinfo objects capture information about the offset from UTC time, the
time zone name, and whether Daylight Saving Time is in effect. Note that no
concrete tzinfo classes are supplied by the datetime (|py2stdlib-datetime|) module.
Supporting timezones at whatever level of detail is required is up to the
application. The rules for time adjustment across the world are more political
than rational, and there is no standard suitable for every application.
The datetime (|py2stdlib-datetime|) module exports the following constants:
MINYEAR~
The smallest year number allowed in a date or datetime (|py2stdlib-datetime|) object.
MINYEAR is ``1``.
MAXYEAR~
The largest year number allowed in a date or datetime (|py2stdlib-datetime|) object.
MAXYEAR is ``9999``.
.. seealso::
Module calendar (|py2stdlib-calendar|)
General calendar related functions.
Module time (|py2stdlib-time|)
Time access and conversions.
Available Types
---------------
date~
An idealized naive date, assuming the current Gregorian calendar always was, and
always will be, in effect. Attributes: year, month, and
day.
time~
An idealized time, independent of any particular day, assuming that every day
has exactly 24 * 60 * 60 seconds (there is no notion of "leap seconds" here).
Attributes: hour, minute, second, microsecond,
and tzinfo.
datetime~
A combination of a date and a time. Attributes: year, month,
day, hour, minute, second, microsecond,
and tzinfo.
timedelta~
A duration expressing the difference between two date, time (|py2stdlib-time|),
or datetime (|py2stdlib-datetime|) instances to microsecond resolution.
tzinfo~
An abstract base class for time zone information objects. These are used by the
datetime (|py2stdlib-datetime|) and time (|py2stdlib-time|) classes to provide a customizable notion of
time adjustment (for example, to account for time zone and/or daylight saving
time).
Objects of these types are immutable.
Objects of the date type are always naive.
An object {d} of type time (|py2stdlib-time|) or datetime (|py2stdlib-datetime|) may be naive or aware.
{d} is aware if ``d.tzinfo`` is not ``None`` and ``d.tzinfo.utcoffset(d)`` does
not return ``None``. If ``d.tzinfo`` is ``None``, or if ``d.tzinfo`` is not
``None`` but ``d.tzinfo.utcoffset(d)`` returns ``None``, {d} is naive.
The distinction between naive and aware doesn't apply to timedelta
objects.
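The naive/aware rule above can be written out as a small helper. Since the
module supplies no concrete tzinfo classes, the sketch defines a hypothetical
FixedOffset class; that class and the is_aware helper are assumptions made for
this example.

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """Hypothetical time zone with a constant offset from UTC and no DST."""

    def __init__(self, minutes, name):
        self._offset = timedelta(minutes=minutes)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def tzname(self, dt):
        return self._name

    def dst(self, dt):
        return timedelta(0)


def is_aware(d):
    """Apply the rule from the text to a datetime object d."""
    return d.tzinfo is not None and d.tzinfo.utcoffset(d) is not None


naive = datetime(2010, 9, 1, 12, 0)
aware = datetime(2010, 9, 1, 12, 0, tzinfo=FixedOffset(60, "+01:00"))

print(is_aware(naive))
print(is_aware(aware))
```

A tzinfo subclass whose utcoffset returns ``None`` would still leave the object
naive, which is why the helper checks both conditions.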
Subclass relationships:: >
object
timedelta
tzinfo
time
date
datetime
<
timedelta Objects
-----------------
A timedelta object represents a duration, the difference between two
dates or times.
timedelta([days[, seconds[, microseconds[, milliseconds[, minutes[, hours[, weeks]]]]]]])~
All arguments are optional and default to ``0``. Arguments may be ints, longs,
or floats, and may be positive or negative.
Only {days}, {seconds} and {microseconds} are stored internally. Arguments are
converted to those units:
* A millisecond is converted to 1000 microseconds.
* A minute is converted to 60 seconds.
* An hour is converted to 3600 seconds.
* A week is converted to 7 days.
and days, seconds and microseconds are then normalized so that the
representation is unique, with
* ``0 <= microseconds < 1000000``
* ``0 <= seconds < 3600*24`` (the number of seconds in one day)
* ``-999999999 <= days <= 999999999``
If any argument is a float and there are fractional microseconds, the fractional
microseconds left over from all arguments are combined and their sum is rounded
to the nearest microsecond. If no argument is a float, the conversion and
normalization processes are exact (no information is lost).
If the normalized value of days lies outside the indicated range,
OverflowError is raised.
Note that normalization of negative values may be surprising at first. For
example,
>>> from datetime import timedelta
>>> d = timedelta(microseconds=-1)
>>> (d.days, d.seconds, d.microseconds)
(-1, 86399, 999999)
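The unit conversion and normalization rules can be checked directly. The
values below are arbitrary and chosen only for illustration.

```python
from datetime import timedelta

# Weeks become days, hours and minutes become seconds,
# and milliseconds become microseconds.
d = timedelta(weeks=1, hours=1, minutes=1, milliseconds=1)
fields = (d.days, d.seconds, d.microseconds)
print(fields)                        # (7, 3660, 1000)

# Seconds beyond one day carry over into days.
carried = timedelta(seconds=24 * 3600 + 1)
print((carried.days, carried.seconds))   # (1, 1)

# A normalized days value outside the allowed range is rejected.
try:
    timedelta(days=1000000000)
    out_of_range = False
except OverflowError:
    out_of_range = True
print(out_of_range)
```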
Class attributes are:
timedelta.min~
The most negative timedelta object, ``timedelta(-999999999)``.
timedelta.max~
The most positive timedelta object, ``timedelta(days=999999999,
hours=23, minutes=59, seconds=59, microseconds=999999)``.
timedelta.resolution~
The smallest possible difference between non-equal timedelta objects,
``timedelta(microseconds=1)``.
Note that, because of normalization, ``timedelta.max`` > ``-timedelta.min``.
``-timedelta.max`` is not representable as a timedelta object.
Instance attributes (read-only):
+------------------+--------------------------------------------+
| Attribute | Value |
+==================+============================================+
| ``days`` | Between -999999999 and 999999999 inclusive |
+------------------+--------------------------------------------+
| ``seconds`` | Between 0 and 86399 inclusive |
+------------------+--------------------------------------------+
| ``microseconds`` | Between 0 and 999999 inclusive |
+------------------+--------------------------------------------+
Supported operations:
.. XXX this table is too wide!
+--------------------------------+-----------------------------------------------+
| Operation | Result |
+================================+===============================================+
| ``t1 = t2 + t3`` | Sum of {t2} and {t3}. Afterwards {t1}-{t2} == |
| | {t3} and {t1}-{t3} == {t2} are true. (1) |
+--------------------------------+-----------------------------------------------+
| ``t1 = t2 - t3`` | Difference of {t2} and {t3}. Afterwards {t1} |
| | == {t2} - {t3} and {t2} == {t1} + {t3} are |
| | true. (1) |
+--------------------------------+-----------------------------------------------+
| ``t1 = t2 * i or t1 = i * t2`` | Delta multiplied by an integer or long. |
| | Afterwards {t1} // i == {t2} is true, |
| | provided ``i != 0``. |
+--------------------------------+-----------------------------------------------+
| | In general, {t1} * i == {t1} * (i-1) + {t1} |
| | is true. (1) |
+--------------------------------+-----------------------------------------------+
| ``t1 = t2 // i`` | The floor is computed and the remainder (if |
| | any) is thrown away. (3) |
+--------------------------------+-----------------------------------------------+
| ``+t1`` | Returns a timedelta object with the |
| | same value. (2) |
+--------------------------------+-----------------------------------------------+
| ``-t1`` | equivalent to timedelta(-{t1.days}, |
| | -{t1.seconds}, -{t1.microseconds}), and to |
| | {t1} * -1. (1)(4) |
+--------------------------------+-----------------------------------------------+
| ``abs(t)`` | equivalent to +{t} when ``t.days >= 0``, and |
| | to -{t} when ``t.days < 0``. (2) |
+--------------------------------+-----------------------------------------------+
Notes:
(1)
This is exact, but may overflow.
(2)
This is exact, and cannot overflow.
(3)
Division by 0 raises ZeroDivisionError.
(4)
-{timedelta.max} is not representable as a timedelta object.
In addition to the operations listed above timedelta objects support
certain additions and subtractions with date and datetime (|py2stdlib-datetime|)
objects (see below).
Comparisons of timedelta objects are supported with the
timedelta object representing the smaller duration considered to be the
smaller timedelta. In order to stop mixed-type comparisons from falling back to
the default comparison by object address, when a timedelta object is
compared to an object of a different type, TypeError is raised unless the
comparison is ``==`` or ``!=``. The latter cases return False or
True, respectively.
timedelta objects are hashable (usable as dictionary keys) and support
efficient pickling. In Boolean contexts, a timedelta object is
considered to be true if and only if it isn't equal to ``timedelta(0)``.
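The comparison, hashing and truth-value rules can be exercised as follows.
The variable names are made up for this sketch.

```python
from datetime import timedelta

short = timedelta(minutes=5)
longer = timedelta(hours=1)

print(short < longer)                # ordering follows the duration

# Mixed-type equality is allowed and is simply False ...
equal_to_int = (short == 300)

# ... but mixed-type ordering raises TypeError.
try:
    short < 300
    ordering_allowed = True
except TypeError:
    ordering_allowed = False

# Equal durations hash alike, so either one can look up the same key.
labels = {timedelta(seconds=300): "five minutes"}
label = labels[short]

# Only a zero duration is false in a Boolean context.
truth = (bool(timedelta(0)), bool(short))
print(equal_to_int, ordering_allowed, label, truth)
```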
Instance methods:
timedelta.total_seconds()~
Return the total number of seconds contained in the duration.
Equivalent to ``(td.microseconds + (td.seconds + td.days * 24 *
3600) * 10**6) / 10**6`` computed with true division enabled.
Note that for very large time intervals (greater than 270 years on
most platforms) this method will lose microsecond accuracy.
.. versionadded:: 2.7
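The formula can be compared with the method itself. The helper name
seconds_by_formula is an assumption for this sketch, and it divides by the
float ``10.0**6`` so that true division is used without a ``__future__``
import.

```python
from datetime import timedelta


def seconds_by_formula(td):
    """Spell out the documented formula, forcing true division."""
    return (td.microseconds
            + (td.seconds + td.days * 24 * 3600) * 10**6) / 10.0**6


td = timedelta(days=1, seconds=30, microseconds=500000)
print(seconds_by_formula(td))        # 86430.5
print(td.total_seconds())            # 86430.5

# Negative durations work too, despite the normalized representation.
tiny = timedelta(microseconds=-1)
print(seconds_by_formula(tiny) == tiny.total_seconds())
```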
Example usage:
>>> from datetime import timedelta
>>> year = timedelta(days=365)
>>> another_year = timedelta(weeks=40, days=84, hours=23,
... minutes=50, seconds=600) # adds up to 365 days
>>> year.total_seconds()
31536000.0
>>> year == another_year
True
>>> ten_years = 10 * year
>>> ten_years, ten_years.days // 365
(datetime.timedelta(3650), 10)
>>> nine_years = ten_years - year
>>> nine_years, nine_years.days // 365
(datetime.timedelta(3285), 9)
>>> three_years = nine_years // 3
>>> three_years, three_years.days // 365
(datetime.timedelta(1095), 3)
>>> abs(three_years - ten_years) == 2 * three_years + year
True
date Objects
---------------------
A date object represents a date (year, month and day) in an idealized
calendar, the current Gregorian calendar indefinitely extended in both
directions. January 1 of year 1 is called day number 1, January 2 of year 1 is
called day number 2, and so on. This matches the definition of the "proleptic
Gregorian" calendar in Dershowitz and Reingold's book Calendrical Calculations,
where it's the base calendar for all computations. See the book for algorithms
for converting between proleptic Gregorian ordinals and many other calendar
systems.
date(year, month, day)~
All arguments are required. Arguments may be ints or longs, in the following
ranges:
* ``MINYEAR <= year <= MAXYEAR``
* ``1 <= month <= 12``
* ``1 <= day <= number of days in the given month and year``
If an argument outside those ranges is given, ValueError is raised.
Other constructors, all class methods:
.. classmethod:: date.today()
Return the current local date. This is equivalent to
``date.fromtimestamp(time.time())``.
.. classmethod:: date.fromtimestamp(timestamp)
Return the local date corresponding to the POSIX timestamp, such as is returned
by time.time. This may raise ValueError, if the timestamp is out
of the range of values supported by the platform C localtime function.
It's common for this to be restricted to years from 1970 through 2038. Note
that on non-POSIX systems that include leap seconds in their notion of a
timestamp, leap seconds are ignored by fromtimestamp.
.. classmethod:: date.fromordinal(ordinal)
Return the date corresponding to the proleptic Gregorian ordinal, where January
1 of year 1 has ordinal 1. ValueError is raised unless ``1 <= ordinal <=
date.max.toordinal()``. For any date {d}, ``date.fromordinal(d.toordinal()) ==
d``.
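A short sketch of the constructors and their range checks. The dates are
arbitrary examples.

```python
from datetime import date

d = date(2002, 12, 31)

# Converting to an ordinal and back returns an equal date.
round_trip = date.fromordinal(d.toordinal())
print(round_trip == d)

# Ordinal 1 is January 1 of year 1, the earliest representable date.
first = date.fromordinal(1)
print(first == date.min)

# A day that does not exist in the given month and year is rejected.
try:
    date(2002, 2, 30)
    accepted = True
except ValueError:
    accepted = False
print(accepted)

# Ordinals outside 1 .. date.max.toordinal() are rejected as well.
try:
    date.fromordinal(0)
    ordinal_ok = True
except ValueError:
    ordinal_ok = False
print(ordinal_ok)
```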
Class attributes:
date.min~
The earliest representable date, ``date(MINYEAR, 1, 1)``.
date.max~
The latest representable date, ``date(MAXYEAR, 12, 31)``.
date.resolution~
The smallest possible difference between non-equal date objects,
``timedelta(days=1)``.
Instance attributes (read-only):
date.year~
Between MINYEAR and MAXYEAR inclusive.
date.month~
Between 1 and 12 inclusive.
date.day~
Between 1 and the number of days in the given month of the given year.
Supported operations:
+-------------------------------+----------------------------------------------+
| Operation | Result |
+===============================+==============================================+
| ``date2 = date1 + timedelta`` | {date2} is ``timedelta.days`` days removed |
| | from {date1}. (1) |
+-------------------------------+----------------------------------------------+
| ``date2 = date1 - timedelta`` | Computes {date2} such that ``date2 + |
| | timedelta == date1``. (2) |
+-------------------------------+----------------------------------------------+
| ``timedelta = date1 - date2`` | \(3) |
+-------------------------------+----------------------------------------------+
| ``date1 < date2`` | {date1} is considered less than {date2} when |
| | {date1} precedes {date2} in time. (4) |
+-------------------------------+----------------------------------------------+
Notes:
(1)
{date2} is moved forward in time if ``timedelta.days > 0``, or backward if
``timedelta.days < 0``. Afterward ``date2 - date1 == timedelta.days``.
``timedelta.seconds`` and ``timedelta.microseconds`` are ignored.
OverflowError is raised if ``date2.year`` would be smaller than
MINYEAR or larger than MAXYEAR.
(2)
This isn't quite equivalent to date1 + (-timedelta), because -timedelta in
isolation can overflow in cases where date1 - timedelta does not.
``timedelta.seconds`` and ``timedelta.microseconds`` are ignored.
(3)
This is exact, and cannot overflow. timedelta.seconds and
timedelta.microseconds are 0, and date2 + timedelta == date1 after.
(4)
In other words, ``date1 < date2`` if and only if ``date1.toordinal() <
date2.toordinal()``. In order to stop comparison from falling back to the
default scheme of comparing object addresses, date comparison normally raises
TypeError if the other comparand isn't also a date object.
However, ``NotImplemented`` is returned instead if the other comparand has a
timetuple attribute. This hook gives other kinds of date objects a
chance at implementing mixed-type comparison. If not, when a date
object is compared to an object of a different type, TypeError is raised
unless the comparison is ``==`` or ``!=``. The latter cases return
False or True, respectively.
Dates can be used as dictionary keys. In Boolean contexts, all date
objects are considered to be true.
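The arithmetic rules in notes (1) to (4) can be illustrated with a short sketch. The variable names are illustrative and not part of the module; the code should behave the same on Python 2.7 and Python 3.

```python
from datetime import date, timedelta

d1 = date(2002, 12, 31)

# Only the days component of the timedelta is used; the 23 hours are ignored.
d2 = d1 + timedelta(days=1, hours=23)

# date - date is exact and always yields a whole number of days.
delta = d2 - d1

# date - timedelta undoes the addition.
back = d2 - timedelta(days=1)

# Ordering follows the proleptic Gregorian ordinals.
earlier = d1 < d2

# Moving past MAXYEAR raises OverflowError.
try:
    date.max + timedelta(days=1)
    overflowed = False
except OverflowError:
    overflowed = True
```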
Instance methods:
date.replace(year, month, day)~
Return a date with the same value, except for those members given new values by
whichever keyword arguments are specified. For example, if ``d == date(2002,
12, 31)``, then ``d.replace(day=26) == date(2002, 12, 26)``.
date.timetuple()~
Return a time.struct_time such as returned by time.localtime.
The hours, minutes and seconds are 0, and the DST flag is -1. ``d.timetuple()``
is equivalent to ``time.struct_time((d.year, d.month, d.day, 0, 0, 0,
d.weekday(), yday, -1))``, where ``yday = d.toordinal() - date(d.year, 1,
1).toordinal() + 1`` is the day number within the current year starting with
``1`` for January 1st.
date.toordinal()~
Return the proleptic Gregorian ordinal of the date, where January 1 of year 1
has ordinal 1. For any date object {d},
``date.fromordinal(d.toordinal()) == d``.
date.weekday()~
Return the day of the week as an integer, where Monday is 0 and Sunday is 6.
For example, ``date(2002, 12, 4).weekday() == 2``, a Wednesday. See also
isoweekday.
date.isoweekday()~
Return the day of the week as an integer, where Monday is 1 and Sunday is 7.
For example, ``date(2002, 12, 4).isoweekday() == 3``, a Wednesday. See also
weekday, isocalendar.
date.isocalendar()~
Return a 3-tuple, (ISO year, ISO week number, ISO weekday).
The ISO calendar is a widely used variant of the Gregorian calendar. See
http://www.phys.uu.nl/~vgent/calendar/isocalendar.htm for a good
explanation.
The ISO year consists of 52 or 53 full weeks, where a week starts on a
Monday and ends on a Sunday. The first week of an ISO year is the first
(Gregorian) calendar week of a year containing a Thursday. This is called week
number 1, and the ISO year of that Thursday is the same as its Gregorian year.
For example, 2004 begins on a Thursday, so the first week of ISO year 2004
begins on Monday, 29 Dec 2003 and ends on Sunday, 4 Jan 2004, so that
``date(2003, 12, 29).isocalendar() == (2004, 1, 1)`` and ``date(2004, 1,
4).isocalendar() == (2004, 1, 7)``.
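The week boundaries described above can be verified with a small sketch. The variable names are illustrative; the results are wrapped in ``tuple()`` in the checks because newer Python 3 releases return a named tuple from isocalendar.

```python
from datetime import date

# 2004 begins on a Thursday, so ISO week 1 of 2004 starts on Monday, 29 Dec 2003.
start = date(2003, 12, 29).isocalendar()
end = date(2004, 1, 4).isocalendar()

# The Sunday before that still belongs to the last ISO week of 2003.
previous = date(2003, 12, 28).isocalendar()

# The result unpacks into ISO year, ISO week number, and ISO weekday.
iso_year, iso_week, iso_weekday = start
```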
date.isoformat()~
Return a string representing the date in ISO 8601 format, 'YYYY-MM-DD'. For
example, ``date(2002, 12, 4).isoformat() == '2002-12-04'``.
date.__str__()~
For a date {d}, ``str(d)`` is equivalent to ``d.isoformat()``.
date.ctime()~
Return a string representing the date, for example ``date(2002, 12,
4).ctime() == 'Wed Dec 4 00:00:00 2002'``. ``d.ctime()`` is equivalent to
``time.ctime(time.mktime(d.timetuple()))`` on platforms where the native C
ctime function (which time.ctime invokes, but which
date.ctime does not invoke) conforms to the C standard.
date.strftime(format)~
Return a string representing the date, controlled by an explicit format string.
Format codes referring to hours, minutes or seconds will see 0 values. See
section strftime-strptime-behavior.
Example of counting days to an event:: >
>>> import time
>>> from datetime import date
>>> today = date.today()
>>> today
datetime.date(2007, 12, 5)
>>> today == date.fromtimestamp(time.time())
True
>>> my_birthday = date(today.year, 6, 24)
>>> if my_birthday < today:
...     my_birthday = my_birthday.replace(year=today.year + 1)
>>> my_birthday
datetime.date(2008, 6, 24)
>>> time_to_birthday = abs(my_birthday - today)
>>> time_to_birthday.days
202
<
Example of working with date:
>>> from datetime import date
>>> d = date.fromordinal(730920) # 730920th day after 1. 1. 0001
>>> d
datetime.date(2002, 3, 11)
>>> t = d.timetuple()
>>> for i in t: # doctest: +SKIP
...     print i
2002 # year
3 # month
11 # day
0
0
0
0 # weekday (0 = Monday)
70 # 70th day in the year
-1
>>> ic = d.isocalendar()
>>> for i in ic: # doctest: +SKIP
...     print i
2002 # ISO year
11 # ISO week number
1 # ISO day number ( 1 = Monday )
>>> d.isoformat()
'2002-03-11'
>>> d.strftime("%d/%m/%y")
'11/03/02'
>>> d.strftime("%A %d. %B %Y")
'Monday 11. March 2002'
datetime (|py2stdlib-datetime|) Objects
-------------------------
A datetime (|py2stdlib-datetime|) object is a single object containing all the information
from a date object and a time (|py2stdlib-time|) object. Like a date
object, datetime (|py2stdlib-datetime|) assumes the current Gregorian calendar extended in
both directions; like a time object, datetime (|py2stdlib-datetime|) assumes there are exactly
3600*24 seconds in every day.
Constructor:
datetime(year, month, day[, hour[, minute[, second[, microsecond[, tzinfo]]]]])~
The year, month and day arguments are required. {tzinfo} may be ``None``, or an
instance of a tzinfo subclass. The remaining arguments may be ints or
longs, in the following ranges:
* ``MINYEAR <= year <= MAXYEAR``
* ``1 <= month <= 12``
* ``1 <= day <= number of days in the given month and year``
* ``0 <= hour < 24``
* ``0 <= minute < 60``
* ``0 <= second < 60``
* ``0 <= microsecond < 1000000``
If an argument outside those ranges is given, ValueError is raised.
Other constructors, all class methods:
.. classmethod:: datetime.today()
Return the current local datetime, with tzinfo ``None``. This is
equivalent to ``datetime.fromtimestamp(time.time())``. See also now,
fromtimestamp.
.. classmethod:: datetime.now([tz])
Return the current local date and time. If optional argument {tz} is ``None``
or not specified, this is like today, but, if possible, supplies more
precision than can be gotten from going through a time.time timestamp
(for example, this may be possible on platforms supplying the C
gettimeofday function).
Else {tz} must be an instance of a tzinfo subclass, and the
current date and time are converted to {tz}'s time zone. In this case the
result is equivalent to ``tz.fromutc(datetime.utcnow().replace(tzinfo=tz))``.
See also today, utcnow.
.. classmethod:: datetime.utcnow()
Return the current UTC date and time, with tzinfo ``None``. This is like
now, but returns the current UTC date and time, as a naive
datetime (|py2stdlib-datetime|) object. See also now.
.. classmethod:: datetime.fromtimestamp(timestamp[, tz])
Return the local date and time corresponding to the POSIX timestamp, such as is
returned by time.time. If optional argument {tz} is ``None`` or not
specified, the timestamp is converted to the platform's local date and time, and
the returned datetime (|py2stdlib-datetime|) object is naive.
Else {tz} must be an instance of a tzinfo subclass, and the
timestamp is converted to {tz}'s time zone. In this case the result is
equivalent to
``tz.fromutc(datetime.utcfromtimestamp(timestamp).replace(tzinfo=tz))``.
fromtimestamp may raise ValueError, if the timestamp is out of
the range of values supported by the platform C localtime or
gmtime functions. It's common for this to be restricted to years in
1970 through 2038. Note that on non-POSIX systems that include leap seconds in
their notion of a timestamp, leap seconds are ignored by fromtimestamp,
and then it's possible to have two timestamps differing by a second that yield
identical datetime (|py2stdlib-datetime|) objects. See also utcfromtimestamp.
.. classmethod:: datetime.utcfromtimestamp(timestamp)
Return the UTC datetime (|py2stdlib-datetime|) corresponding to the POSIX timestamp, with
tzinfo ``None``. This may raise ValueError, if the timestamp is
out of the range of values supported by the platform C gmtime function.
It's common for this to be restricted to years in 1970 through 2038. See also
fromtimestamp.
.. classmethod:: datetime.fromordinal(ordinal)
Return the datetime (|py2stdlib-datetime|) corresponding to the proleptic Gregorian ordinal,
where January 1 of year 1 has ordinal 1. ValueError is raised unless ``1
<= ordinal <= datetime.max.toordinal()``. The hour, minute, second and
microsecond of the result are all 0, and tzinfo is ``None``.
.. classmethod:: datetime.combine(date, time)
Return a new datetime (|py2stdlib-datetime|) object whose date members are equal to the given
date object's, and whose time and tzinfo members are equal to
the given time (|py2stdlib-time|) object's. For any datetime (|py2stdlib-datetime|) object {d}, ``d ==
datetime.combine(d.date(), d.timetz())``. If date is a datetime (|py2stdlib-datetime|)
object, its time and tzinfo members are ignored.
.. classmethod:: datetime.strptime(date_string, format)
Return a datetime (|py2stdlib-datetime|) corresponding to {date_string}, parsed according to
{format}. This is equivalent to ``datetime(*(time.strptime(date_string,
format)[0:6]))``. ValueError is raised if the date_string and format
can't be parsed by time.strptime or if it returns a value which isn't a
time tuple. See section strftime-strptime-behavior.
.. versionadded:: 2.5
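The stated equivalence between datetime.strptime and time.strptime can be checked as follows. This is a sketch under the assumption that the format carries no fractional seconds; the names ``date_string`` and ``fmt`` are illustrative.

```python
import time
from datetime import datetime

date_string = "21/11/06 16:30"
fmt = "%d/%m/%y %H:%M"

parsed = datetime.strptime(date_string, fmt)

# The documented equivalent: the first six fields of the parsed time tuple.
equivalent = datetime(*(time.strptime(date_string, fmt)[0:6]))

# Input that does not match the format raises ValueError.
try:
    datetime.strptime("not a date", fmt)
    failed = False
except ValueError:
    failed = True
```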
Class attributes:
datetime.min~
The earliest representable datetime (|py2stdlib-datetime|), ``datetime(MINYEAR, 1, 1,
tzinfo=None)``.
datetime.max~
The latest representable datetime (|py2stdlib-datetime|), ``datetime(MAXYEAR, 12, 31, 23, 59,
59, 999999, tzinfo=None)``.
datetime.resolution~
The smallest possible difference between non-equal datetime (|py2stdlib-datetime|) objects,
``timedelta(microseconds=1)``.
Instance attributes (read-only):
datetime.year~
Between MINYEAR and MAXYEAR inclusive.
datetime.month~
Between 1 and 12 inclusive.
datetime.day~
Between 1 and the number of days in the given month of the given year.
datetime.hour~
In ``range(24)``.
datetime.minute~
In ``range(60)``.
datetime.second~
In ``range(60)``.
datetime.microsecond~
In ``range(1000000)``.
datetime.tzinfo~
The object passed as the {tzinfo} argument to the datetime (|py2stdlib-datetime|) constructor,
or ``None`` if none was passed.
Supported operations:
+---------------------------------------+-------------------------------+
| Operation                             | Result                        |
+=======================================+===============================+
| ``datetime2 = datetime1 + timedelta`` | (1)                           |
+---------------------------------------+-------------------------------+
| ``datetime2 = datetime1 - timedelta`` | (2)                           |
+---------------------------------------+-------------------------------+
| ``timedelta = datetime1 - datetime2`` | (3)                           |
+---------------------------------------+-------------------------------+
| ``datetime1 < datetime2``             | Compares datetime to          |
|                                       | datetime. (4)                 |
+---------------------------------------+-------------------------------+
(1)
datetime2 is a duration of timedelta removed from datetime1, moving forward in
time if ``timedelta.days`` > 0, or backward if ``timedelta.days`` < 0. The
result has the same tzinfo member as the input datetime, and datetime2 -
datetime1 == timedelta after. OverflowError is raised if datetime2.year
would be smaller than MINYEAR or larger than MAXYEAR. Note
that no time zone adjustments are done even if the input is an aware object.
(2)
Computes the datetime2 such that datetime2 + timedelta == datetime1. As for
addition, the result has the same tzinfo member as the input datetime,
and no time zone adjustments are done even if the input is aware. This isn't
quite equivalent to datetime1 + (-timedelta), because -timedelta in isolation
can overflow in cases where datetime1 - timedelta does not.
(3)
Subtraction of a datetime (|py2stdlib-datetime|) from a datetime (|py2stdlib-datetime|) is defined only if
both operands are naive, or if both are aware. If one is aware and the other is
naive, TypeError is raised.
If both are naive, or both are aware and have the same tzinfo member,
the tzinfo members are ignored, and the result is a timedelta
object {t} such that ``datetime2 + t == datetime1``. No time zone adjustments
are done in this case.
If both are aware and have different tzinfo members, ``a-b`` acts as if
{a} and {b} were first converted to naive UTC datetimes. The result is
``(a.replace(tzinfo=None) - a.utcoffset()) - (b.replace(tzinfo=None) -
b.utcoffset())`` except that the implementation never overflows.
(4)
{datetime1} is considered less than {datetime2} when {datetime1} precedes
{datetime2} in time.
If one comparand is naive and the other is aware, TypeError is raised.
If both comparands are aware, and have the same tzinfo member, the
common tzinfo member is ignored and the base datetimes are compared. If
both comparands are aware and have different tzinfo members, the
comparands are first adjusted by subtracting their UTC offsets (obtained from
``self.utcoffset()``).
.. note:: >
In order to stop comparison from falling back to the default scheme of comparing
object addresses, datetime comparison normally raises TypeError if the
other comparand isn't also a datetime (|py2stdlib-datetime|) object. However,
``NotImplemented`` is returned instead if the other comparand has a
timetuple attribute. This hook gives other kinds of date objects a
chance at implementing mixed-type comparison. If not, when a datetime (|py2stdlib-datetime|)
object is compared to an object of a different type, TypeError is raised
unless the comparison is ``==`` or ``!=``. The latter cases return
False or True, respectively.
<
datetime (|py2stdlib-datetime|) objects can be used as dictionary keys. In Boolean contexts,
all datetime (|py2stdlib-datetime|) objects are considered to be true.
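The rules for subtracting and comparing aware datetimes can be sketched with a fixed-offset zone. ``FixedOffset`` is a hypothetical helper written for this illustration; the datetime module in this version does not supply it.

```python
from datetime import datetime, timedelta, tzinfo

class FixedOffset(tzinfo):
    """Hypothetical fixed-offset zone, defined only for this example."""
    def __init__(self, minutes=0):
        self._offset = timedelta(minutes=minutes)
    def utcoffset(self, dt):
        return self._offset
    def dst(self, dt):
        return timedelta(0)
    def tzname(self, dt):
        return "fixed"

a = datetime(2006, 6, 14, 13, 0, tzinfo=FixedOffset(60))    # 12:00 UTC
b = datetime(2006, 6, 14, 7, 30, tzinfo=FixedOffset(-300))  # 12:30 UTC

# Different tzinfo members: both operands are adjusted to UTC first.
diff = a - b
by_formula = ((a.replace(tzinfo=None) - a.utcoffset())
              - (b.replace(tzinfo=None) - b.utcoffset()))
a_is_earlier = a < b

# Mixing a naive and an aware datetime in subtraction raises TypeError.
naive = datetime(2006, 6, 14, 13, 0)
try:
    a - naive
    mixed_ok = True
except TypeError:
    mixed_ok = False
```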
Instance methods:
datetime.date()~
Return date object with same year, month and day.
datetime.time()~
Return time (|py2stdlib-time|) object with same hour, minute, second and microsecond.
tzinfo is ``None``. See also method timetz.
datetime.timetz()~
Return time (|py2stdlib-time|) object with same hour, minute, second, microsecond, and
tzinfo members. See also method time (|py2stdlib-time|).
datetime.replace([year[, month[, day[, hour[, minute[, second[, microsecond[, tzinfo]]]]]]]])~
Return a datetime with the same members, except for those members given new
values by whichever keyword arguments are specified. Note that ``tzinfo=None``
can be specified to create a naive datetime from an aware datetime with no
conversion of date and time members.
datetime.astimezone(tz)~
Return a datetime (|py2stdlib-datetime|) object with new tzinfo member {tz}, adjusting
the date and time members so the result is the same UTC time as {self}, but in
{tz}'s local time.
{tz} must be an instance of a tzinfo subclass, and its
utcoffset and dst methods must not return ``None``. {self} must
be aware (``self.tzinfo`` must not be ``None``, and ``self.utcoffset()`` must
not return ``None``).
If ``self.tzinfo`` is {tz}, ``self.astimezone(tz)`` is equal to {self}: no
adjustment of date or time members is performed. Else the result is local time
in time zone {tz}, representing the same UTC time as {self}: after ``astz =
dt.astimezone(tz)``, ``astz - astz.utcoffset()`` will usually have the same date
and time members as ``dt - dt.utcoffset()``. The discussion of class
tzinfo explains the cases at Daylight Saving Time transition boundaries
where this cannot be achieved (an issue only if {tz} models both standard and
daylight time).
If you merely want to attach a time zone object {tz} to a datetime {dt} without
adjustment of date and time members, use ``dt.replace(tzinfo=tz)``. If you
merely want to remove the time zone object from an aware datetime {dt} without
conversion of date and time members, use ``dt.replace(tzinfo=None)``.
Note that the default tzinfo.fromutc method can be overridden in a
tzinfo subclass to affect the result returned by astimezone.
Ignoring error cases, astimezone acts like:: >
def astimezone(self, tz):
    if self.tzinfo is tz:
        return self
    # Convert self to UTC, and attach the new time zone object.
    utc = (self - self.utcoffset()).replace(tzinfo=tz)
    # Convert from UTC to tz's local time.
    return tz.fromutc(utc)
<
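The difference between converting with astimezone and merely relabelling with replace can be seen in a small sketch. ``FixedOffset`` and ``astimezone_sketch`` are hypothetical names introduced for this illustration; the latter simply restates the outline above as a free function.

```python
from datetime import datetime, timedelta, tzinfo

class FixedOffset(tzinfo):
    """Hypothetical fixed-offset zone, defined only for this example."""
    def __init__(self, minutes=0):
        self._offset = timedelta(minutes=minutes)
    def utcoffset(self, dt):
        return self._offset
    def dst(self, dt):
        return timedelta(0)
    def tzname(self, dt):
        return "fixed"

def astimezone_sketch(dt, tz):
    # Same steps as the outline above, ignoring error cases.
    if dt.tzinfo is tz:
        return dt
    utc = (dt - dt.utcoffset()).replace(tzinfo=tz)
    return tz.fromutc(utc)

plus_one = FixedOffset(60)
minus_five = FixedOffset(-300)
dt = datetime(2006, 11, 21, 16, 30, tzinfo=plus_one)

converted = dt.astimezone(minus_five)       # same instant, new wall-clock time
sketched = astimezone_sketch(dt, minus_five)
relabelled = dt.replace(tzinfo=minus_five)  # same wall-clock time, new instant
```

Here ``converted`` reads 10:30 local time and still compares equal to ``dt``, while ``relabelled`` keeps 16:30 and therefore denotes a different instant.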
datetime.utcoffset()~
If tzinfo is ``None``, returns ``None``, else returns
``self.tzinfo.utcoffset(self)``, and raises an exception if the latter doesn't
return ``None``, or a timedelta object representing a whole number of
minutes with magnitude less than one day.
datetime.dst()~
If tzinfo is ``None``, returns ``None``, else returns
``self.tzinfo.dst(self)``, and raises an exception if the latter doesn't return
``None``, or a timedelta object representing a whole number of minutes
with magnitude less than one day.
datetime.tzname()~
If tzinfo is ``None``, returns ``None``, else returns
``self.tzinfo.tzname(self)``, and raises an exception if the latter doesn't
return ``None`` or a string object.
datetime.timetuple()~
Return a time.struct_time such as returned by time.localtime.
``d.timetuple()`` is equivalent to ``time.struct_time((d.year, d.month, d.day,
d.hour, d.minute, d.second, d.weekday(), yday, dst))``, where ``yday =
d.toordinal() - date(d.year, 1, 1).toordinal() + 1`` is the day number within
the current year starting with ``1`` for January 1st. The tm_isdst flag
of the result is set according to the dst method: if tzinfo is
``None`` or dst returns ``None``, tm_isdst is set to ``-1``;
else if dst returns a non-zero value, tm_isdst is set to ``1``;
else tm_isdst is set to ``0``.
datetime.utctimetuple()~
If datetime (|py2stdlib-datetime|) instance {d} is naive, this is the same as
``d.timetuple()`` except that tm_isdst is forced to 0 regardless of what
``d.dst()`` returns. DST is never in effect for a UTC time.
If {d} is aware, {d} is normalized to UTC time, by subtracting
``d.utcoffset()``, and a time.struct_time for the normalized time is
returned. tm_isdst is forced to 0. Note that the result's
tm_year member may be MINYEAR-1 or MAXYEAR+1, if
``d.year`` was ``MINYEAR`` or ``MAXYEAR`` and UTC adjustment spills over a year
boundary.
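The handling of tm_isdst and the UTC normalization can be compared side by side. This is a sketch only; ``FixedOffset`` is a hypothetical helper and not something the module provides.

```python
from datetime import datetime, timedelta, tzinfo

class FixedOffset(tzinfo):
    """Hypothetical fixed-offset zone, defined only for this example."""
    def __init__(self, minutes=0):
        self._offset = timedelta(minutes=minutes)
    def utcoffset(self, dt):
        return self._offset
    def dst(self, dt):
        return timedelta(0)
    def tzname(self, dt):
        return "fixed"

naive = datetime(2006, 11, 21, 16, 30)
aware = datetime(2006, 11, 21, 16, 30, tzinfo=FixedOffset(60))

local_tuple = naive.timetuple()     # tm_isdst is -1: no tzinfo to ask
naive_utc = naive.utctimetuple()    # same fields, tm_isdst forced to 0
aware_utc = aware.utctimetuple()    # normalized to UTC: 15:30
```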
datetime.toordinal()~
Return the proleptic Gregorian ordinal of the date. The same as
``self.date().toordinal()``.
datetime.weekday()~
Return the day of the week as an integer, where Monday is 0 and Sunday is 6.
The same as ``self.date().weekday()``. See also isoweekday.
datetime.isoweekday()~
Return the day of the week as an integer, where Monday is 1 and Sunday is 7.
The same as ``self.date().isoweekday()``. See also weekday,
isocalendar.
datetime.isocalendar()~
Return a 3-tuple, (ISO year, ISO week number, ISO weekday). The same as
``self.date().isocalendar()``.
datetime.isoformat([sep])~
Return a string representing the date and time in ISO 8601 format,
YYYY-MM-DDTHH:MM:SS.mmmmmm or, if microsecond is 0,
YYYY-MM-DDTHH:MM:SS
If utcoffset does not return ``None``, a 6-character string is
appended, giving the UTC offset in (signed) hours and minutes:
YYYY-MM-DDTHH:MM:SS.mmmmmm+HH:MM or, if microsecond is 0
YYYY-MM-DDTHH:MM:SS+HH:MM
The optional argument {sep} (default ``'T'``) is a one-character separator,
placed between the date and time portions of the result. For example,
>>> from datetime import tzinfo, timedelta, datetime
>>> class TZ(tzinfo):
...     def utcoffset(self, dt): return timedelta(minutes=-399)
...
>>> datetime(2002, 12, 25, tzinfo=TZ()).isoformat(' ')
'2002-12-25 00:00:00-06:39'
datetime.__str__()~
For a datetime (|py2stdlib-datetime|) instance {d}, ``str(d)`` is equivalent to
``d.isoformat(' ')``.
datetime.ctime()~
Return a string representing the date and time, for example ``datetime(2002, 12,
4, 20, 30, 40).ctime() == 'Wed Dec 4 20:30:40 2002'``. ``d.ctime()`` is
equivalent to ``time.ctime(time.mktime(d.timetuple()))`` on platforms where the
native C ctime function (which time.ctime invokes, but which
datetime.ctime does not invoke) conforms to the C standard.
datetime.strftime(format)~
Return a string representing the date and time, controlled by an explicit format
string. See section strftime-strptime-behavior.
Examples of working with datetime objects:
>>> from datetime import datetime, date, time
>>> # Using datetime.combine()
>>> d = date(2005, 7, 14)
>>> t = time(12, 30)
>>> datetime.combine(d, t)
datetime.datetime(2005, 7, 14, 12, 30)
>>> # Using datetime.now() or datetime.utcnow()
>>> datetime.now() # doctest: +SKIP
datetime.datetime(2007, 12, 6, 16, 29, 43, 79043) # GMT +1
>>> datetime.utcnow() # doctest: +SKIP
datetime.datetime(2007, 12, 6, 15, 29, 43, 79060)
>>> # Using datetime.strptime()
>>> dt = datetime.strptime("21/11/06 16:30", "%d/%m/%y %H:%M")
>>> dt
datetime.datetime(2006, 11, 21, 16, 30)
>>> # Using datetime.timetuple() to get tuple of all attributes
>>> tt = dt.timetuple()
>>> for it in tt: # doctest: +SKIP
...     print it
...
2006 # year
11 # month
21 # day
16 # hour
30 # minute
0 # second
1 # weekday (0 = Monday)
325 # number of days since 1st January
-1 # dst - method tzinfo.dst() returned None
>>> # Date in ISO format
>>> ic = dt.isocalendar()
>>> for it in ic: # doctest: +SKIP
...     print it
...
2006 # ISO year
47 # ISO week
2 # ISO weekday
>>> # Formatting datetime
>>> dt.strftime("%A, %d. %B %Y %I:%M%p")
'Tuesday, 21. November 2006 04:30PM'
Using datetime with tzinfo:
>>> from datetime import timedelta, datetime, tzinfo
>>> class GMT1(tzinfo):
...     def utcoffset(self, dt):
...         return timedelta(hours=1) + self.dst(dt)
...     def dst(self, dt):
...         # DST starts last Sunday in March
...         d = datetime(dt.year, 4, 1)
...         self.dston = d - timedelta(days=d.weekday() + 1)
...         # and ends last Sunday in October
...         d = datetime(dt.year, 11, 1)
...         self.dstoff = d - timedelta(days=d.weekday() + 1)
...         if self.dston <= dt.replace(tzinfo=None) < self.dstoff:
...             return timedelta(hours=1)
...         else:
...             return timedelta(0)
...     def tzname(self, dt):
...         return "GMT +1"
...
>>> class GMT2(tzinfo):
...     def utcoffset(self, dt):
...         return timedelta(hours=1) + self.dst(dt)
...     def dst(self, dt):
...         d = datetime(dt.year, 4, 1)
...         self.dston = d - timedelta(days=d.weekday() + 1)
...         d = datetime(dt.year, 11, 1)
...         self.dstoff = d - timedelta(days=d.weekday() + 1)
...         if self.dston <= dt.replace(tzinfo=None) < self.dstoff:
...             return timedelta(hours=2)
...         else:
...             return timedelta(0)
...     def tzname(self, dt):
...         return "GMT +2"
...
>>> gmt1 = GMT1()
>>> # Daylight Saving Time
>>> dt1 = datetime(2006, 11, 21, 16, 30, tzinfo=gmt1)
>>> dt1.dst()
datetime.timedelta(0)
>>> dt1.utcoffset()
datetime.timedelta(0, 3600)
>>> dt2 = datetime(2006, 6, 14, 13, 0, tzinfo=gmt1)
>>> dt2.dst()
datetime.timedelta(0, 3600)
>>> dt2.utcoffset()
datetime.timedelta(0, 7200)
>>> # Convert datetime to another time zone
>>> dt3 = dt2.astimezone(GMT2())
>>> dt3 # doctest: +ELLIPSIS
datetime.datetime(2006, 6, 14, 14, 0, tzinfo=<GMT2 object at 0x...>)
>>> dt2 # doctest: +ELLIPSIS
datetime.datetime(2006, 6, 14, 13, 0, tzinfo=<GMT1 object at 0x...>)
>>> dt2.utctimetuple() == dt3.utctimetuple()
True
time (|py2stdlib-time|) Objects
---------------------
A time object represents a (local) time of day, independent of any particular
day, and subject to adjustment via a tzinfo object.
time([hour[, minute[, second[, microsecond[, tzinfo]]]]])~
All arguments are optional. {tzinfo} may be ``None``, or an instance of a
tzinfo subclass. The remaining arguments may be ints or longs, in the
following ranges:
* ``0 <= hour < 24``
* ``0 <= minute < 60``
* ``0 <= second < 60``
* ``0 <= microsecond < 1000000``.
If an argument outside those ranges is given, ValueError is raised. All
default to ``0`` except {tzinfo}, which defaults to None.
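A brief sketch of the defaults and the range check follows; the variable names are illustrative only.

```python
from datetime import time

# With no arguments every field defaults to 0 and tzinfo to None.
midnight = time()

# Omitted trailing fields default to 0 as well.
t = time(12, 10, 30)

# An hour outside 0..23 is rejected with ValueError.
try:
    time(24, 0)
    accepted = True
except ValueError:
    accepted = False
```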
Class attributes:
time.min~
The earliest representable time (|py2stdlib-time|), ``time(0, 0, 0, 0)``.
time.max~
The latest representable time (|py2stdlib-time|), ``time(23, 59, 59, 999999)``.
time.resolution~
The smallest possible difference between non-equal time (|py2stdlib-time|) objects,
``timedelta(microseconds=1)``, although note that arithmetic on time (|py2stdlib-time|)
objects is not supported.
Instance attributes (read-only):
time.hour~
In ``range(24)``.
time.minute~
In ``range(60)``.
time.second~
In ``range(60)``.
time.microsecond~
In ``range(1000000)``.
time.tzinfo~
The object passed as the tzinfo argument to the time (|py2stdlib-time|) constructor, or
``None`` if none was passed.
Supported operations:
* comparison of time (|py2stdlib-time|) to time (|py2stdlib-time|), where {a} is considered less
than {b} when {a} precedes {b} in time. If one comparand is naive and the other
is aware, TypeError is raised. If both comparands are aware, and have
the same tzinfo member, the common tzinfo member is ignored and
the base times are compared. If both comparands are aware and have different
tzinfo members, the comparands are first adjusted by subtracting their
UTC offsets (obtained from ``self.utcoffset()``). In order to stop mixed-type
comparisons from falling back to the default comparison by object address, when
a time (|py2stdlib-time|) object is compared to an object of a different type,
TypeError is raised unless the comparison is ``==`` or ``!=``. The
latter cases return False or True, respectively.
* hash, use as dict key
* efficient pickling
* in Boolean contexts, a time (|py2stdlib-time|) object is considered to be true if and
only if, after converting it to minutes and subtracting utcoffset (or
``0`` if that's ``None``), the result is non-zero.
Instance methods:
time.replace([hour[, minute[, second[, microsecond[, tzinfo]]]]])~
Return a time (|py2stdlib-time|) with the same value, except for those members given new
values by whichever keyword arguments are specified. Note that ``tzinfo=None``
can be specified to create a naive time (|py2stdlib-time|) from an aware time (|py2stdlib-time|),
without conversion of the time members.
time.isoformat()~
Return a string representing the time in ISO 8601 format, HH:MM:SS.mmmmmm or, if
self.microsecond is 0, HH:MM:SS. If utcoffset does not return ``None``, a
6-character string is appended, giving the UTC offset in (signed) hours and
minutes: HH:MM:SS.mmmmmm+HH:MM or, if self.microsecond is 0, HH:MM:SS+HH:MM.
time.__str__()~
For a time {t}, ``str(t)`` is equivalent to ``t.isoformat()``.
time.strftime(format)~
Return a string representing the time, controlled by an explicit format string.
See section strftime-strptime-behavior.
time.utcoffset()~
If tzinfo is ``None``, returns ``None``, else returns
``self.tzinfo.utcoffset(None)``, and raises an exception if the latter doesn't
return ``None`` or a timedelta object representing a whole number of
minutes with magnitude less than one day.
time.dst()~
If tzinfo is ``None``, returns ``None``, else returns
``self.tzinfo.dst(None)``, and raises an exception if the latter doesn't return
``None``, or a timedelta object representing a whole number of minutes
with magnitude less than one day.
time.tzname()~
If tzinfo is ``None``, returns ``None``, else returns
``self.tzinfo.tzname(None)``, or raises an exception if the latter doesn't
return ``None`` or a string object.
Example:
>>> from datetime import time, timedelta, tzinfo
>>> class GMT1(tzinfo):
...     def utcoffset(self, dt):
...         return timedelta(hours=1)
...     def dst(self, dt):
...         return timedelta(0)
...     def tzname(self, dt):
...         return "Europe/Prague"
...
>>> t = time(12, 10, 30, tzinfo=GMT1())
>>> t # doctest: +ELLIPSIS
datetime.time(12, 10, 30, tzinfo=<GMT1 object at 0x...>)
>>> gmt = GMT1()
>>> t.isoformat()
'12:10:30+01:00'
>>> t.dst()
datetime.timedelta(0)
>>> t.tzname()
'Europe/Prague'
>>> t.strftime("%H:%M:%S %Z")
'12:10:30 Europe/Prague'
tzinfo Objects
-----------------------
tzinfo is an abstract base class, meaning that this class should not be
instantiated directly. You need to derive a concrete subclass, and (at least)
supply implementations of the standard tzinfo methods needed by the
datetime (|py2stdlib-datetime|) methods you use. The datetime (|py2stdlib-datetime|) module does not supply
any concrete subclasses of tzinfo.
An instance of (a concrete subclass of) tzinfo can be passed to the
constructors for datetime (|py2stdlib-datetime|) and time (|py2stdlib-time|) objects. The latter objects
view their members as being in local time, and the tzinfo object
supports methods revealing offset of local time from UTC, the name of the time
zone, and DST offset, all relative to a date or time object passed to them.
Special requirement for pickling: A tzinfo subclass must have an
__init__ method that can be called with no arguments, else it can be
pickled but possibly not unpickled again. This is a technical requirement that
may be relaxed in the future.
A concrete subclass of tzinfo may need to implement the following
methods. Exactly which methods are needed depends on the uses made of aware
datetime (|py2stdlib-datetime|) objects. If in doubt, simply implement all of them.
tzinfo.utcoffset(self, dt)~
Return offset of local time from UTC, in minutes east of UTC. If local time is
west of UTC, this should be negative. Note that this is intended to be the
total offset from UTC; for example, if a tzinfo object represents both
time zone and DST adjustments, utcoffset should return their sum. If
the UTC offset isn't known, return ``None``. Else the value returned must be a
timedelta object specifying a whole number of minutes in the range
-1439 to 1439 inclusive (1440 = 24*60; the magnitude of the offset must be less
than one day). Most implementations of utcoffset will probably look
like one of these two:: >
return CONSTANT # fixed-offset class
return CONSTANT + self.dst(dt) # daylight-aware class
<
If utcoffset does not return ``None``, dst should not return
``None`` either.
The default implementation of utcoffset raises
NotImplementedError.
tzinfo.dst(self, dt)~
Return the daylight saving time (DST) adjustment, in minutes east of UTC, or
``None`` if DST information isn't known. Return ``timedelta(0)`` if DST is not
in effect. If DST is in effect, return the offset as a timedelta object
(see utcoffset for details). Note that DST offset, if applicable, has
already been added to the UTC offset returned by utcoffset, so there's
no need to consult dst unless you're interested in obtaining DST info
separately. For example, datetime.timetuple calls its tzinfo
member's dst method to determine how the tm_isdst flag should be
set, and tzinfo.fromutc calls dst to account for DST changes
when crossing time zones.
An instance {tz} of a tzinfo subclass that models both standard and
daylight times must be consistent in this sense:
``tz.utcoffset(dt) - tz.dst(dt)``
must return the same result for every datetime (|py2stdlib-datetime|) {dt} with ``dt.tzinfo ==
tz``. For sane tzinfo subclasses, this expression yields the time
zone's "standard offset", which should not depend on the date or the time, but
only on geographic location. The implementation of datetime.astimezone
relies on this, but cannot detect violations; it's the programmer's
responsibility to ensure it. If a tzinfo subclass cannot guarantee
this, it may be able to override the default implementation of
tzinfo.fromutc to work correctly with astimezone regardless.
Most implementations of dst will probably look like one of these two:: >
def dst(self, dt):
    # a fixed-offset class: doesn't account for DST
    return timedelta(0)
<
or: >
def dst(self, dt):
    # Code to set dston and dstoff to the time zone's DST
    # transition times based on the input dt.year, and expressed
    # in standard local time. Then
    if dston <= dt.replace(tzinfo=None) < dstoff:
        return timedelta(hours=1)
    else:
        return timedelta(0)
<
The default implementation of dst raises NotImplementedError.
tzinfo.tzname(self, dt)~
Return the time zone name corresponding to the datetime (|py2stdlib-datetime|) object {dt}, as
a string. Nothing about string names is defined by the datetime (|py2stdlib-datetime|) module,
and there's no requirement that it mean anything in particular. For example,
"GMT", "UTC", "-500", "-5:00", "EDT", "US/Eastern", "America/New York" are all
valid replies. Return ``None`` if a string name isn't known. Note that this is
a method rather than a fixed string primarily because some tzinfo
subclasses will wish to return different names depending on the specific value
of {dt} passed, especially if the tzinfo class is accounting for
daylight time.
The default implementation of tzname raises NotImplementedError.
These methods are called by a datetime (|py2stdlib-datetime|) or time (|py2stdlib-time|) object, in
response to their methods of the same names. A datetime (|py2stdlib-datetime|) object passes
itself as the argument, and a time (|py2stdlib-time|) object passes ``None`` as the
argument. A tzinfo subclass's methods should therefore be prepared to
accept a {dt} argument of ``None``, or of class datetime (|py2stdlib-datetime|).
When ``None`` is passed, it's up to the class designer to decide the best
response. For example, returning ``None`` is appropriate if the class wishes to
say that time objects don't participate in the tzinfo protocols. It
may be more useful for ``utcoffset(None)`` to return the standard UTC offset, as
there is no other convention for discovering the standard offset.
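The calling convention described above can be sketched as follows. The class
name ``RecordingEST`` and its ``seen`` list are hypothetical, introduced only to
show which {dt} argument each kind of object passes; they are not part of the
datetime module.

```python
from datetime import datetime, time, timedelta, tzinfo

class RecordingEST(tzinfo):
    """Hypothetical fixed-offset zone that records each dt it is passed."""

    def __init__(self):
        self.seen = []

    def utcoffset(self, dt):
        # dt is None when the call comes from a time object
        self.seen.append(dt)
        # answer with the standard offset in both cases
        return timedelta(hours=-5)

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "EST"

tz = RecordingEST()
t_offset = time(12, 0, tzinfo=tz).utcoffset()                   # passes None
d_offset = datetime(2010, 9, 1, 12, 0, tzinfo=tz).utcoffset()   # passes itself
```

Here ``utcoffset(None)`` returns the standard offset, which is one of the
design choices the text mentions; returning ``None`` instead would also be
legitimate.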
When a datetime (|py2stdlib-datetime|) object is passed in response to a datetime (|py2stdlib-datetime|)
method, ``dt.tzinfo`` is the same object as {self}. tzinfo methods can
rely on this, unless user code calls tzinfo methods directly. The
intent is that the tzinfo methods interpret {dt} as being in local
time, and need not worry about objects in other timezones.
There is one more tzinfo method that a subclass may wish to override:
tzinfo.fromutc(self, dt)~
This is called from the default datetime.astimezone() implementation.
When called from that, ``dt.tzinfo`` is {self}, and {dt}'s date and time members
are to be viewed as expressing a UTC time. The purpose of fromutc is to
adjust the date and time members, returning an equivalent datetime in {self}'s
local time.
Most tzinfo subclasses should be able to inherit the default
fromutc implementation without problems. It's strong enough to handle
fixed-offset time zones, and time zones accounting for both standard and
daylight time, and the latter even if the DST transition times differ in
different years. An example of a time zone the default fromutc
implementation may not handle correctly in all cases is one where the standard
offset (from UTC) depends on the specific date and time passed, which can happen
for political reasons. The default implementations of astimezone and
fromutc may not produce the result you want if the result is one of the
hours straddling the moment the standard offset changes.
Skipping code for error cases, the default fromutc implementation acts
like:: >
    def fromutc(self, dt):
        # raise ValueError if dt.tzinfo is not self
        dtoff = dt.utcoffset()
        dtdst = dt.dst()
        # raise ValueError if dtoff is None or dtdst is None
        delta = dtoff - dtdst  # this is self's standard offset
        if delta:
            dt += delta   # convert to standard local time
            dtdst = dt.dst()
            # raise ValueError if dtdst is None
        if dtdst:
            return dt + dtdst
        else:
            return dt
<
Example tzinfo classes are provided in the file ``includes/tzinfo-examples.py``
of the Python documentation sources.
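As a minimal sketch of a fixed-offset subclass that relies on the inherited
fromutc, consider the following. The class name ``FixedOffset`` and its
constructor arguments are assumptions made for this illustration, not a
reproduction of the referenced example file.

```python
from datetime import datetime, timedelta, tzinfo

ZERO = timedelta(0)

class FixedOffset(tzinfo):
    """Hypothetical fixed offset, given in minutes east of UTC."""

    def __init__(self, minutes, name):
        self._offset = timedelta(minutes=minutes)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        # a fixed-offset class never observes DST
        return ZERO

    def tzname(self, dt):
        return self._name

utc = FixedOffset(0, "UTC")
est = FixedOffset(-300, "EST")

noon_utc = datetime(2010, 1, 15, 12, 0, tzinfo=utc)
# astimezone() calls est.fromutc(); the default implementation suffices here
local = noon_utc.astimezone(est)
```

With these definitions ``local`` carries the wall time 07:00 and reports
``"EST"`` from tzname.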
Note that there are unavoidable subtleties twice per year in a tzinfo
subclass accounting for both standard and daylight time, at the DST transition
points. For concreteness, consider US Eastern (UTC -0500), where EDT begins the
minute after 1:59 (EST) on the second Sunday in March, and ends the minute after
1:59 (EDT) on the first Sunday in November:: >
UTC 3:MM 4:MM 5:MM 6:MM 7:MM 8:MM
EST 22:MM 23:MM 0:MM 1:MM 2:MM 3:MM
EDT 23:MM 0:MM 1:MM 2:MM 3:MM 4:MM
start 22:MM 23:MM 0:MM 1:MM 3:MM 4:MM
end 23:MM 0:MM 1:MM 1:MM 2:MM 3:MM
<
When DST starts (the "start" line), the local wall clock leaps from 1:59 to
3:00. A wall time of the form 2:MM doesn't really make sense on that day, so
``astimezone(Eastern)`` won't deliver a result with ``hour == 2`` on the day DST
begins. In order for astimezone to make this guarantee, the
tzinfo.dst method must consider times in the "missing hour" (2:MM for
Eastern) to be in daylight time.
When DST ends (the "end" line), there's a potentially worse problem: there's an
hour that can't be spelled unambiguously in local wall time: the last hour of
daylight time. In Eastern, that's times of the form 5:MM UTC on the day
daylight time ends. The local wall clock leaps from 1:59 (daylight time) back
to 1:00 (standard time) again. Local times of the form 1:MM are ambiguous.
astimezone mimics the local clock's behavior by mapping two adjacent UTC
hours into the same local hour then. In the Eastern example, UTC times of the
form 5:MM and 6:MM both map to 1:MM when converted to Eastern. In order for
astimezone to make this guarantee, the tzinfo.dst method must
consider times in the "repeated hour" to be in standard time. This is easily
arranged, as in the example, by expressing DST switch times in the time zone's
standard local time.
Applications that can't bear such ambiguities should avoid using hybrid
tzinfo subclasses; there are no ambiguities when using UTC, or any
other fixed-offset tzinfo subclass (such as a class representing only
EST (fixed offset -5 hours), or only EDT (fixed offset -4 hours)).
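The two transition cases can be reproduced with a deliberately simplified
hybrid class. ``Eastern2010`` and ``UTC`` are hypothetical names, and the
transition times are hard-coded for the year 2010 only (14 March and
7 November), expressed in standard local time as the text recommends.

```python
from datetime import datetime, timedelta, tzinfo

ZERO = timedelta(0)
HOUR = timedelta(hours=1)

class UTC(tzinfo):
    def utcoffset(self, dt):
        return ZERO
    def dst(self, dt):
        return ZERO
    def tzname(self, dt):
        return "UTC"

class Eastern2010(tzinfo):
    """Hypothetical US Eastern zone, valid for the year 2010 only."""
    # Transition times in standard local time (EST).
    DSTON = datetime(2010, 3, 14, 2)    # 2:00 EST, second Sunday in March
    DSTOFF = datetime(2010, 11, 7, 1)   # 1:00 EST, i.e. 2:00 EDT

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def dst(self, dt):
        if dt is None:
            return ZERO
        if self.DSTON <= dt.replace(tzinfo=None) < self.DSTOFF:
            return HOUR
        return ZERO

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

utc = UTC()
eastern = Eastern2010()

# DST ends: two adjacent UTC hours land on the same local hour.
end_a = datetime(2010, 11, 7, 5, 30, tzinfo=utc).astimezone(eastern)
end_b = datetime(2010, 11, 7, 6, 30, tzinfo=utc).astimezone(eastern)

# DST starts: the local clock skips from 1:MM to 3:MM.
start_a = datetime(2010, 3, 14, 6, 30, tzinfo=utc).astimezone(eastern)
start_b = datetime(2010, 3, 14, 7, 30, tzinfo=utc).astimezone(eastern)
```

Both ``end_a`` and ``end_b`` show a wall time of 1:30, while ``start_a`` and
``start_b`` show 1:30 and 3:30 respectively, so no result has ``hour == 2``
on the day DST begins.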
strftime and strptime Behavior
------------------------------
date, datetime (|py2stdlib-datetime|), and time (|py2stdlib-time|) objects all support a
``strftime(format)`` method, to create a string representing the time under the
control of an explicit format string. Broadly speaking, ``d.strftime(fmt)``
acts like the time (|py2stdlib-time|) module's ``time.strftime(fmt, d.timetuple())``
although not all objects support a timetuple method.
Conversely, the datetime.strptime class method creates a
datetime (|py2stdlib-datetime|) object from a string representing a date and time and a
corresponding format string. ``datetime.strptime(date_string, format)`` is
equivalent to ``datetime(*(time.strptime(date_string, format)[0:6]))``.
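The stated equivalence can be checked directly. The date string and format
below are arbitrary sample values chosen for this sketch.

```python
import time
from datetime import datetime

date_string = "2010-09-01 14:30:05"
fmt = "%Y-%m-%d %H:%M:%S"

# class-method form
parsed = datetime.strptime(date_string, fmt)
# the long-hand form from the text: first six fields of the struct_time
rebuilt = datetime(*(time.strptime(date_string, fmt)[0:6]))
# strftime with the same format reverses the conversion
round_trip = parsed.strftime(fmt)
```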
For time (|py2stdlib-time|) objects, the format codes for year, month, and day should not
be used, as time objects have no such values. If they're used anyway, ``1900``
is substituted for the year, and ``1`` for the month and day.
For date objects, the format codes for hours, minutes, seconds, and
microseconds should not be used, as date objects have no such
values. If they're used anyway, ``0`` is substituted for them.
.. versionadded:: 2.6
time (|py2stdlib-time|) and datetime (|py2stdlib-datetime|) objects support a ``%f`` format code
which expands to the number of microseconds in the object, zero-padded on
the left to six places.
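A short sketch of the substitutions described above, using arbitrary sample
values:

```python
from datetime import date, time

t = time(14, 30, 5, 123)
d = date(2010, 9, 1)

# year/month/day codes on a time object fall back to 1900-01-01
time_with_date_codes = t.strftime("%Y-%m-%d %H:%M:%S")
# hour/minute/second codes on a date object fall back to 0
date_with_time_codes = d.strftime("%Y-%m-%d %H:%M:%S")
# %f is zero-padded on the left to six places
micro = t.strftime("%f")
```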
For a naive object, the ``%z`` and ``%Z`` format codes are replaced by empty
strings.
For an aware object:
``%z``
utcoffset is transformed into a 5-character string of the form +HHMM or
-HHMM, where HH is a 2-digit string giving the number of UTC offset hours, and
MM is a 2-digit string giving the number of UTC offset minutes. For example, if
utcoffset returns ``timedelta(hours=-3, minutes=-30)``, ``%z`` is
replaced with the string ``'-0330'``.
``%Z``
If tzname returns ``None``, ``%Z`` is replaced by an empty string.
Otherwise ``%Z`` is replaced by the returned value, which must be a string.
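The following sketch contrasts an aware and a naive object. ``HalfHourZone``
and the name ``"HHZ"`` are hypothetical; the offset matches the
``timedelta(hours=-3, minutes=-30)`` example in the text.

```python
from datetime import datetime, timedelta, tzinfo

class HalfHourZone(tzinfo):
    """Hypothetical fixed zone 3 hours 30 minutes west of UTC."""
    def utcoffset(self, dt):
        return timedelta(hours=-3, minutes=-30)
    def dst(self, dt):
        return timedelta(0)
    def tzname(self, dt):
        return "HHZ"

aware = datetime(2010, 9, 1, 12, 0, tzinfo=HalfHourZone())
naive = datetime(2010, 9, 1, 12, 0)

aware_text = aware.strftime("%H:%M %z %Z")
# brackets make the empty replacements visible
naive_text = naive.strftime("[%z][%Z]")
```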
The full set of format codes supported varies across platforms, because Python
calls the platform C library's strftime function, and platform
variations are common.
The following is a list of all the format codes that the C standard (1989
version) requires, and these work on all platforms with a standard C
implementation. Note that the 1999 version of the C standard added additional
format codes.
The exact range of years for which strftime works also varies across
platforms. Regardless of platform, years before 1900 cannot be used.
+-----------+--------------------------------+-------+
| Directive | Meaning | Notes |
+===========+================================+=======+
| ``%a`` | Locale's abbreviated weekday | |
| | name. | |
+-----------+--------------------------------+-------+
| ``%A`` | Locale's full weekday name. | |
+-----------+--------------------------------+-------+
| ``%b`` | Locale's abbreviated month | |
| | name. | |
+-----------+--------------------------------+-------+
| ``%B`` | Locale's full month name. | |
+-----------+--------------------------------+-------+
| ``%c`` | Locale's appropriate date and | |
| | time representation. | |
+-----------+--------------------------------+-------+
| ``%d`` | Day of the month as a decimal | |
| | number [01,31]. | |
+-----------+--------------------------------+-------+
| ``%f`` | Microsecond as a decimal | \(1) |
| | number [0,999999], zero-padded | |
| | on the left | |
+-----------+--------------------------------+-------+
| ``%H`` | Hour (24-hour clock) as a | |
| | decimal number [00,23]. | |
+-----------+--------------------------------+-------+
| ``%I`` | Hour (12-hour clock) as a | |
| | decimal number [01,12]. | |
+-----------+--------------------------------+-------+
| ``%j`` | Day of the year as a decimal | |
| | number [001,366]. | |
+-----------+--------------------------------+-------+
| ``%m`` | Month as a decimal number | |
| | [01,12]. | |
+-----------+--------------------------------+-------+
| ``%M`` | Minute as a decimal number | |
| | [00,59]. | |
+-----------+--------------------------------+-------+
| ``%p`` | Locale's equivalent of either | \(2) |
| | AM or PM. | |
+-----------+--------------------------------+-------+
| ``%S`` | Second as a decimal number | \(3) |
| | [00,61]. | |
+-----------+--------------------------------+-------+
| ``%U`` | Week number of the year | \(4) |
| | (Sunday as the first day of | |
| | the week) as a decimal number | |
| | [00,53]. All days in a new | |
| | year preceding the first | |
| | Sunday are considered to be in | |
| | week 0. | |
+-----------+--------------------------------+-------+
| ``%w`` | Weekday as a decimal number | |
| | [0(Sunday),6]. | |
+-----------+--------------------------------+-------+
| ``%W`` | Week number of the year | \(4) |
| | (Monday as the first day of | |
| | the week) as a decimal number | |
| | [00,53]. All days in a new | |
| | year preceding the first | |
| | Monday are considered to be in | |
| | week 0. | |
+-----------+--------------------------------+-------+
| ``%x`` | Locale's appropriate date | |
| | representation. | |
+-----------+--------------------------------+-------+
| ``%X`` | Locale's appropriate time | |
| | representation. | |
+-----------+--------------------------------+-------+
| ``%y`` | Year without century as a | |
| | decimal number [00,99]. | |
+-----------+--------------------------------+-------+
| ``%Y`` | Year with century as a decimal | |
| | number. | |
+-----------+--------------------------------+-------+
| ``%z`` | UTC offset in the form +HHMM | \(5) |
| | or -HHMM (empty string if the | |
| | object is naive). | |
+-----------+--------------------------------+-------+
| ``%Z`` | Time zone name (empty string | |
| | if the object is naive). | |
+-----------+--------------------------------+-------+
| ``%%`` | A literal ``'%'`` character. | |
+-----------+--------------------------------+-------+
Notes:
(1)
When used with the strptime method, the ``%f`` directive
accepts from one to six digits and zero pads on the right. ``%f`` is
an extension to the set of format characters in the C standard (but
implemented separately in datetime objects, and therefore always
available).
(2)
When used with the strptime method, the ``%p`` directive only affects
the output hour field if the ``%I`` directive is used to parse the hour.
(3)
The range really is ``0`` to ``61``; according to the POSIX standard this
accounts for leap seconds and the (very rare) double leap seconds.
The time (|py2stdlib-time|) module may produce and does accept leap seconds since
it is based on the POSIX standard, but the datetime (|py2stdlib-datetime|) module
does not accept leap seconds in strptime input nor will it
produce them in strftime output.
(4)
When used with the strptime method, ``%U`` and ``%W`` are only used in
calculations when the day of the week and the year are specified.
(5)
For example, if utcoffset returns ``timedelta(hours=-3, minutes=-30)``,
``%z`` is replaced with the string ``'-0330'``.
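Notes (1) and (2) can be illustrated with strptime. The sketch assumes a locale
in which the ``%p`` marker is spelled ``PM``, as in the default C locale.

```python
from datetime import datetime

# Note (1): %f accepts one to six digits and zero-pads on the right
padded = datetime.strptime("12:30:05.5", "%H:%M:%S.%f")

# Note (2): %p changes the hour only when %I parses it
twelve_hour = datetime.strptime("01 PM", "%I %p")
twenty_four = datetime.strptime("01 PM", "%H %p")
```

Here ``padded.microsecond`` is 500000, ``twelve_hour.hour`` is 13, and
``twenty_four.hour`` stays at 1.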
==============================================================================
*py2stdlib-dbhash*
dbhash~
:synopsis: DBM-style interface to the BSD database library.
Deprecated since version 2.6~
The dbhash (|py2stdlib-dbhash|) module has been deprecated for removal in Python 3.0.
The dbhash (|py2stdlib-dbhash|) module provides a function to open databases using the BSD
``db`` library. This module mirrors the interface of the other Python database
modules that provide access to DBM-style databases. The bsddb (|py2stdlib-bsddb|) module is
required to use dbhash (|py2stdlib-dbhash|).
This module provides an exception and a function:
error~
Exception raised on database errors other than KeyError. It is a synonym
for bsddb.error.
open(path[, flag[, mode]])~
Open a ``db`` database and return the database object. The {path} argument is
the name of the database file.
The {flag} argument can be:
+---------+-------------------------------------------+
| Value | Meaning |
+=========+===========================================+
| ``'r'`` | Open existing database for reading only |
| | (default) |
+---------+-------------------------------------------+
| ``'w'`` | Open existing database for reading and |
| | writing |
+---------+-------------------------------------------+
| ``'c'`` | Open database for reading and writing, |
| | creating it if it doesn't exist |
+---------+-------------------------------------------+
| ``'n'`` | Always create a new, empty database, open |
| | for reading and writing |
+---------+-------------------------------------------+
For platforms on which the BSD ``db`` library supports locking, an ``'l'``
can be appended to indicate that locking should be used.
The optional {mode} parameter is used to indicate the Unix permission bits that
should be set if a new database must be created; this will be masked by the
current umask value for the process.
.. seealso::
Module anydbm (|py2stdlib-anydbm|)
Generic interface to ``dbm``-style databases.
Module bsddb (|py2stdlib-bsddb|)
Lower-level interface to the BSD ``db`` library.
Module whichdb (|py2stdlib-whichdb|)
Utility module used to determine the type of an existing database.
Database Objects
----------------
The database objects returned by open provide the methods common to all
the DBM-style databases and mapping objects. The following methods are
available in addition to the standard methods.
dbhash.first()~
It's possible to loop over every key/value pair in the database using this
method and the next method. The traversal is ordered by the database's
internal hash values, and won't be sorted by the key values. This method
returns the starting key.
dbhash.last()~
Return the last key/value pair in a database traversal. This may be used to
begin a reverse-order traversal; see previous.
dbhash.next()~
Returns the next key/value pair in a database traversal. The following code
prints every key in the database ``db``, without having to create a list in
memory that contains them all:: >
    print db.first()
    for i in xrange(1, len(db)):
        print db.next()
<
dbhash.previous()~
Returns the previous key/value pair in a forward-traversal of the database. In
conjunction with last, this may be used to implement a reverse-order
traversal.
dbhash.sync()~
This method forces any unwritten data to be written to the disk.
==============================================================================
*py2stdlib-dbm*
dbm~
:platform: Unix
:synopsis: The standard "database" interface, based on ndbm.
.. note::
The dbm (|py2stdlib-dbm|) module has been renamed to dbm.ndbm in Python 3.0. The
2to3 tool will automatically adapt imports when converting your
sources to 3.0.
The dbm (|py2stdlib-dbm|) module provides an interface to the Unix "(n)dbm" library. Dbm
objects behave like mappings (dictionaries), except that keys and values are
always strings. Printing a dbm object doesn't print the keys and values, and the
items and values methods are not supported.
This module can be used with the "classic" ndbm interface, the BSD DB
compatibility interface, or the GNU GDBM compatibility interface. On Unix, the
configure script will attempt to locate the appropriate header file
to simplify building this module.
The module defines the following:
error~
Raised on dbm-specific errors, such as I/O errors. KeyError is raised for
general mapping errors like specifying an incorrect key.
library~
Name of the ``ndbm`` implementation library used.
open(filename[, flag[, mode]])~
Open a dbm database and return a dbm object. The {filename} argument is the
name of the database file (without the .dir or .pag extensions;
note that the BSD DB implementation of the interface will append the extension
.db and only create one file).
The optional {flag} argument must be one of these values:
+---------+-------------------------------------------+
| Value | Meaning |
+=========+===========================================+
| ``'r'`` | Open existing database for reading only |
| | (default) |
+---------+-------------------------------------------+
| ``'w'`` | Open existing database for reading and |
| | writing |
+---------+-------------------------------------------+
| ``'c'`` | Open database for reading and writing, |
| | creating it if it doesn't exist |
+---------+-------------------------------------------+
| ``'n'`` | Always create a new, empty database, open |
| | for reading and writing |
+---------+-------------------------------------------+
The optional {mode} argument is the Unix mode of the file, used only when the
database has to be created. It defaults to octal ``0666`` (and will be
modified by the prevailing umask).
.. seealso::
Module anydbm (|py2stdlib-anydbm|)
Generic interface to ``dbm``-style databases.
Module gdbm (|py2stdlib-gdbm|)
Similar interface to the GNU GDBM library.
Module whichdb (|py2stdlib-whichdb|)
Utility module used to determine the type of an existing database.
==============================================================================
*py2stdlib-decimal*
decimal~
:synopsis: Implementation of the General Decimal Arithmetic Specification.
.. versionadded:: 2.4
The decimal (|py2stdlib-decimal|) module provides support for decimal floating point
arithmetic. It offers several advantages over the float datatype:
* Decimal "is based on a floating-point model which was designed with people
in mind, and necessarily has a paramount guiding principle -- computers must
provide an arithmetic that works in the same way as the arithmetic that
people learn at school." -- excerpt from the decimal arithmetic specification.
* Decimal numbers can be represented exactly. In contrast, numbers like
1.1 and 2.2 do not have exact representations in binary
floating point. End users typically would not expect ``1.1 + 2.2`` to display
as 3.3000000000000003 as it does with binary floating point.
* The exactness carries over into arithmetic. In decimal floating point, ``0.1
+ 0.1 + 0.1 - 0.3`` is exactly equal to zero. In binary floating point, the result
is 5.5511151231257827e-017. While near to zero, the differences
prevent reliable equality testing and differences can accumulate. For this
reason, decimal is preferred in accounting applications which have strict
equality invariants.
* The decimal module incorporates a notion of significant places so that ``1.30
+ 1.20`` is 2.50. The trailing zero is kept to indicate significance.
This is the customary presentation for monetary applications. For
multiplication, the "schoolbook" approach uses all the figures in the
multiplicands. For instance, ``1.3 * 1.2`` gives 1.56 while ``1.30 *
1.20`` gives 1.5600.
* Unlike hardware based binary floating point, the decimal module has a user
alterable precision (defaulting to 28 places) which can be as large as needed for
a given problem:
>>> getcontext().prec = 6
>>> Decimal(1) / Decimal(7)
Decimal('0.142857')
>>> getcontext().prec = 28
>>> Decimal(1) / Decimal(7)
Decimal('0.1428571428571428571428571429')
* Both binary and decimal floating point are implemented in terms of published
standards. While the built-in float type exposes only a modest portion of its
capabilities, the decimal module exposes all required parts of the standard.
When needed, the programmer has full control over rounding and signal handling.
This includes an option to enforce exact arithmetic by using exceptions
to block any inexact operations.
* The decimal module was designed to support "without prejudice, both exact
unrounded decimal arithmetic (sometimes called fixed-point arithmetic)
and rounded floating-point arithmetic." -- excerpt from the decimal
arithmetic specification.
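Several of the points above can be checked in a few lines. This sketch assumes
the default context (28 digits of precision) is in effect.

```python
from decimal import Decimal

# binary floating point leaves a small residue, decimal does not
float_residue = 0.1 + 0.1 + 0.1 - 0.3
decimal_residue = (Decimal('0.1') + Decimal('0.1') + Decimal('0.1')
                   - Decimal('0.3'))

# trailing zeros are kept to indicate significance
total = Decimal('1.30') + Decimal('1.20')

# "schoolbook" multiplication uses all digits of the multiplicands
short_product = Decimal('1.3') * Decimal('1.2')
long_product = Decimal('1.30') * Decimal('1.20')
```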
The module design is centered around three concepts: the decimal number, the
context for arithmetic, and signals.
A decimal number is immutable. It has a sign, coefficient digits, and an
exponent. To preserve significance, the coefficient digits do not truncate
trailing zeros. Decimals also include special values such as
Infinity, -Infinity, and NaN. The standard also
differentiates -0 from +0.
The context for arithmetic is an environment specifying precision, rounding
rules, limits on exponents, flags indicating the results of operations, and trap
enablers which determine whether signals are treated as exceptions. Rounding
options include ROUND_CEILING, ROUND_DOWN,
ROUND_FLOOR, ROUND_HALF_DOWN, ROUND_HALF_EVEN,
ROUND_HALF_UP, ROUND_UP, and ROUND_05UP.
Signals are groups of exceptional conditions arising during the course of
computation. Depending on the needs of the application, signals may be ignored,
considered as informational, or treated as exceptions. The signals in the
decimal module are: Clamped, InvalidOperation,
DivisionByZero, Inexact, Rounded, Subnormal,
Overflow, and Underflow.
For each signal there is a flag and a trap enabler. When a signal is
encountered, its flag is set to one, then, if the trap enabler is
set to one, an exception is raised. Flags are sticky, so the user needs to
reset them before monitoring a calculation.
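As a sketch of how flags and trap enablers interact, the following uses a
private Context object (the name ``ctx`` and the precision of 4 are arbitrary
choices) and traps Inexact to enforce exact arithmetic.

```python
from decimal import Context, Decimal, Inexact

# no traps enabled: signals only set flags
ctx = Context(prec=4, traps=[])
quotient = ctx.divide(Decimal(1), Decimal(3))
flag_after_divide = bool(ctx.flags[Inexact])

# flags are sticky, so reset them before the next monitored computation
ctx.clear_flags()

# enable the trap: inexact results now raise instead of rounding silently
ctx.traps[Inexact] = True
exact = ctx.divide(Decimal(1), Decimal(4))   # representable exactly
try:
    ctx.divide(Decimal(1), Decimal(3))
    trapped = False
except Inexact:
    trapped = True
```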
.. seealso::
* IBM's General Decimal Arithmetic Specification:
http://speleotrove.com/decimal/
* IEEE standard 854-1987, unofficial text:
http://754r.ucbtest.org/standards/854.pdf
Quick-start Tutorial
--------------------
The usual start to using decimals is importing the module, viewing the current
context with getcontext and, if necessary, setting new values for
precision, rounding, or enabled traps:: >
>>> from decimal import *
>>> getcontext()
Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999,
capitals=1, flags=[], traps=[Overflow, DivisionByZero,
InvalidOperation])
>>> getcontext().prec = 7 # Set a new precision
<
Decimal instances can be constructed from integers, strings, floats, or tuples.
Construction from an integer or a float performs an exact conversion of the
value of that integer or float. Decimal numbers include special values such as
NaN which stands for "Not a number", positive and negative
Infinity, and -0.
>>> getcontext().prec = 28
>>> Decimal(10)
Decimal('10')
>>> Decimal('3.14')
Decimal('3.14')
>>> Decimal(3.14)
Decimal('3.140000000000000124344978758017532527446746826171875')
>>> Decimal((0, (3, 1, 4), -2))
Decimal('3.14')
>>> Decimal(str(2.0 ** 0.5))
Decimal('1.41421356237')
>>> Decimal(2) ** Decimal('0.5')
Decimal('1.414213562373095048801688724')
>>> Decimal('NaN')
Decimal('NaN')
>>> Decimal('-Infinity')
Decimal('-Infinity')
The significance of a new Decimal is determined solely by the number of digits
input. Context precision and rounding only come into play during arithmetic
operations.
>>> getcontext().prec = 6
>>> Decimal('3.0')
Decimal('3.0')
>>> Decimal('3.1415926535')
Decimal('3.1415926535')
>>> Decimal('3.1415926535') + Decimal('2.7182818285')
Decimal('5.85987')
>>> getcontext().rounding = ROUND_UP
>>> Decimal('3.1415926535') + Decimal('2.7182818285')
Decimal('5.85988')
Decimals interact well with much of the rest of Python. Here is a small decimal
floating point flying circus:
>>> data = map(Decimal, '1.34 1.87 3.45 2.35 1.00 0.03 9.25'.split())
>>> max(data)
Decimal('9.25')
>>> min(data)
Decimal('0.03')
>>> sorted(data)
[Decimal('0.03'), Decimal('1.00'), Decimal('1.34'), Decimal('1.87'),
Decimal('2.35'), Decimal('3.45'), Decimal('9.25')]
>>> sum(data)
Decimal('19.29')
>>> a,b,c = data[:3]
>>> str(a)
'1.34'
>>> float(a)
1.34
>>> round(a, 1) # round() first converts to binary floating point
1.3
>>> int(a)
1
>>> a * 5
Decimal('6.70')
>>> a * b
Decimal('2.5058')
>>> c % a
Decimal('0.77')
Some mathematical functions are also available to Decimal:
>>> getcontext().prec = 28
>>> Decimal(2).sqrt()
Decimal('1.414213562373095048801688724')
>>> Decimal(1).exp()
Decimal('2.718281828459045235360287471')
>>> Decimal('10').ln()
Decimal('2.302585092994045684017991455')
>>> Decimal('10').log10()
Decimal('1')
The quantize method rounds a number to a fixed exponent. This method is
useful for monetary applications that often round results to a fixed number of
places:
>>> Decimal('7.325').quantize(Decimal('.01'), rounding=ROUND_DOWN)
Decimal('7.32')
>>> Decimal('7.325').quantize(Decimal('1.'), rounding=ROUND_UP)
Decimal('8')
As shown above, the getcontext function accesses the current context and
allows the settings to be changed. This approach meets the needs of most
applications.
For more advanced work, it may be useful to create alternate contexts using the
Context() constructor. To make an alternate active, use the setcontext
function.
In accordance with the standard, the decimal module provides two ready to
use standard contexts, BasicContext and ExtendedContext. The
former is especially useful for debugging because many of the traps are
enabled:
>>> myothercontext = Context(prec=60, rounding=ROUND_HALF_DOWN)
>>> setcontext(myothercontext)
>>> Decimal(1) / Decimal(7)
Decimal('0.142857142857142857142857142857142857142857142857142857142857')
>>> ExtendedContext
Context(prec=9, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999,
capitals=1, flags=[], traps=[])
>>> setcontext(ExtendedContext)
>>> Decimal(1) / Decimal(7)
Decimal('0.142857143')
>>> Decimal(42) / Decimal(0)
Decimal('Infinity')
>>> setcontext(BasicContext)
>>> Decimal(42) / Decimal(0)
Traceback (most recent call last):
File "<pyshell#143>", line 1, in -toplevel-
Decimal(42) / Decimal(0)
DivisionByZero: x / 0
Contexts also have signal flags for monitoring exceptional conditions
encountered during computations. The flags remain set until explicitly cleared,
so it is best to clear the flags before each set of monitored computations by
using the clear_flags method. :: >
>>> setcontext(ExtendedContext)
>>> getcontext().clear_flags()
>>> Decimal(355) / Decimal(113)
Decimal('3.14159292')
>>> getcontext()
Context(prec=9, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999,
capitals=1, flags=[Rounded, Inexact], traps=[])
<
The {flags} entry shows that the rational approximation to Pi was
rounded (digits beyond the context precision were thrown away) and that the
result is inexact (some of the discarded digits were non-zero).
Individual traps are set using the dictionary in the traps field of a
context:
>>> setcontext(ExtendedContext)
>>> Decimal(1) / Decimal(0)
Decimal('Infinity')
>>> getcontext().traps[DivisionByZero] = 1
>>> Decimal(1) / Decimal(0)
Traceback (most recent call last):
File "<pyshell#112>", line 1, in -toplevel-
Decimal(1) / Decimal(0)
DivisionByZero: x / 0
Most programs adjust the current context only once, at the beginning of the
program. And, in many applications, data is converted to Decimal with
a single cast inside a loop. With context set and decimals created, the bulk of
the program manipulates the data no differently than with other Python numeric
types.
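The pattern described above might look like the following sketch. The list
``raw_prices`` and the 8% rate are made-up sample data, and the default
rounding mode (ROUND_HALF_EVEN) is assumed.

```python
from decimal import Decimal

# hypothetical input, e.g. fields read from a text file
raw_prices = ['1.34', '1.87', '3.45']

# convert once, inside a loop (here a list comprehension)
prices = [Decimal(text) for text in raw_prices]

# from here on the values behave like any other numeric type
total = sum(prices)
with_tax = (total * Decimal('1.08')).quantize(Decimal('0.01'))
```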
Decimal objects
---------------
Decimal([value [, context]])~
Construct a new Decimal object based on {value}.
{value} can be an integer, string, tuple, float, or another Decimal
object. If no {value} is given, returns ``Decimal('0')``. If {value} is a
string, it should conform to the decimal numeric string syntax after leading
and trailing whitespace characters are removed:: >
sign ::= '+' | '-'
digit ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
indicator ::= 'e' | 'E'
digits ::= digit [digit]...
decimal-part ::= digits '.' [digits] | ['.'] digits
exponent-part ::= indicator [sign] digits
infinity ::= 'Infinity' | 'Inf'
nan ::= 'NaN' [digits] | 'sNaN' [digits]
numeric-value ::= decimal-part [exponent-part] | infinity
numeric-string ::= [sign] numeric-value | [sign] nan
<
If {value} is a unicode string then other Unicode decimal digits
are also permitted where ``digit`` appears above. These include
decimal digits from various other alphabets (for example,
Arabic-Indic and Devanāgarī digits) along with the fullwidth digits
``u'\uff10'`` through ``u'\uff19'``.
If {value} is a tuple, it should have three components, a sign
(0 for positive or 1 for negative), a tuple of
digits, and an integer exponent. For example, ``Decimal((0, (1, 4, 1, 4), -3))``
returns ``Decimal('1.414')``.
If {value} is a float, the binary floating point value is losslessly
converted to its exact decimal equivalent. This conversion can often require
53 or more digits of precision. For example, ``Decimal(float('1.1'))``
converts to
``Decimal('1.100000000000000088817841970012523233890533447265625')``.
The {context} precision does not affect how many digits are stored. That is
determined exclusively by the number of digits in {value}. For example,
``Decimal('3.00000')`` records all five zeros even if the context precision is
only three.
The purpose of the {context} argument is determining what to do if {value} is a
malformed string. If the context traps InvalidOperation, an exception
is raised; otherwise, the constructor returns a new Decimal with the value of
NaN.
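A sketch of the constructor behaviors just described. The context names
``small``, ``lenient`` and ``strict`` are hypothetical, chosen for this
example.

```python
from decimal import Context, Decimal, InvalidOperation

# a tuple of (sign, digits, exponent)
from_tuple = Decimal((0, (1, 4, 1, 4), -3))

# context precision does not limit the digits stored by the constructor
small = Context(prec=3)
stored = Decimal('3.00000', small)
# precision applies once an arithmetic operation is performed
rounded = small.plus(Decimal('3.14159'))

# malformed string, InvalidOperation not trapped: the result is NaN
lenient = Context(traps=[])
bad = Decimal('not a number', lenient)

# malformed string, InvalidOperation trapped: an exception is raised
strict = Context(traps=[InvalidOperation])
try:
    Decimal('not a number', strict)
    raised = False
except InvalidOperation:
    raised = True
```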
Once constructed, Decimal objects are immutable.
.. versionchanged:: 2.6
leading and trailing whitespace characters are permitted when
creating a Decimal instance from a string.
.. versionchanged:: 2.7
The argument to the constructor is now permitted to be a float instance.
Decimal floating point objects share many properties with the other built-in
numeric types such as float and int. All of the usual math
operations and special methods apply. Likewise, decimal objects can be
copied, pickled, printed, used as dictionary keys, used as set elements,
compared, sorted, and coerced to another type (such as float or
long).
Decimal objects cannot generally be combined with floats in
arithmetic operations: an attempt to add a Decimal to a
float, for example, will raise a TypeError.
There's one exception to this rule: it's possible to use Python's
comparison operators to compare a float instance ``x``
with a Decimal instance ``y``. Without this exception,
comparisons between Decimal and float instances
would follow the general rules for comparing objects of different
types described in the expressions section of the reference
manual, leading to confusing results.
.. versionchanged:: 2.7
A comparison between a float instance ``x`` and a
Decimal instance ``y`` now returns a result based on
the values of ``x`` and ``y``. In earlier versions ``x < y``
returned the same (arbitrary) result for any Decimal
instance ``x`` and any float instance ``y``.
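A short sketch of the mixing rules described above, assuming Python 2.7 or later. Arithmetic between a Decimal and a float is rejected, while comparisons use the exact values of both operands.

```python
from decimal import Decimal

d = Decimal('0.1')
try:
    d + 0.1
    mixed_ok = True
except TypeError:
    # Arithmetic between Decimal and float is not supported.
    mixed_ok = False

# Comparisons are value-based: the float 0.1 is slightly above 1/10.
less_than_float = d < 0.1
equal_half = Decimal('0.5') == 0.5
```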
In addition to the standard numeric properties, decimal floating point
objects also have a number of specialized methods:
adjusted()~
Return the adjusted exponent after shifting out the coefficient's
rightmost digits until only the lead digit remains:
``Decimal('321e+5').adjusted()`` returns seven. Used for determining the
position of the most significant digit with respect to the decimal point.
as_tuple()~
Return a named tuple representation of the number:
``DecimalTuple(sign, digits, exponent)``.
.. versionchanged:: 2.6
Use a named tuple.
canonical()~
Return the canonical encoding of the argument. Currently, the encoding of
a Decimal instance is always canonical, so this operation returns
its argument unchanged.
.. versionadded:: 2.6
compare(other[, context])~
Compare the values of two Decimal instances. This operation behaves in
the same way as the usual comparison method __cmp__, except that
compare returns a Decimal instance rather than an integer, and if
either operand is a NaN then the result is a NaN:: >
a or b is a NaN ==> Decimal('NaN')
a < b ==> Decimal('-1')
a == b ==> Decimal('0')
a > b ==> Decimal('1')
<
compare_signal(other[, context])~
This operation is identical to the compare method, except that all
NaNs signal. That is, if neither operand is a signaling NaN then any
quiet NaN operand is treated as though it were a signaling NaN.
.. versionadded:: 2.6
compare_total(other)~
Compare two operands using their abstract representation rather than their
numerical value. Similar to the compare method, but the result
gives a total ordering on Decimal instances. Two
Decimal instances with the same numeric value but different
representations compare unequal in this ordering:
>>> Decimal('12.0').compare_total(Decimal('12'))
Decimal('-1')
Quiet and signaling NaNs are also included in the total ordering. The
result of this function is ``Decimal('0')`` if both operands have the same
representation, ``Decimal('-1')`` if the first operand is lower in the
total order than the second, and ``Decimal('1')`` if the first operand is
higher in the total order than the second operand. See the specification
for details of the total order.
.. versionadded:: 2.6
compare_total_mag(other)~
Compare two operands using their abstract representation rather than their
value as in compare_total, but ignoring the sign of each operand.
``x.compare_total_mag(y)`` is equivalent to
``x.copy_abs().compare_total(y.copy_abs())``.
.. versionadded:: 2.6
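The difference between numeric comparison and the total ordering can be illustrated with a small sketch (the variable names are chosen only for this example):

```python
from decimal import Decimal

a, b = Decimal('12.0'), Decimal('12')

numeric = a.compare(b)          # same numeric value
total = a.compare_total(b)      # different representations
total_mag = Decimal('-12.0').compare_total_mag(b)   # signs ignored
nan_result = a.compare(Decimal('NaN'))              # quiet NaN propagates
```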
conjugate()~
Return self. This method exists only to comply with the Decimal
Specification.
.. versionadded:: 2.6
copy_abs()~
Return the absolute value of the argument. This operation is unaffected
by the context and is quiet: no flags are changed and no rounding is
performed.
.. versionadded:: 2.6
copy_negate()~
Return the negation of the argument. This operation is unaffected by the
context and is quiet: no flags are changed and no rounding is performed.
.. versionadded:: 2.6
copy_sign(other)~
Return a copy of the first operand with the sign set to be the same as the
sign of the second operand. For example:
>>> Decimal('2.3').copy_sign(Decimal('-1.5'))
Decimal('-2.3')
This operation is unaffected by the context and is quiet: no flags are
changed and no rounding is performed.
.. versionadded:: 2.6
exp([context])~
Return the value of the (natural) exponential function ``e**x`` at the
given number. The result is correctly rounded using the
ROUND_HALF_EVEN rounding mode.
>>> Decimal(1).exp()
Decimal('2.718281828459045235360287471')
>>> Decimal(321).exp()
Decimal('2.561702493119680037517373933E+139')
.. versionadded:: 2.6
from_float(f)~
Classmethod that converts a float to a decimal number, exactly.
Note `Decimal.from_float(0.1)` is not the same as `Decimal('0.1')`.
Since 0.1 is not exactly representable in binary floating point, the
value is stored as the nearest representable value which is
`0x1.999999999999ap-4`. That equivalent value in decimal is
`0.1000000000000000055511151231257827021181583404541015625`.
.. note:: From Python 2.7 onwards, a Decimal instance
can also be constructed directly from a float.
.. doctest:: >
>>> Decimal.from_float(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')
>>> Decimal.from_float(float('nan'))
Decimal('NaN')
>>> Decimal.from_float(float('inf'))
Decimal('Infinity')
>>> Decimal.from_float(float('-inf'))
Decimal('-Infinity')
<
.. versionadded:: 2.7
fma(other, third[, context])~
Fused multiply-add. Return self*other+third with no rounding of the
intermediate product self*other.
>>> Decimal(2).fma(3, 5)
Decimal('11')
.. versionadded:: 2.6
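The effect of skipping the intermediate rounding shows up at low precision. The sketch below uses a three-digit context, chosen only for illustration, and compares a separate multiply and add with the fused operation.

```python
from decimal import Decimal, Context

ctx = Context(prec=3)
x, y = Decimal('1.01'), Decimal('-1.02')

# The product 1.0201 is rounded to 1.02 before the addition.
separate = ctx.add(ctx.multiply(x, x), y)

# fma keeps the exact product 1.0201 until the final rounding.
fused = ctx.fma(x, x, y)
```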
is_canonical()~
Return True if the argument is canonical and False
otherwise. Currently, a Decimal instance is always canonical, so
this operation always returns True.
.. versionadded:: 2.6
is_finite()~
Return True if the argument is a finite number, and
False if the argument is an infinity or a NaN.
.. versionadded:: 2.6
is_infinite()~
Return True if the argument is either positive or negative
infinity and False otherwise.
.. versionadded:: 2.6
is_nan()~
Return True if the argument is a (quiet or signaling) NaN and
False otherwise.
.. versionadded:: 2.6
is_normal()~
Return True if the argument is a {normal} finite non-zero
number with an adjusted exponent greater than or equal to {Emin}.
Return False if the argument is zero, subnormal, infinite or a
NaN. Note that the term {normal} is used here in a different sense from
the normalize method, which is used to create canonical values.
.. versionadded:: 2.6
is_qnan()~
Return True if the argument is a quiet NaN, and
False otherwise.
.. versionadded:: 2.6
is_signed()~
Return True if the argument has a negative sign and
False otherwise. Note that zeros and NaNs can both carry signs.
.. versionadded:: 2.6
is_snan()~
Return True if the argument is a signaling NaN and False
otherwise.
.. versionadded:: 2.6
is_subnormal()~
Return True if the argument is subnormal, and False
otherwise. A number is subnormal if it is nonzero, finite, and has an
adjusted exponent less than {Emin}.
.. versionadded:: 2.6
is_zero()~
Return True if the argument is a (positive or negative) zero and
False otherwise.
.. versionadded:: 2.6
ln([context])~
Return the natural (base e) logarithm of the operand. The result is
correctly rounded using the ROUND_HALF_EVEN rounding mode.
.. versionadded:: 2.6
log10([context])~
Return the base ten logarithm of the operand. The result is correctly
rounded using the ROUND_HALF_EVEN rounding mode.
.. versionadded:: 2.6
logb([context])~
For a nonzero number, return the adjusted exponent of its operand as a
Decimal instance. If the operand is a zero then
``Decimal('-Infinity')`` is returned and the DivisionByZero flag
is raised. If the operand is an infinity then ``Decimal('Infinity')`` is
returned.
.. versionadded:: 2.6
logical_and(other[, context])~
logical_and is a logical operation which takes two *logical
operands* (see the Logical operands section). The result is the
digit-wise ``and`` of the two operands.
.. versionadded:: 2.6
logical_invert([context])~
logical_invert is a logical operation. The
result is the digit-wise inversion of the operand.
.. versionadded:: 2.6
logical_or(other[, context])~
logical_or is a logical operation which takes two *logical
operands* (see the Logical operands section). The result is the
digit-wise ``or`` of the two operands.
.. versionadded:: 2.6
logical_xor(other[, context])~
logical_xor is a logical operation which takes two *logical
operands* (see the Logical operands section). The result is the
digit-wise exclusive or of the two operands.
.. versionadded:: 2.6
max(other[, context])~
Like ``max(self, other)`` except that the context rounding rule is applied
before returning and that NaN values are either signaled or
ignored (depending on the context and whether they are signaling or
quiet).
max_mag(other[, context])~
Similar to the max method, but the comparison is done using the
absolute values of the operands.
.. versionadded:: 2.6
min(other[, context])~
Like ``min(self, other)`` except that the context rounding rule is applied
before returning and that NaN values are either signaled or
ignored (depending on the context and whether they are signaling or
quiet).
min_mag(other[, context])~
Similar to the min method, but the comparison is done using the
absolute values of the operands.
.. versionadded:: 2.6
next_minus([context])~
Return the largest number representable in the given context (or in the
current thread's context if no context is given) that is smaller than the
given operand.
.. versionadded:: 2.6
next_plus([context])~
Return the smallest number representable in the given context (or in the
current thread's context if no context is given) that is larger than the
given operand.
.. versionadded:: 2.6
next_toward(other[, context])~
If the two operands are unequal, return the number closest to the first
operand in the direction of the second operand. If both operands are
numerically equal, return a copy of the first operand with the sign set to
be the same as the sign of the second operand.
.. versionadded:: 2.6
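As a rough illustration of the three next_* methods, the sketch below uses a three-digit context (an arbitrary choice for the example) so that neighbouring representable values are easy to read.

```python
from decimal import Decimal, Context

ctx = Context(prec=3)
one = Decimal(1)

up = one.next_plus(context=ctx)        # smallest value above 1
down = one.next_minus(context=ctx)     # largest value below 1
toward = one.next_toward(Decimal(2), context=ctx)
```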
normalize([context])~
Normalize the number by stripping the rightmost trailing zeros and
converting any result equal to Decimal('0') to
Decimal('0e0'). Used for producing canonical values for members
of an equivalence class. For example, ``Decimal('32.100')`` and
``Decimal('0.321000e+2')`` both normalize to the equivalent value
``Decimal('32.1')``.
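A brief sketch of normalize, assuming the default context. Note that stripping trailing zeros from an integer value can produce a positive exponent.

```python
from decimal import Decimal

same_a = str(Decimal('32.100').normalize())
same_b = str(Decimal('0.321000e+2').normalize())

# Trailing zeros of an integer move into the exponent.
compact = str(Decimal('1200').normalize())

# Any zero collapses to a plain 0.
zero = str(Decimal('0.000').normalize())
```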
number_class([context])~
Return a string describing the {class} of the operand. The returned value
is one of the following ten strings.
* ``"-Infinity"``, indicating that the operand is negative infinity.
* ``"-Normal"``, indicating that the operand is a negative normal number.
* ``"-Subnormal"``, indicating that the operand is negative and subnormal.
* ``"-Zero"``, indicating that the operand is a negative zero.
* ``"+Zero"``, indicating that the operand is a positive zero.
* ``"+Subnormal"``, indicating that the operand is positive and subnormal.
* ``"+Normal"``, indicating that the operand is a positive normal number.
* ``"+Infinity"``, indicating that the operand is positive infinity.
* ``"NaN"``, indicating that the operand is a quiet NaN (Not a Number).
* ``"sNaN"``, indicating that the operand is a signaling NaN.
.. versionadded:: 2.6
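The ten classes can be demonstrated with a deliberately narrow context. The ``Emin`` and ``Emax`` values below are assumptions picked so that ``1E-12`` counts as subnormal.

```python
from decimal import Decimal, Context

ctx = Context(prec=5, Emin=-10, Emax=10)
samples = ('-Infinity', '-1', '-1E-12', '-0', '0',
           '1E-12', '1', 'Infinity', 'NaN', 'sNaN')

# Classify each sample value relative to the narrow context.
classes = [Decimal(s).number_class(context=ctx) for s in samples]
```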
quantize(exp[, rounding[, context[, watchexp]]])~
Return a value equal to the first operand after rounding and having the
exponent of the second operand.
>>> Decimal('1.41421356').quantize(Decimal('1.000'))
Decimal('1.414')
Unlike other operations, if the length of the coefficient after the
quantize operation would be greater than precision, then an
InvalidOperation is signaled. This guarantees that, unless there
is an error condition, the quantized exponent is always equal to that of
the right-hand operand.
Also unlike other operations, quantize never signals Underflow, even if
the result is subnormal and inexact.
If the exponent of the second operand is larger than that of the first
then rounding may be necessary. In this case, the rounding mode is
determined by the ``rounding`` argument if given, else by the given
``context`` argument; if neither argument is given the rounding mode of
the current thread's context is used.
If {watchexp} is set (default), then an error is returned whenever the
resulting exponent is greater than Emax or less than
Etiny.
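The rounding argument and the precision check described above can be sketched as follows. The names ``price`` and ``small`` are illustrative only.

```python
from decimal import (Decimal, Context, InvalidOperation,
                     ROUND_DOWN, ROUND_HALF_UP)

price = Decimal('2.675')
half_up = price.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)
truncated = price.quantize(Decimal('0.01'), rounding=ROUND_DOWN)

# The quantized coefficient 12346 would need five digits, which
# exceeds the three-digit precision of this context.
small = Context(prec=3, traps=[InvalidOperation])
try:
    Decimal('123.456').quantize(Decimal('0.01'), context=small)
    too_long = False
except InvalidOperation:
    too_long = True
```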
radix()~
Return ``Decimal(10)``, the radix (base) in which the Decimal
class does all its arithmetic. Included for compatibility with the
specification.
.. versionadded:: 2.6
remainder_near(other[, context])~
Compute the modulo as either a positive or negative value depending on
which is closest to zero. For instance, ``Decimal(10).remainder_near(6)``
returns ``Decimal('-2')`` which is closer to zero than ``Decimal('4')``.
If both are equally close, the one chosen is the one for which the
integer quotient is even.
rotate(other[, context])~
Return the result of rotating the digits of the first operand by an amount
specified by the second operand. The second operand must be an integer in
the range -precision through precision. The absolute value of the second
operand gives the number of places to rotate. If the second operand is
positive then rotation is to the left; otherwise rotation is to the right.
The coefficient of the first operand is padded on the left with zeros to
length precision if necessary. The sign and exponent of the first operand
are unchanged.
.. versionadded:: 2.6
same_quantum(other[, context])~
Test whether self and other have the same exponent or whether both are
NaN.
scaleb(other[, context])~
Return the first operand with exponent adjusted by the second.
Equivalently, return the first operand multiplied by ``10**other``. The
second operand must be an integer.
.. versionadded:: 2.6
shift(other[, context])~
Return the result of shifting the digits of the first operand by an amount
specified by the second operand. The second operand must be an integer in
the range -precision through precision. The absolute value of the second
operand gives the number of places to shift. If the second operand is
positive then the shift is to the left; otherwise the shift is to the
right. Digits shifted into the coefficient are zeros. The sign and
exponent of the first operand are unchanged.
.. versionadded:: 2.6
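The rotate, scaleb and shift methods are easiest to follow with a coefficient that fills the whole precision. The sketch below assumes a nine-digit context and calls the corresponding Context methods.

```python
from decimal import Decimal, Context

ctx = Context(prec=9)
x = Decimal('123456789')

rotated_left = ctx.rotate(x, Decimal(2))     # digits wrap around
rotated_right = ctx.rotate(x, Decimal(-2))
shifted_left = ctx.shift(x, Decimal(2))      # zeros are shifted in
shifted_right = ctx.shift(x, Decimal(-2))

# scaleb only adjusts the exponent; the coefficient is untouched.
scaled = str(ctx.scaleb(Decimal('7.50'), Decimal(3)))
```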
sqrt([context])~
Return the square root of the argument to full precision.
to_eng_string([context])~
Convert to an engineering-type string.
Engineering notation has an exponent which is a multiple of 3, so there
are up to 3 digits left of the decimal place. For example,
``Decimal('123E+1').to_eng_string()`` returns ``'1.23E+3'``.
to_integral([rounding[, context]])~
Identical to the to_integral_value method. The ``to_integral``
name has been kept for compatibility with older versions.
to_integral_exact([rounding[, context]])~
Round to the nearest integer, signaling Inexact or
Rounded as appropriate if rounding occurs. The rounding mode is
determined by the ``rounding`` parameter if given, else by the given
``context``. If neither parameter is given then the rounding mode of the
current context is used.
.. versionadded:: 2.6
to_integral_value([rounding[, context]])~
Round to the nearest integer without signaling Inexact or
Rounded. If given, applies {rounding}; otherwise, uses the
rounding method in either the supplied {context} or the current context.
.. versionchanged:: 2.6
renamed from ``to_integral`` to ``to_integral_value``. The old name
remains valid for compatibility.
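The two rounding methods differ only in signaling. A minimal sketch, using an explicit context with no traps so that the flags can be inspected:

```python
from decimal import (Decimal, Context, Inexact,
                     ROUND_HALF_EVEN, ROUND_HALF_UP)

ctx = Context(rounding=ROUND_HALF_EVEN, traps=[])
x = Decimal('2.5')

quiet = x.to_integral_value(context=ctx)
quiet_flag = bool(ctx.flags[Inexact])     # no signal raised

exact = x.to_integral_exact(context=ctx)
exact_flag = bool(ctx.flags[Inexact])     # Inexact is now flagged

# An explicit rounding argument overrides the context rounding.
half_up = x.to_integral_value(rounding=ROUND_HALF_UP)
```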
Logical operands
^^^^^^^^^^^^^^^^
The logical_and, logical_invert, logical_or,
and logical_xor methods expect their arguments to be *logical
operands*. A *logical operand* is a Decimal instance whose
exponent and sign are both zero, and whose digits are all either
0 or 1.
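A small sketch of the four logical operations on logical operands. The four-digit context is an assumption made so that the inversion result is short; logical_invert works on as many digits as the context precision.

```python
from decimal import Decimal, Context

ctx = Context(prec=4)
a, b = Decimal('1100'), Decimal('1010')

both = ctx.logical_and(a, b)
either = ctx.logical_or(a, b)
differ = ctx.logical_xor(a, b)
inverted = ctx.logical_invert(a)    # 1100 -> 0011 within 4 digits
```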
Context objects
---------------
Contexts are environments for arithmetic operations. They govern precision, set
rules for rounding, determine which signals are treated as exceptions, and limit
the range for exponents.
Each thread has its own current context which is accessed or changed using the
getcontext and setcontext functions:
getcontext()~
Return the current context for the active thread.
setcontext(c)~
Set the current context for the active thread to {c}.
Beginning with Python 2.5, you can also use the with statement and
the localcontext function to temporarily change the active context.
localcontext([c])~
Return a context manager that will set the current context for the active thread
to a copy of {c} on entry to the with-statement and restore the previous context
when exiting the with-statement. If no context is specified, a copy of the
current context is used.
.. versionadded:: 2.5
For example, the following code sets the current decimal precision to 42 places,
performs a calculation, and then automatically restores the previous context:: >
from decimal import localcontext
with localcontext() as ctx:
    ctx.prec = 42   # Perform a high precision calculation
    s = calculate_something()
s = +s # Round the final result back to the default precision
<
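A runnable variant of the same pattern, with a concrete division standing in for the placeholder calculation. It assumes the surrounding precision is lower than 42, as it is with the default of 28.

```python
from decimal import Decimal, getcontext, localcontext

outer_prec = getcontext().prec

with localcontext() as ctx:
    ctx.prec = 42
    s = Decimal(1) / Decimal(7)      # computed to 42 digits

inside_digits = len(s.as_tuple().digits)

# Unary plus rounds to the restored (outer) precision.
s = +s
outside_digits = len(s.as_tuple().digits)
```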
New contexts can also be created using the Context constructor
described below. In addition, the module provides three pre-made contexts:
BasicContext~
This is a standard context defined by the General Decimal Arithmetic
Specification. Precision is set to nine. Rounding is set to
ROUND_HALF_UP. All flags are cleared. All traps are enabled (treated
as exceptions) except Inexact, Rounded, and
Subnormal.
Because many of the traps are enabled, this context is useful for debugging.
ExtendedContext~
This is a standard context defined by the General Decimal Arithmetic
Specification. Precision is set to nine. Rounding is set to
ROUND_HALF_EVEN. All flags are cleared. No traps are enabled (so that
exceptions are not raised during computations).
Because the traps are disabled, this context is useful for applications that
prefer to have a result value of NaN or Infinity instead of
raising exceptions. This allows an application to complete a run in the
presence of conditions that would otherwise halt the program.
DefaultContext~
This context is used by the Context constructor as a prototype for new
contexts. Changing a field (such as precision) has the effect of changing the
default for new contexts created by the Context constructor.
This context is most useful in multi-threaded environments. Changing one of the
fields before threads are started has the effect of setting system-wide
defaults. Changing the fields after threads have started is not recommended as
it would require thread synchronization to prevent race conditions.
In single threaded environments, it is preferable to not use this context at
all. Instead, simply create contexts explicitly as described below.
The default values are precision=28, rounding=ROUND_HALF_EVEN, and enabled traps
for Overflow, InvalidOperation, and DivisionByZero.
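The practical difference between BasicContext and ExtendedContext shows up on a division by zero. The sketch works on copies so that the module-level contexts keep their flags clear; the names ``strict`` and ``lenient`` are illustrative.

```python
from decimal import Decimal, BasicContext, ExtendedContext, DivisionByZero

strict = BasicContext.copy()
lenient = ExtendedContext.copy()

try:
    strict.divide(Decimal(1), Decimal(0))
    raised = False
except DivisionByZero:
    raised = True

# With traps disabled the result is an infinity and a flag is set.
result = lenient.divide(Decimal(1), Decimal(0))
flagged = bool(lenient.flags[DivisionByZero])
```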
In addition to the three supplied contexts, new contexts can be created with the
Context constructor.
Context(prec=None, rounding=None, traps=None, flags=None, Emin=None, Emax=None, capitals=1)~
Creates a new context. If a field is not specified or is None, the
default values are copied from the DefaultContext. If the {flags}
field is not specified or is None, all flags are cleared.
The {prec} field is a positive integer that sets the precision for arithmetic
operations in the context.
The {rounding} option is one of:
* ROUND_CEILING (towards Infinity),
* ROUND_DOWN (towards zero),
* ROUND_FLOOR (towards -Infinity),
* ROUND_HALF_DOWN (to nearest with ties going towards zero),
* ROUND_HALF_EVEN (to nearest with ties going to nearest even integer),
* ROUND_HALF_UP (to nearest with ties going away from zero),
* ROUND_UP (away from zero), or
* ROUND_05UP (away from zero if last digit after rounding towards zero
would have been 0 or 5; otherwise towards zero).
The {traps} and {flags} fields list any signals to be set. Generally, new
contexts should only set traps and leave the flags clear.
The {Emin} and {Emax} fields are integers specifying the outer limits allowable
for exponents.
The {capitals} field is either 0 or 1 (the default). If set to
1, exponents are printed with a capital E; otherwise, a
lowercase e is used: Decimal('6.02e+23').
.. versionchanged:: 2.6
The ROUND_05UP rounding mode was added.
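To compare the rounding options, the sketch below rounds the same tie value to three digits under each mode. The input ``2.665`` is an arbitrary choice for the example.

```python
from decimal import (Decimal, Context, ROUND_CEILING, ROUND_DOWN,
                     ROUND_FLOOR, ROUND_HALF_DOWN, ROUND_HALF_EVEN,
                     ROUND_HALF_UP, ROUND_UP, ROUND_05UP)

modes = [ROUND_CEILING, ROUND_DOWN, ROUND_FLOOR, ROUND_HALF_DOWN,
         ROUND_HALF_EVEN, ROUND_HALF_UP, ROUND_UP, ROUND_05UP]

# create_decimal applies the context precision and rounding mode.
results = [str(Context(prec=3, rounding=m).create_decimal('2.665'))
           for m in modes]

# capitals=0 selects a lowercase exponent marker.
lower = Context(capitals=0).to_sci_string(Decimal('6.02E+23'))
```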
The Context class defines several general purpose methods as well as
a large number of methods for doing arithmetic directly in a given context.
In addition, for each of the Decimal methods described above (with
the exception of the adjusted and as_tuple methods) there is
a corresponding Context method. For example, for a Context
instance ``C`` and Decimal instance ``x``, ``C.exp(x)`` is
equivalent to ``x.exp(context=C)``. Each Context method accepts a
Python integer (an instance of int or long) anywhere that a
Decimal instance is accepted.
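The equivalence between Decimal methods and Context methods can be checked directly. A minimal sketch, with ``C`` and ``x`` named as in the paragraph above:

```python
from decimal import Decimal, Context

C = Context(prec=5)
x = Decimal('1.5')

via_context = C.exp(x)
via_method = x.exp(context=C)

# Context methods also accept plain Python integers.
total = C.add(1, 2)
```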
clear_flags()~
Resets all of the flags to 0.
copy()~
Return a duplicate of the context.
copy_decimal(num)~
Return a copy of the Decimal instance num.
create_decimal(num)~
Creates a new Decimal instance from {num} but using {self} as
context. Unlike the Decimal constructor, the context precision,
rounding method, flags, and traps are applied to the conversion.
This is useful because constants are often given to a greater precision
than is needed by the application. Another benefit is that rounding
immediately eliminates unintended effects from digits beyond the current
precision. In the following example, using unrounded inputs means that
adding zero to a sum can change the result:
.. doctest:: newcontext
>>> getcontext().prec = 3
>>> Decimal('3.4445') + Decimal('1.0023')
Decimal('4.45')
>>> Decimal('3.4445') + Decimal(0) + Decimal('1.0023')
Decimal('4.44')
This method implements the to-number operation of the IBM specification.
If the argument is a string, no leading or trailing whitespace is
permitted.
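For contrast with the example above, the following sketch rounds the inputs with create_decimal first. Under that assumption, adding zero no longer changes the sum.

```python
from decimal import Decimal, Context

ctx = Context(prec=3)
a = ctx.create_decimal('3.4445')    # rounded on creation
b = ctx.create_decimal('1.0023')

direct = ctx.add(a, b)
via_zero = ctx.add(ctx.add(a, Decimal(0)), b)
```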
create_decimal_from_float(f)~
Creates a new Decimal instance from a float {f} but rounding using {self}
as the context. Unlike the Decimal.from_float class method,
the context precision, rounding method, flags, and traps are applied to
the conversion.
.. doctest:: >
>>> context = Context(prec=5, rounding=ROUND_DOWN)
>>> context.create_decimal_from_float(math.pi)
Decimal('3.1415')
>>> context = Context(prec=5, traps=[Inexact])
>>> context.create_decimal_from_float(math.pi)
Traceback (most recent call last):
...
Inexact: None
<
.. versionadded:: 2.7
Etiny()~
Returns a value equal to ``Emin - prec + 1`` which is the minimum exponent
value for subnormal results. When underflow occurs, the exponent is set
to Etiny.
Etop()~
Returns a value equal to ``Emax - prec + 1``.
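Both formulas can be checked on a small context. The limits below are arbitrary values chosen for the illustration.

```python
from decimal import Context

ctx = Context(prec=5, Emin=-10, Emax=10)

etiny = ctx.Etiny()    # Emin - prec + 1
etop = ctx.Etop()      # Emax - prec + 1
```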
The usual approach to working with decimals is to create Decimal
instances and then apply arithmetic operations which take place within the
current context for the active thread. An alternative approach is to use
context methods for calculating within a specific context. The methods are
similar to those for the Decimal class and are only briefly
recounted here.
abs(x)~
Returns the absolute value of {x}.
add(x, y)~
Return the sum of {x} and {y}.
canonical(x)~
Returns the same Decimal object {x}.
compare(x, y)~
Compares {x} and {y} numerically.
compare_signal(x, y)~
Compares the values of the two operands numerically.
compare_total(x, y)~
Compares two operands using their abstract representation.
compare_total_mag(x, y)~
Compares two operands using their abstract representation, ignoring sign.
copy_abs(x)~
Returns a copy of {x} with the sign set to 0.
copy_negate(x)~
Returns a copy of {x} with the sign inverted.
copy_sign(x, y)~
Copies the sign from {y} to {x}.
divide(x, y)~
Return {x} divided by {y}.
divide_int(x, y)~
Return {x} divided by {y}, truncated to an integer.
divmod(x, y)~
Divides two numbers and returns the integer part of the result together
with the remainder, as a pair.
exp(x)~
Returns `e ** x`.
fma(x, y, z)~
Returns {x} multiplied by {y}, plus {z}.
is_canonical(x)~
Returns True if {x} is canonical; otherwise returns False.
is_finite(x)~
Returns True if {x} is finite; otherwise returns False.
is_infinite(x)~
Returns True if {x} is infinite; otherwise returns False.
is_nan(x)~
Returns True if {x} is a qNaN or sNaN; otherwise returns False.
is_normal(x)~
Returns True if {x} is a normal number; otherwise returns False.
is_qnan(x)~
Returns True if {x} is a quiet NaN; otherwise returns False.
is_signed(x)~
Returns True if {x} is negative; otherwise returns False.
is_snan(x)~
Returns True if {x} is a signaling NaN; otherwise returns False.
is_subnormal(x)~
Returns True if {x} is subnormal; otherwise returns False.
is_zero(x)~
Returns True if {x} is a zero; otherwise returns False.
ln(x)~
Returns the natural (base e) logarithm of {x}.
log10(x)~
Returns the base 10 logarithm of {x}.
logb(x)~
Returns the exponent of the magnitude of the operand's MSD.
logical_and(x, y)~
Applies the logical operation {and} between each operand's digits.
logical_invert(x)~
Invert all the digits in {x}.
logical_or(x, y)~
Applies the logical operation {or} between each operand's digits.
logical_xor(x, y)~
Applies the logical operation {xor} between each operand's digits.
max(x, y)~
Compares two values numerically and returns the maximum.
max_mag(x, y)~
Compares the values numerically with their sign ignored.
min(x, y)~
Compares two values numerically and returns the minimum.
min_mag(x, y)~
Compares the values numerically with their sign ignored.
minus(x)~
Minus corresponds to the unary prefix minus operator in Python.
multiply(x, y)~
Return the product of {x} and {y}.
next_minus(x)~
Returns the largest representable number smaller than {x}.
next_plus(x)~
Returns the smallest representable number larger than {x}.
next_toward(x, y)~
Returns the number closest to {x}, in direction towards {y}.
normalize(x)~
Reduces {x} to its simplest form.
number_class(x)~
Returns an indication of the class of {x}.
plus(x)~
Plus corresponds to the unary prefix plus operator in Python. This
operation applies the context precision and rounding, so it is {not} an
identity operation.
power(x, y[, modulo])~
Return ``x`` to the power of ``y``, reduced modulo ``modulo`` if given.
With two arguments, compute ``x**y``. If ``x`` is negative then ``y``
must be integral. The result will be inexact unless ``y`` is integral and
the result is finite and can be expressed exactly in 'precision' digits.
The result should always be correctly rounded, using the rounding mode of
the current thread's context.
With three arguments, compute ``(x**y) % modulo``. For the three argument
form, the following restrictions on the arguments hold:
- all three arguments must be integral
- ``y`` must be nonnegative
- at least one of ``x`` or ``y`` must be nonzero
- ``modulo`` must be nonzero and have at most 'precision' digits
The value resulting from ``Context.power(x, y, modulo)`` is
equal to the value that would be obtained by computing ``(x**y)
% modulo`` with unbounded precision, but is computed more
efficiently. The exponent of the result is zero, regardless of
the exponents of ``x``, ``y`` and ``modulo``. The result is
always exact.
.. versionchanged:: 2.6
``y`` may now be nonintegral in ``x**y``.
Stricter requirements for the three-argument version.
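A short sketch of the two- and three-argument forms of power, using a nine-digit context chosen for the example:

```python
from decimal import Decimal, Context

ctx = Context(prec=9)

# Two arguments: ordinary exponentiation.
two_arg = ctx.power(Decimal(2), Decimal(10))

# Three arguments: modular exponentiation on integral operands.
# 3**200 is far too large for nine digits, yet the result is exact.
three_arg = ctx.power(Decimal(3), Decimal(200), Decimal(7))
```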
quantize(x, y)~
Returns a value equal to {x} (rounded), having the exponent of {y}.
radix()~
Just returns 10, as this is Decimal, :)
remainder(x, y)~
Returns the remainder from integer division.
The sign of the result, if non-zero, is the same as that of the original
dividend.
remainder_near(x, y)~
Returns ``x - y * n``, where *n* is the integer nearest the exact value
of ``x / y`` (if the result is 0 then its sign will be the sign of {x}).
rotate(x, y)~
Returns a rotated copy of {x}, {y} times.
same_quantum(x, y)~
Returns True if the two operands have the same exponent.
scaleb(x, y)~
Returns the first operand after adding the second value to its exponent.
shift(x, y)~
Returns a shifted copy of {x}, {y} times.
sqrt(x)~
Square root of a non-negative number to context precision.
subtract(x, y)~
Return the difference between {x} and {y}.
to_eng_string(x)~
Converts a number to a string, using engineering notation.
to_integral_exact(x)~
Rounds to an integer.
to_sci_string(x)~
Converts a number to a string using scientific notation.
Signals
-------
Signals represent conditions that arise during computation. Each corresponds to
one context flag and one context trap enabler.
The context flag is set whenever the condition is encountered. After the
computation, flags may be checked for informational purposes (for instance, to
determine whether a computation was exact). After checking the flags, be sure to
clear all flags before starting the next computation.
If the context's trap enabler is set for the signal, then the condition causes a
Python exception to be raised. For example, if the DivisionByZero trap
is set, then a DivisionByZero exception is raised upon encountering the
condition.
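The relationship between flags and traps can be sketched with the Inexact signal. The context below starts with no traps enabled, which is an assumption made so that the flag can be observed first.

```python
from decimal import Decimal, Context, Inexact

ctx = Context(prec=3, traps=[])

ctx.divide(Decimal(1), Decimal(3))       # 0.333, digits discarded
flag_after_third = bool(ctx.flags[Inexact])

ctx.clear_flags()
ctx.divide(Decimal(1), Decimal(4))       # 0.25, exact
flag_after_quarter = bool(ctx.flags[Inexact])

# Enabling the trap turns the same condition into an exception.
ctx.traps[Inexact] = True
try:
    ctx.divide(Decimal(1), Decimal(3))
    trapped = False
except Inexact:
    trapped = True
```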
Clamped~
Altered an exponent to fit representation constraints.
Typically, clamping occurs when an exponent falls outside the context's
Emin and Emax limits. If possible, the exponent is reduced to
fit by adding zeros to the coefficient.
DecimalException~
Base class for other signals and a subclass of ArithmeticError.
DivisionByZero~
Signals the division of a non-infinite number by zero.
Can occur with division, modulo division, or when raising a number to a negative
power. If this signal is not trapped, returns Infinity or
-Infinity with the sign determined by the inputs to the calculation.
Inexact~
Indicates that rounding occurred and the result is not exact.
Signals when non-zero digits were discarded during rounding. The rounded result
is returned. The signal flag or trap is used to detect when results are
inexact.
InvalidOperation~
An invalid operation was performed.
Indicates that an operation was requested that does not make sense. If not
trapped, returns NaN. Possible causes include:: >
Infinity - Infinity
0 * Infinity
Infinity / Infinity
x % 0
Infinity % x
x._rescale( non-integer )
sqrt(-x) and x > 0
0 ** 0
x ** (non-integer)
x ** Infinity
<
Overflow~
Numerical overflow.
Indicates the exponent is larger than Emax after rounding has
occurred. If not trapped, the result depends on the rounding mode, either
pulling inward to the largest representable finite number or rounding outward
to Infinity. In either case, Inexact and Rounded
are also signaled.
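How the untrapped Overflow result depends on the rounding mode can be seen in a sketch with a deliberately small exponent range. The limits and the helper name ``overflow_result`` are assumptions for the example.

```python
from decimal import (Decimal, Context, Overflow, Inexact, Rounded,
                     ROUND_DOWN, ROUND_HALF_EVEN)

def overflow_result(rounding):
    """Multiply past Emax in a tiny context; return result and context."""
    ctx = Context(prec=3, Emin=-5, Emax=5, rounding=rounding, traps=[])
    return ctx.multiply(Decimal('9E+5'), Decimal(10)), ctx

outward, ctx_even = overflow_result(ROUND_HALF_EVEN)   # rounds to Infinity
inward, ctx_down = overflow_result(ROUND_DOWN)         # largest finite value

signals = [bool(ctx_even.flags[s]) for s in (Overflow, Inexact, Rounded)]
```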
Rounded~
Rounding occurred though possibly no information was lost.
Signaled whenever rounding discards digits; even if those digits are zero
(such as rounding 5.00 to 5.0). If not trapped, returns
the result unchanged. This signal is used to detect loss of significant
digits.
Subnormal~
Exponent was lower than Emin prior to rounding.
Occurs when an operation result is subnormal (the exponent is too small). If
not trapped, returns the result unchanged.
Underflow~
Numerical underflow with result rounded to zero.
Occurs when a subnormal result is pushed to zero by rounding. Inexact
and Subnormal are also signaled.
The following table summarizes the hierarchy of signals:: >
exceptions.ArithmeticError(exceptions.StandardError)
DecimalException
Clamped
DivisionByZero(DecimalException, exceptions.ZeroDivisionError)
Inexact
Overflow(Inexact, Rounded)
Underflow(Inexact, Rounded, Subnormal)
InvalidOperation
Rounded
Subnormal
<
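The hierarchy can be verified with issubclass, and it also means that a trapped signal can be caught through its built-in base class. A minimal sketch:

```python
from decimal import (Decimal, Context, DecimalException, DivisionByZero,
                     Inexact, Overflow, Rounded, Subnormal, Underflow)

checks = [
    issubclass(DecimalException, ArithmeticError),
    issubclass(DivisionByZero, ZeroDivisionError),
    issubclass(Overflow, Inexact) and issubclass(Overflow, Rounded),
    issubclass(Underflow, Subnormal),
]

# A trapped DivisionByZero can be caught as ZeroDivisionError.
ctx = Context(traps=[DivisionByZero])
try:
    ctx.divide(Decimal(1), Decimal(0))
    caught = False
except ZeroDivisionError:
    caught = True
```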
Floating Point Notes
--------------------
Mitigating round-off error with increased precision
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The use of decimal floating point eliminates decimal representation error
(making it possible to represent 0.1 exactly); however, some operations
can still incur round-off error when non-zero digits exceed the fixed precision.
The effects of round-off error can be amplified by the addition or subtraction
of nearly offsetting quantities resulting in loss of significance. Knuth
provides two instructive examples where rounded floating point arithmetic with
insufficient precision causes the breakdown of the associative and distributive
properties of addition:
.. doctest:: newcontext
# Examples from Seminumerical Algorithms, Section 4.2.2.
>>> from decimal import Decimal, getcontext
>>> getcontext().prec = 8
>>> u, v, w = Decimal(11111113), Decimal(-11111111), Decimal('7.51111111')
>>> (u + v) + w
Decimal('9.5111111')
>>> u + (v + w)
Decimal('10')
>>> u, v, w = Decimal(20000), Decimal(-6), Decimal('6.0000003')
>>> (u*v) + (u*w)
Decimal('0.01')
>>> u * (v+w)
Decimal('0.0060000')
The decimal (|py2stdlib-decimal|) module makes it possible to restore the identities by
expanding the precision sufficiently to avoid loss of significance:
>>> getcontext().prec = 20
>>> u, v, w = Decimal(11111113), Decimal(-11111111), Decimal('7.51111111')
>>> (u + v) + w
Decimal('9.51111111')
>>> u + (v + w)
Decimal('9.51111111')
>>>
>>> u, v, w = Decimal(20000), Decimal(-6), Decimal('6.0000003')
>>> (u*v) + (u*w)
Decimal('0.0060000')
>>> u * (v+w)
Decimal('0.0060000')
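Assigning to ``getcontext().prec`` changes the precision for the rest of the
thread. The sketch below assumes that the localcontext() context manager from
the same module is acceptable for confining the change to one block; the
helper name ``add_both_ways`` is hypothetical and not part of the module.

```python
from decimal import Decimal, localcontext

def add_both_ways(u, v, w, prec):
    """Return ((u + v) + w, u + (v + w)) computed at the given precision."""
    with localcontext() as ctx:
        ctx.prec = prec          # affects only this block
        return (u + v) + w, u + (v + w)

u, v, w = Decimal(11111113), Decimal(-11111111), Decimal('7.51111111')

low = add_both_ways(u, v, w, 8)     # the two groupings disagree
high = add_both_ways(u, v, w, 20)   # the two groupings agree
```

With eight digits the two groupings differ, as in the first example above;
with twenty digits both yield Decimal('9.51111111').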
Special values
^^^^^^^^^^^^^^
The number system for the decimal (|py2stdlib-decimal|) module provides special values
including NaN, sNaN, -Infinity, Infinity,
and two zeros, +0 and -0.
Infinities can be constructed directly with: ``Decimal('Infinity')``. Also,
they can arise from dividing by zero when the DivisionByZero signal is
not trapped. Likewise, when the Overflow signal is not trapped, infinity
can result from rounding beyond the limits of the largest representable number.
The infinities are signed (affine) and can be used in arithmetic operations
where they get treated as very large, indeterminate numbers. For instance,
adding a constant to infinity gives another infinite result.
Some operations are indeterminate and return NaN, or if the
InvalidOperation signal is trapped, raise an exception. For example,
``0/0`` returns NaN which means "not a number". This variety of
NaN is quiet and, once created, will flow through other computations
always resulting in another NaN. This behavior can be useful for a
series of computations that occasionally have missing inputs --- it allows the
calculation to proceed while flagging specific results as invalid.
A variant is sNaN which signals rather than remaining quiet after every
operation. This is a useful return value when an invalid result needs to
interrupt a calculation for special handling.
The behavior of Python's comparison operators can be a little surprising where a
NaN is involved. A test for equality where one of the operands is a
quiet or signaling NaN always returns False (even when doing
``Decimal('NaN')==Decimal('NaN')``), while a test for inequality always returns
True. An attempt to compare two Decimals using any of the ``<``,
``<=``, ``>`` or ``>=`` operators will raise the InvalidOperation signal
if either operand is a NaN, and return False if this signal is
not trapped. Note that the General Decimal Arithmetic specification does not
specify the behavior of direct comparisons; these rules for comparisons
involving a NaN were taken from the IEEE 854 standard (see Table 3 in
section 5.7). To ensure strict standards-compliance, use the compare
and compare_signal methods instead.
The signed zeros can result from calculations that underflow. They keep the sign
that would have resulted if the calculation had been carried out to greater
precision. Since their magnitude is zero, both positive and negative zeros are
treated as equal and their sign is informational.
In addition to the two signed zeros which are distinct yet equal, there are
various representations of zero with differing precisions yet equivalent in
value. This takes a bit of getting used to. For an eye accustomed to
normalized floating point representations, it is not immediately obvious that
the following calculation returns a value equal to zero:
>>> 1 / Decimal('Infinity')
Decimal('0E-1000000026')
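The rules for special values described in this section can be exercised
directly. This is a small sketch rather than an exhaustive test; it assumes
the InvalidOperation trap is disabled inside a localcontext() block so that
ordering comparisons with a NaN return a value instead of raising.

```python
from decimal import Decimal, InvalidOperation, localcontext

inf, nan = Decimal('Infinity'), Decimal('NaN')
neg_zero, pos_zero = Decimal('-0'), Decimal('0')

inf_plus_one = inf + 1             # adding a constant leaves infinity infinite
nan_equal = (nan == nan)           # equality with a NaN is always False
nan_unequal = (nan != nan)         # inequality with a NaN is always True

with localcontext() as ctx:
    ctx.traps[InvalidOperation] = False
    nan_less = (nan < Decimal(1))            # False when the signal is untrapped
    nan_compare = nan.compare(Decimal(1))    # compare() propagates the NaN

zeros_equal = (neg_zero == pos_zero)     # signed zeros compare equal
neg_zero_signed = neg_zero.is_signed()   # but the sign is still recorded
tiny_is_zero = (1 / inf == 0)            # a zero with a large negative exponent
```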
Working with threads
--------------------
The getcontext function accesses a different Context object for
each thread. Having separate thread contexts means that threads may make
changes (such as ``getcontext().prec=10``) without interfering with other threads.
Likewise, the setcontext function automatically assigns its target to
the current thread.
If setcontext has not been called before getcontext, then
getcontext will automatically create a new context for use in the
current thread.
The new context is copied from a prototype context called {DefaultContext}. To
control the defaults so that each thread will use the same values throughout the
application, directly modify the {DefaultContext} object. This should be done
{before} any threads are started so that there won't be a race condition between
threads calling getcontext. For example:: >
# Set application-wide defaults for all threads about to be launched
DefaultContext.prec = 12
DefaultContext.rounding = ROUND_DOWN
DefaultContext.traps = ExtendedContext.traps.copy()
DefaultContext.traps[InvalidOperation] = 1
setcontext(DefaultContext)
# Afterwards, the threads can be started
t1.start()
t2.start()
t3.start()
. . .
<
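The per-thread behaviour of getcontext can be observed with a short
experiment. This is only a sketch: the ``worker`` function, the thread names
and the ``seen`` dictionary are illustrative assumptions, not part of the
decimal module.

```python
import threading
from decimal import getcontext

main_prec_before = getcontext().prec
seen = {}

def worker(name, prec):
    # getcontext() returns the context belonging to the calling thread,
    # so this assignment is invisible to every other thread.
    getcontext().prec = prec
    seen[name] = getcontext().prec

threads = [threading.Thread(target=worker, args=(name, prec))
           for name, prec in (('t1', 5), ('t2', 10))]
for t in threads:
    t.start()
for t in threads:
    t.join()

main_prec_after = getcontext().prec   # unchanged by the workers
```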
Recipes
-------
Here are a few recipes that serve as utility functions and that demonstrate ways
to work with the Decimal class:: >
def moneyfmt(value, places=2, curr='', sep=',', dp='.',
             pos='', neg='-', trailneg=''):
    """Convert Decimal to a money formatted string.
    places:  required number of places after the decimal point
    curr:    optional currency symbol before the sign (may be blank)
    sep:     optional grouping separator (comma, period, space, or blank)
    dp:      decimal point indicator (comma or period)
             only specify as blank when places is zero
    pos:     optional sign for positive numbers: '+', space or blank
    neg:     optional sign for negative numbers: '-', '(', space or blank
    trailneg:optional trailing minus indicator:  '-', ')', space or blank
    >>> d = Decimal('-1234567.8901')
    >>> moneyfmt(d, curr='$')
    '-$1,234,567.89'
    >>> moneyfmt(d, places=0, sep='.', dp='', neg='', trailneg='-')
    '1.234.568-'
    >>> moneyfmt(d, curr='$', neg='(', trailneg=')')
    '($1,234,567.89)'
    >>> moneyfmt(Decimal(123456789), sep=' ')
    '123 456 789.00'
    >>> moneyfmt(Decimal('-0.02'), neg='<', trailneg='>')
    '<0.02>'
    """
    q = Decimal(10) ** -places      # 2 places --> '0.01'
    sign, digits, exp = value.quantize(q).as_tuple()
    result = []
    digits = map(str, digits)
    build, next = result.append, digits.pop
    if sign:
        build(trailneg)
    for i in range(places):
        build(next() if digits else '0')
    build(dp)
    if not digits:
        build('0')
    i = 0
    while digits:
        build(next())
        i += 1
        if i == 3 and digits:
            i = 0
            build(sep)
    build(curr)
    build(neg if sign else pos)
    return ''.join(reversed(result))
def pi():
    """Compute Pi to the current precision.
    >>> print pi()
    3.141592653589793238462643383
    """
    getcontext().prec += 2  # extra digits for intermediate steps
    three = Decimal(3)      # substitute "three=3.0" for regular floats
    lasts, t, s, n, na, d, da = 0, three, 3, 1, 0, 0, 24
    while s != lasts:
        lasts = s
        n, na = n+na, na+8
        d, da = d+da, da+32
        t = (t * n) / d
        s += t
    getcontext().prec -= 2
    return +s               # unary plus applies the new precision
def exp(x):
    """Return e raised to the power of x.  Result type matches input type.
    >>> print exp(Decimal(1))
    2.718281828459045235360287471
    >>> print exp(Decimal(2))
    7.389056098930650227230427461
    >>> print exp(2.0)
    7.38905609893
    >>> print exp(2+0j)
    (7.38905609893+0j)
    """
    getcontext().prec += 2
    i, lasts, s, fact, num = 0, 0, 1, 1, 1
    while s != lasts:
        lasts = s
        i += 1
        fact *= i
        num *= x
        s += num / fact
    getcontext().prec -= 2
    return +s
def cos(x):
    """Return the cosine of x as measured in radians.
    >>> print cos(Decimal('0.5'))
    0.8775825618903727161162815826
    >>> print cos(0.5)
    0.87758256189
    >>> print cos(0.5+0j)
    (0.87758256189+0j)
    """
    getcontext().prec += 2
    i, lasts, s, fact, num, sign = 0, 0, 1, 1, 1, 1
    while s != lasts:
        lasts = s
        i += 2
        fact *= i * (i-1)
        num *= x * x
        sign *= -1
        s += num / fact * sign
    getcontext().prec -= 2
    return +s
def sin(x):
    """Return the sine of x as measured in radians.
    >>> print sin(Decimal('0.5'))
    0.4794255386042030002732879352
    >>> print sin(0.5)
    0.479425538604
    >>> print sin(0.5+0j)
    (0.479425538604+0j)
    """
    getcontext().prec += 2
    i, lasts, s, fact, num, sign = 1, 0, x, 1, x, 1
    while s != lasts:
        lasts = s
        i += 2
        fact *= i * (i-1)
        num *= x * x
        sign *= -1
        s += num / fact * sign
    getcontext().prec -= 2
    return +s
<
Decimal FAQ
-----------
Q. It is cumbersome to type ``decimal.Decimal('1234.5')``. Is there a way to
minimize typing when using the interactive interpreter?
A. Some users abbreviate the constructor to just a single letter:
>>> D = decimal.Decimal
>>> D('1.23') + D('3.45')
Decimal('4.68')
Q. In a fixed-point application with two decimal places, some inputs have many
places and need to be rounded. Others are not supposed to have excess digits
and need to be validated. What methods should be used?
A. The quantize method rounds to a fixed number of decimal places. If
the Inexact trap is set, it is also useful for validation:
>>> TWOPLACES = Decimal(10) ** -2 # same as Decimal('0.01')
>>> # Round to two places
>>> Decimal('3.214').quantize(TWOPLACES)
Decimal('3.21')
>>> # Validate that a number does not exceed two places
>>> Decimal('3.21').quantize(TWOPLACES, context=Context(traps=[Inexact]))
Decimal('3.21')
>>> Decimal('3.214').quantize(TWOPLACES, context=Context(traps=[Inexact]))
Traceback (most recent call last):
...
Inexact: None
Q. Once I have valid two place inputs, how do I maintain that invariant
throughout an application?
A. Some operations like addition, subtraction, and multiplication by an integer
will automatically preserve fixed point. Other operations, like division and
non-integer multiplication, will change the number of decimal places and need to
be followed up with a quantize step:
>>> a = Decimal('102.72') # Initial fixed-point values
>>> b = Decimal('3.17')
>>> a + b # Addition preserves fixed-point
Decimal('105.89')
>>> a - b
Decimal('99.55')
>>> a * 42 # So does integer multiplication
Decimal('4314.24')
>>> (a * b).quantize(TWOPLACES) # Must quantize non-integer multiplication
Decimal('325.62')
>>> (b / a).quantize(TWOPLACES) # And quantize division
Decimal('0.03')
In developing fixed-point applications, it is convenient to define functions
to handle the quantize step:
>>> def mul(x, y, fp=TWOPLACES):
...     return (x * y).quantize(fp)
>>> def div(x, y, fp=TWOPLACES):
...     return (x / y).quantize(fp)
>>> mul(a, b) # Automatically preserve fixed-point
Decimal('325.62')
>>> div(b, a)
Decimal('0.03')
Q. There are many ways to express the same value. The numbers 200,
200.000, 2E2, and .02E+4 all have the same value at
various precisions. Is there a way to transform them to a single recognizable
canonical value?
A. The normalize method maps all equivalent values to a single
representative:
>>> values = map(Decimal, '200 200.000 2E2 .02E+4'.split())
>>> [v.normalize() for v in values]
[Decimal('2E+2'), Decimal('2E+2'), Decimal('2E+2'), Decimal('2E+2')]
Q. Some decimal values always print with exponential notation. Is there a way
to get a non-exponential representation?
A. For some values, exponential notation is the only way to express the number
of significant places in the coefficient. For example, expressing
5.0E+3 as 5000 keeps the value constant but cannot show the
original's two-place significance.
If an application does not care about tracking significance, it is easy to
remove the exponent and trailing zeros, losing significance, but keeping the
value unchanged:: >
def remove_exponent(d):
    '''Remove exponent and trailing zeros.
    >>> remove_exponent(Decimal('5E+3'))
    Decimal('5000')
    '''
    return d.quantize(Decimal(1)) if d == d.to_integral() else d.normalize()
<
Q. Is there a way to convert a regular float to a Decimal?
A. Yes, any binary floating point number can be exactly expressed as a
Decimal though an exact conversion may take more precision than intuition would
suggest:
>>> Decimal(math.pi)
Decimal('3.141592653589793115997963468544185161590576171875')
Q. Within a complex calculation, how can I make sure that I haven't gotten a
spurious result because of insufficient precision or rounding anomalies?
A. The decimal module makes it easy to test results. A best practice is to
re-run calculations using greater precision and with various rounding modes.
Widely differing results indicate insufficient precision, rounding mode issues,
ill-conditioned inputs, or a numerically unstable algorithm.
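One way to follow this advice is to wrap the calculation in a function and
run it under several contexts. The sketch below is only an illustration:
``run_with`` and ``calc`` are hypothetical names, and the calculation is a
deliberately simple stand-in for a real one.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_FLOOR, localcontext

def run_with(func, prec, rounding):
    """Call func() under a temporary precision and rounding mode."""
    with localcontext() as ctx:
        ctx.prec = prec
        ctx.rounding = rounding
        return func()

def calc():
    # Stand-in for the calculation under test.
    return Decimal(1) / Decimal(3) * Decimal(3)

low_floor = run_with(calc, 6, ROUND_FLOOR)
low_ceil = run_with(calc, 6, ROUND_CEILING)
high = run_with(calc, 30, ROUND_FLOOR)

# The gap between the two rounding directions bounds the rounding error
# at this precision; a wide gap suggests the precision is too low.
spread = low_ceil - low_floor
```

If the results at the higher precision, or under the opposite rounding
direction, differ in digits that matter to the application, the original
precision was probably insufficient.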
Q. I noticed that context precision is applied to the results of operations but
not to the inputs. Is there anything to watch out for when mixing values of
different precisions?
A. Yes. The principle is that all values are considered to be exact and so is
the arithmetic on those values. Only the results are rounded. The advantage
for inputs is that "what you type is what you get". A disadvantage is that the
results can look odd if you forget that the inputs haven't been rounded:
>>> getcontext().prec = 3
>>> Decimal('3.104') + Decimal('2.104')
Decimal('5.21')
>>> Decimal('3.104') + Decimal('0.000') + Decimal('2.104')
Decimal('5.20')
The solution is either to increase precision or to force rounding of inputs
using the unary plus operation:
>>> getcontext().prec = 3
>>> +Decimal('1.23456789') # unary plus triggers rounding
Decimal('1.23')
Alternatively, inputs can be rounded upon creation using the
Context.create_decimal method:
>>> Context(prec=5, rounding=ROUND_DOWN).create_decimal('1.2345678')
Decimal('1.2345')
==============================================================================
*py2stdlib-difflib*
difflib~
:synopsis: Helpers for computing differences between objects.
.. Markup by Fred L. Drake, Jr. <fdrake@acm.org>
.. testsetup::
import sys
from difflib import *
.. versionadded:: 2.1
This module provides classes and functions for comparing sequences. It
can be used, for example, to compare files, and can produce difference
information in various formats, including HTML and context and unified
diffs. For comparing directories and files, see also the filecmp (|py2stdlib-filecmp|) module.
SequenceMatcher~
This is a flexible class for comparing pairs of sequences of any type, so long
as the sequence elements are hashable. The basic algorithm predates, and is a
little fancier than, an algorithm published in the late 1980's by Ratcliff and
Obershelp under the hyperbolic name "gestalt pattern matching." The idea is to
find the longest contiguous matching subsequence that contains no "junk"
elements (the Ratcliff and Obershelp algorithm doesn't address junk). The same
idea is then applied recursively to the pieces of the sequences to the left and
to the right of the matching subsequence. This does not yield minimal edit
sequences, but does tend to yield matches that "look right" to people.
{Timing:} The basic Ratcliff-Obershelp algorithm is cubic time in the worst
case and quadratic time in the expected case. SequenceMatcher is
quadratic time for the worst case and has expected-case behavior dependent in a
complicated way on how many elements the sequences have in common; best case
time is linear.
Differ~
This is a class for comparing sequences of lines of text, and producing
human-readable differences or deltas. Differ uses SequenceMatcher
both to compare sequences of lines, and to compare sequences of characters
within similar (near-matching) lines.
Each line of a Differ delta begins with a two-letter code:
+----------+-------------------------------------------+
| Code | Meaning |
+==========+===========================================+
| ``'- '`` | line unique to sequence 1 |
+----------+-------------------------------------------+
| ``'+ '`` | line unique to sequence 2 |
+----------+-------------------------------------------+
| ``' '`` | line common to both sequences |
+----------+-------------------------------------------+
| ``'? '`` | line not present in either input sequence |
+----------+-------------------------------------------+
Lines beginning with '``?``' attempt to guide the eye to intraline differences,
and were not present in either input sequence. These lines can be confusing if
the sequences contain tab characters.
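The two-letter codes make a Differ delta easy to post-process. The following
is a minimal sketch that groups delta lines by code; the input lists and the
``by_code`` dictionary are illustrative assumptions.

```python
from difflib import Differ

before = ['one\n', 'two\n', 'three\n']
after = ['ore\n', 'tree\n', 'emu\n']

delta = list(Differ().compare(before, after))

# Group the delta lines by their two-character prefix.
by_code = {}
for line in delta:
    by_code.setdefault(line[:2], []).append(line[2:])

removed = by_code.get('- ', [])   # lines unique to sequence 1
added = by_code.get('+ ', [])     # lines unique to sequence 2
hints = by_code.get('? ', [])     # intraline guides, present in neither input
```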
HtmlDiff~
This class can be used to create an HTML table (or a complete HTML file
containing the table) showing a side by side, line by line comparison of text
with inter-line and intra-line change highlights. The table can be generated in
either full or contextual difference mode.
The constructor for this class is:
__init__([tabsize][, wrapcolumn][, linejunk][, charjunk])~
Initializes instance of HtmlDiff.
{tabsize} is an optional keyword argument to specify tab stop spacing and
defaults to ``8``.
{wrapcolumn} is an optional keyword argument to specify the column number at
which lines are broken and wrapped; it defaults to ``None``, in which case
lines are not wrapped.
{linejunk} and {charjunk} are optional keyword arguments passed into ``ndiff()``
(used by HtmlDiff to generate the side by side HTML differences). See
``ndiff()`` documentation for argument default values and descriptions.
The following methods are public:
make_file(fromlines, tolines [, fromdesc][, todesc][, context][, numlines])~
Compares {fromlines} and {tolines} (lists of strings) and returns a string which
is a complete HTML file containing a table showing line by line differences with
inter-line and intra-line changes highlighted.
{fromdesc} and {todesc} are optional keyword arguments to specify from/to file
column header strings (both default to an empty string).
{context} and {numlines} are both optional keyword arguments. Set {context} to
``True`` when contextual differences are to be shown, else the default is
``False`` to show the full files. {numlines} defaults to ``5``. When {context}
is ``True`` {numlines} controls the number of context lines which surround the
difference highlights. When {context} is ``False`` {numlines} controls the
number of lines which are shown before a difference highlight when using the
"next" hyperlinks (setting to zero would cause the "next" hyperlinks to place
the next difference highlight at the top of the browser without any leading
context).
make_table(fromlines, tolines [, fromdesc][, todesc][, context][, numlines])~
Compares {fromlines} and {tolines} (lists of strings) and returns a string which
is a complete HTML table showing line by line differences with inter-line and
intra-line changes highlighted.
The arguments for this method are the same as those for the make_file
method.
Tools/scripts/diff.py is a command-line front-end to this class and
contains a good example of its use.
.. versionadded:: 2.4
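A short sketch of both HtmlDiff methods follows. The input lines, the
descriptions and the chosen ``tabsize`` and ``wrapcolumn`` values are
arbitrary assumptions made for illustration.

```python
from difflib import HtmlDiff

from_lines = ['bacon', 'eggs', 'ham', 'guido']
to_lines = ['python', 'eggy', 'hamster', 'guido']

differ = HtmlDiff(tabsize=4, wrapcolumn=60)

# Only the <table> markup, with one line of context around each change.
table = differ.make_table(from_lines, to_lines,
                          fromdesc='before.py', todesc='after.py',
                          context=True, numlines=1)

# A complete HTML document showing the full files side by side.
page = differ.make_file(from_lines, to_lines,
                        fromdesc='before.py', todesc='after.py')
```

make_table is convenient when the comparison is embedded in a larger page,
while make_file produces a document that can be written to disk and opened
directly in a browser.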
context_diff(a, b[, fromfile][, tofile][, fromfiledate][, tofiledate][, n][, lineterm])~
Compare {a} and {b} (lists of strings); return a delta (a generator
generating the delta lines) in context diff format.
Context diffs are a compact way of showing just the lines that have changed plus
a few lines of context. The changes are shown in a before/after style. The
number of context lines is set by {n} which defaults to three.
By default, the diff control lines (those with ``***`` or ``---``) are created
with a trailing newline. This is helpful so that inputs created from
file.readlines result in diffs that are suitable for use with
file.writelines since both the inputs and outputs have trailing
newlines.
For inputs that do not have trailing newlines, set the {lineterm} argument to
``""`` so that the output will be uniformly newline free.
The context diff format normally has a header for filenames and modification
times. Any or all of these may be specified using strings for {fromfile},
{tofile}, {fromfiledate}, and {tofiledate}. The modification times are normally
expressed in the ISO 8601 format. If not specified, the
strings default to blanks.
>>> s1 = ['bacon\n', 'eggs\n', 'ham\n', 'guido\n']
>>> s2 = ['python\n', 'eggy\n', 'hamster\n', 'guido\n']
>>> for line in context_diff(s1, s2, fromfile='before.py', tofile='after.py'):
...     sys.stdout.write(line) # doctest: +NORMALIZE_WHITESPACE
*** before.py
--- after.py
***************
*** 1,4 ****
! bacon
! eggs
! ham
  guido
--- 1,4 ----
! python
! eggy
! hamster
  guido
See difflib-interface for a more detailed example.
.. versionadded:: 2.3
get_close_matches(word, possibilities[, n][, cutoff])~
Return a list of the best "good enough" matches. {word} is a sequence for which
close matches are desired (typically a string), and {possibilities} is a list of
sequences against which to match {word} (typically a list of strings).
Optional argument {n} (default ``3``) is the maximum number of close matches to
return; {n} must be greater than ``0``.
Optional argument {cutoff} (default ``0.6``) is a float in the range [0, 1].
Possibilities that don't score at least that similar to {word} are ignored.
The best (no more than {n}) matches among the possibilities are returned in a
list, sorted by similarity score, most similar first.
>>> get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy'])
['apple', 'ape']
>>> import keyword
>>> get_close_matches('wheel', keyword.kwlist)
['while']
>>> get_close_matches('apple', keyword.kwlist)
[]
>>> get_close_matches('accept', keyword.kwlist)
['except']
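The {n} and {cutoff} arguments can be combined to build a simple "did you
mean" helper. This is a sketch only: the ``commands`` list and the
``suggest`` function are hypothetical and not part of difflib.

```python
from difflib import get_close_matches

commands = ['status', 'commit', 'checkout', 'clone', 'config']

def suggest(word, n=3, cutoff=0.6):
    """Return up to n known commands that resemble word."""
    return get_close_matches(word, commands, n=n, cutoff=cutoff)

default_hits = suggest('comit')              # default cutoff of 0.6
strict_hits = suggest('comit', cutoff=0.95)  # stricter cutoff, fewer matches
loose_hits = suggest('comit', n=2, cutoff=0.0)  # every candidate qualifies
```

Raising {cutoff} trades recall for precision; lowering it to ``0.0`` turns
the function into a plain "top {n} by similarity" ranking.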
ndiff(a, b[, linejunk][, charjunk])~
Compare {a} and {b} (lists of strings); return a Differ-style
delta (a generator generating the delta lines).
Optional keyword parameters {linejunk} and {charjunk} are for filter functions
(or ``None``):
{linejunk}: A function that accepts a single string argument, and returns true
if the string is junk, or false if not. The default is ``None``, starting with
Python 2.3. Before then, the default was the module-level function
IS_LINE_JUNK, which filters out lines without visible characters, except
for at most one pound character (``'#'``). As of Python 2.3, the underlying
SequenceMatcher class does a dynamic analysis of which lines are so
frequent as to constitute noise, and this usually works better than the pre-2.3
default.
{charjunk}: A function that accepts a character (a string of length 1), and
returns true if the character is junk, or false if not. The default is the module-level
function IS_CHARACTER_JUNK, which filters out whitespace characters (a
blank or tab; note: bad idea to include newline in this!).
Tools/scripts/ndiff.py is a command-line front-end to this function.
>>> diff = ndiff('one\ntwo\nthree\n'.splitlines(1),
... 'ore\ntree\nemu\n'.splitlines(1))
>>> print ''.join(diff),
- one
?  ^
+ ore
?  ^
- two
- three
?  -
+ tree
+ emu
restore(sequence, which)~
Return one of the two sequences that generated a delta.
Given a {sequence} produced by Differ.compare or ndiff, extract
lines originating from file 1 or 2 (parameter {which}), stripping off line
prefixes.
Example:
>>> diff = ndiff('one\ntwo\nthree\n'.splitlines(1),
... 'ore\ntree\nemu\n'.splitlines(1))
>>> diff = list(diff) # materialize the generated delta into a list
>>> print ''.join(restore(diff, 1)),
one
two
three
>>> print ''.join(restore(diff, 2)),
ore
tree
emu
unified_diff(a, b[, fromfile][, tofile][, fromfiledate][, tofiledate][, n][, lineterm])~
Compare {a} and {b} (lists of strings); return a delta (a generator
generating the delta lines) in unified diff format.
Unified diffs are a compact way of showing just the lines that have changed plus
a few lines of context. The changes are shown in an inline style (instead of
separate before/after blocks). The number of context lines is set by {n} which
defaults to three.
By default, the diff control lines (those with ``---``, ``+++``, or ``@@``) are
created with a trailing newline. This is helpful so that inputs created from
file.readlines result in diffs that are suitable for use with
file.writelines since both the inputs and outputs have trailing
newlines.
For inputs that do not have trailing newlines, set the {lineterm} argument to
``""`` so that the output will be uniformly newline free.
The unified diff format normally has a header for filenames and modification
times. Any or all of these may be specified using strings for {fromfile},
{tofile}, {fromfiledate}, and {tofiledate}. The modification times are normally
expressed in the ISO 8601 format. If not specified, the
strings default to blanks.
>>> s1 = ['bacon\n', 'eggs\n', 'ham\n', 'guido\n']
>>> s2 = ['python\n', 'eggy\n', 'hamster\n', 'guido\n']
>>> for line in unified_diff(s1, s2, fromfile='before.py', tofile='after.py'):
...     sys.stdout.write(line) # doctest: +NORMALIZE_WHITESPACE
--- before.py
+++ after.py
@@ -1,4 +1,4 @@
-bacon
-eggs
-ham
+python
+eggy
+hamster
 guido
See difflib-interface for a more detailed example.
.. versionadded:: 2.3
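When the inputs carry no trailing newlines, passing an empty {lineterm}
keeps the whole delta newline free so that it can be joined afterwards. A
minimal sketch, with made-up input lists and file names:

```python
from difflib import unified_diff

old = ['bacon', 'eggs']   # no trailing newlines on the inputs
new = ['bacon', 'spam']

lines = list(unified_diff(old, new, fromfile='a', tofile='b', lineterm=''))

# Every line, control lines included, is newline free, so the delta can
# be assembled with an explicit separator.
text = '\n'.join(lines)
```

The same {lineterm} argument works in the same way for context_diff.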
IS_LINE_JUNK(line)~
Return true for ignorable lines. The line {line} is ignorable if {line} is
blank or contains a single ``'#'``, otherwise it is not ignorable. Used as a
default for parameter {linejunk} in ndiff before Python 2.3.
IS_CHARACTER_JUNK(ch)~
Return true for ignorable characters. The character {ch} is ignorable if {ch}
is a space or tab, otherwise it is not ignorable. Used as a default for
parameter {charjunk} in ndiff.
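The two predicates can be called directly to see what they treat as junk.
The sample lines and characters below are arbitrary illustrations.

```python
from difflib import IS_CHARACTER_JUNK, IS_LINE_JUNK

# Blank lines and lines holding only a '#' are ignorable.
line_checks = dict((line, IS_LINE_JUNK(line))
                   for line in ['\n', '  #  \n', 'x = 1  # note\n'])

# Only a space or a tab counts as a junk character; newline does not.
char_checks = dict((ch, IS_CHARACTER_JUNK(ch))
                   for ch in [' ', '\t', '\n', 'x'])
```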
.. seealso::
`Pattern Matching: The Gestalt Approach <http://www.ddj.com/184407970?pgno=5>`_
Discussion of a similar algorithm by John W. Ratcliff and D. E. Metzener. This
was published in `Dr. Dobb's Journal <http://www.ddj.com/>`_ in July, 1988.
SequenceMatcher Objects
-----------------------
The SequenceMatcher class has this constructor:
SequenceMatcher([isjunk[, a[, b]]])~
Optional argument {isjunk} must be ``None`` (the default) or a one-argument
function that takes a sequence element and returns true if and only if the
element is "junk" and should be ignored. Passing ``None`` for {isjunk} is
equivalent to passing ``lambda x: 0``; in other words, no elements are ignored.
For example, pass:: >
lambda x: x in " \t"
<
if you're comparing lines as sequences of characters, and don't want to synch up
on blanks or hard tabs.
The optional arguments {a} and {b} are sequences to be compared; both default to
empty strings. The elements of both sequences must be hashable.
SequenceMatcher objects have the following methods:
set_seqs(a, b)~
Set the two sequences to be compared.
SequenceMatcher computes and caches detailed information about the
second sequence, so if you want to compare one sequence against many
sequences, use set_seq2 to set the commonly used sequence once and
call set_seq1 repeatedly, once for each of the other sequences.
set_seq1(a)~
Set the first sequence to be compared. The second sequence to be compared
is not changed.
set_seq2(b)~
Set the second sequence to be compared. The first sequence to be compared
is not changed.
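The caching behaviour noted under set_seqs suggests the following pattern
for comparing one sequence against many: fix the second sequence once and
vary the first. This is a sketch; ``best_match`` is a hypothetical helper,
not a difflib function.

```python
from difflib import SequenceMatcher

def best_match(target, candidates):
    """Return (candidate, ratio) for the candidate most similar to target."""
    matcher = SequenceMatcher()
    matcher.set_seq2(target)          # analysed once, then cached
    best, best_ratio = None, 0.0
    for candidate in candidates:
        matcher.set_seq1(candidate)   # cheap: seq2 information is reused
        score = matcher.ratio()
        if score > best_ratio:
            best, best_ratio = candidate, score
    return best, best_ratio

winner, score = best_match('apple', ['banana', 'appel', 'grape'])
```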
find_longest_match(alo, ahi, blo, bhi)~
Find longest matching block in ``a[alo:ahi]`` and ``b[blo:bhi]``.
If {isjunk} was omitted or ``None``, find_longest_match returns
``(i, j, k)`` such that ``a[i:i+k]`` is equal to ``b[j:j+k]``, where ``alo
<= i <= i+k <= ahi`` and ``blo <= j <= j+k <= bhi``. For all ``(i', j',
k')`` meeting those conditions, the additional conditions ``k >= k'``, ``i
<= i'``, and if ``i == i'``, ``j <= j'`` are also met. In other words, of
all maximal matching blocks, return one that starts earliest in {a}, and
of all those maximal matching blocks that start earliest in {a}, return
the one that starts earliest in {b}.
>>> s = SequenceMatcher(None, " abcd", "abcd abcd")
>>> s.find_longest_match(0, 5, 0, 9)
Match(a=0, b=4, size=5)
If {isjunk} was provided, first the longest matching block is determined
as above, but with the additional restriction that no junk element appears
in the block. Then that block is extended as far as possible by matching
(only) junk elements on both sides. So the resulting block never matches
on junk except as identical junk happens to be adjacent to an interesting
match.
Here's the same example as before, but considering blanks to be junk. That
prevents ``' abcd'`` from matching the ``' abcd'`` at the tail end of the
second sequence directly. Instead only the ``'abcd'`` can match, and
matches the leftmost ``'abcd'`` in the second sequence:
>>> s = SequenceMatcher(lambda x: x==" ", " abcd", "abcd abcd")
>>> s.find_longest_match(0, 5, 0, 9)
Match(a=1, b=0, size=4)
If no blocks match, this returns ``(alo, blo, 0)``.
.. versionchanged:: 2.6
This method returns a named tuple ``Match(a, b, size)``.
get_matching_blocks()~
Return list of triples describing matching subsequences. Each triple is of
the form ``(i, j, n)``, and means that ``a[i:i+n] == b[j:j+n]``. The
triples are monotonically increasing in {i} and {j}.
The last triple is a dummy, and has the value ``(len(a), len(b), 0)``. It
is the only triple with ``n == 0``. If ``(i, j, n)`` and ``(i', j', n')``
are adjacent triples in the list, and the second is not the last triple in
the list, then ``i+n != i'`` or ``j+n != j'``; in other words, adjacent
triples always describe non-adjacent equal blocks.
.. versionchanged:: 2.5
The guarantee that adjacent triples always describe non-adjacent blocks
was implemented.
.. doctest:: >
>>> s = SequenceMatcher(None, "abxcd", "abcd")
>>> s.get_matching_blocks()
[Match(a=0, b=0, size=2), Match(a=3, b=2, size=2), Match(a=5, b=4, size=0)]
<
get_opcodes()~
Return list of 5-tuples describing how to turn {a} into {b}. Each tuple is
of the form ``(tag, i1, i2, j1, j2)``. The first tuple has ``i1 == j1 ==
0``, and remaining tuples have {i1} equal to the {i2} from the preceding
tuple, and, likewise, {j1} equal to the previous {j2}.
The {tag} values are strings, with these meanings:
+---------------+---------------------------------------------+
| Value | Meaning |
+===============+=============================================+
| ``'replace'`` | ``a[i1:i2]`` should be replaced by |
| | ``b[j1:j2]``. |
+---------------+---------------------------------------------+
| ``'delete'`` | ``a[i1:i2]`` should be deleted. Note that |
| | ``j1 == j2`` in this case. |
+---------------+---------------------------------------------+
| ``'insert'`` | ``b[j1:j2]`` should be inserted at |
| | ``a[i1:i1]``. Note that ``i1 == i2`` in |
| | this case. |
+---------------+---------------------------------------------+
| ``'equal'`` | ``a[i1:i2] == b[j1:j2]`` (the sub-sequences |
| | are equal). |
+---------------+---------------------------------------------+
For example:
>>> a = "qabxcd"
>>> b = "abycdf"
>>> s = SequenceMatcher(None, a, b)
>>> for tag, i1, i2, j1, j2 in s.get_opcodes():
...     print ("%7s a[%d:%d] (%s) b[%d:%d] (%s)" %
...            (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2]))
 delete a[0:1] (q) b[0:0] ()
  equal a[1:3] (ab) b[0:2] (ab)
replace a[3:4] (x) b[2:3] (y)
  equal a[4:6] (cd) b[3:5] (cd)
 insert a[6:6] () b[5:6] (f)
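Since the opcodes describe how to turn {a} into {b}, replaying them must
reproduce {b}. The sketch below does this for strings; ``apply_opcodes`` is
a hypothetical helper written for illustration.

```python
from difflib import SequenceMatcher

def apply_opcodes(a, b):
    """Rebuild b by walking the opcodes that transform a into b."""
    pieces = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag == 'equal':
            pieces.append(a[i1:i2])       # keep the common part from a
        elif tag in ('replace', 'insert'):
            pieces.append(b[j1:j2])       # take the new material from b
        # 'delete' contributes nothing: a[i1:i2] is simply dropped
    return ''.join(pieces)

rebuilt = apply_opcodes('qabxcd', 'abycdf')
```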
get_grouped_opcodes([n])~
Return a generator of groups with up to {n} lines of context.
Starting with the groups returned by get_opcodes, this method
splits out smaller change clusters and eliminates intervening ranges which
have no changes.
The groups are returned in the same format as get_opcodes.
.. versionadded:: 2.3
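The effect of {n} is easiest to see when two changes lie far apart. In this
sketch, with made-up input lines, two edits separated by many unchanged
lines end up in two separate groups when one line of context is requested.

```python
from difflib import SequenceMatcher

a = ['line %d' % i for i in range(20)]
b = a[:]
b[2] = 'line 2 (edited)'      # first change, near the top
b[15] = 'line 15 (edited)'    # second change, far from the first

matcher = SequenceMatcher(None, a, b)
groups = list(matcher.get_grouped_opcodes(n=1))

# Each group is a list of opcodes: context, the change, then context.
tags_per_group = [[op[0] for op in group] for group in groups]
```

This grouping is what allows the diff functions in this module to emit one
hunk per cluster of changes instead of the whole file.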
ratio()~
Return a measure of the sequences' similarity as a float in the range [0,
1].
Where T is the total number of elements in both sequences, and M is the
number of matches, this is 2.0*M / T. Note that this is ``1.0`` if the
sequences are identical, and ``0.0`` if they have nothing in common.
This is expensive to compute if get_matching_blocks or
get_opcodes hasn't already been called, in which case you may want
to try quick_ratio or real_quick_ratio first to get an
upper bound.
quick_ratio()~
Return an upper bound on ratio relatively quickly.
This isn't defined beyond that it is an upper bound on ratio, and
is faster to compute.
real_quick_ratio()~
Return an upper bound on ratio very quickly.
This isn't defined beyond that it is an upper bound on ratio, and
is faster to compute than either ratio or quick_ratio.
The three methods that return the ratio of matching to total characters can give
different results due to differing levels of approximation, although
quick_ratio and real_quick_ratio are always at least as large as
ratio:
>>> s = SequenceMatcher(None, "abcd", "bcde")
>>> s.ratio()
0.75
>>> s.quick_ratio()
0.75
>>> s.real_quick_ratio()
1.0
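Because both quick variants are upper bounds on ratio, they can serve as
cheap filters that run before the expensive computation. The helper below is
a sketch under that assumption; ``similar`` and its default cutoff of 0.6
are illustrative choices, not part of the class.

```python
from difflib import SequenceMatcher

def similar(a, b, cutoff=0.6):
    """Return True if ratio() would be at least cutoff."""
    matcher = SequenceMatcher(None, a, b)
    # The checks run from cheapest to most expensive; 'and' stops at the
    # first upper bound that already falls below the cutoff.
    return (matcher.real_quick_ratio() >= cutoff and
            matcher.quick_ratio() >= cutoff and
            matcher.ratio() >= cutoff)

close = similar('abcd', 'bcde')          # ratio() is 0.75
unrelated = similar('abcd', 'wxyz')      # rejected by quick_ratio()
lopsided = similar('a', 'abcdefghij')    # rejected by real_quick_ratio()
```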
SequenceMatcher Examples
------------------------
This example compares two strings, considering blanks to be "junk":
>>> s = SequenceMatcher(lambda x: x == " ",
... "private Thread currentThread;",
... "private volatile Thread currentThread;")
ratio returns a float in [0, 1], measuring the similarity of the
sequences. As a rule of thumb, a ratio value over 0.6 means the
sequences are close matches:
>>> print round(s.ratio(), 3)
0.866
If you're only interested in where the sequences match,
get_matching_blocks is handy:
>>> for block in s.get_matching_blocks():
...     print "a[%d] and b[%d] match for %d elements" % block
a[0] and b[0] match for 8 elements
a[8] and b[17] match for 21 elements
a[29] and b[38] match for 0 elements
Note that the last tuple returned by get_matching_blocks is always a
dummy, ``(len(a), len(b), 0)``, and this is the only case in which the last
tuple element (number of elements matched) is ``0``.
If you want to know how to change the first sequence into the second, use
get_opcodes:
>>> for opcode in s.get_opcodes():
...     print "%6s a[%d:%d] b[%d:%d]" % opcode
 equal a[0:8] b[0:8]
insert a[8:8] b[8:17]
 equal a[8:29] b[17:38]
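To make "how to change the first sequence into the second" concrete, here is a minimal sketch that replays the opcodes as an edit script. The helper name apply_opcodes is hypothetical and not part of difflib.

```python
from difflib import SequenceMatcher

def apply_opcodes(a, b, opcodes):
    """Rebuild the string b from a by following the opcode list."""
    pieces = []
    for tag, i1, i2, j1, j2 in opcodes:
        if tag == "equal":
            pieces.append(a[i1:i2])      # copy the unchanged run from a
        elif tag in ("replace", "insert"):
            pieces.append(b[j1:j2])      # take the new material from b
        # "delete": a[i1:i2] is simply dropped
    return "".join(pieces)

a = "private Thread currentThread;"
b = "private volatile Thread currentThread;"
s = SequenceMatcher(lambda x: x == " ", a, b)
rebuilt = apply_opcodes(a, b, s.get_opcodes())
print(rebuilt == b)
```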
.. seealso::
* The get_close_matches function in this module which shows how
simple code building on SequenceMatcher can be used to do useful
work.
* Simple version control recipe (http://code.activestate.com/recipes/576729/)
for a small application built with SequenceMatcher.
Differ Objects
--------------
Note that Differ-generated deltas make no claim to be {minimal}
diffs. To the contrary, minimal diffs are often counter-intuitive, because they
synch up anywhere possible, sometimes accidental matches 100 pages apart.
Restricting synch points to contiguous matches preserves some notion of
locality, at the occasional cost of producing a longer diff.
The Differ class has this constructor:
Differ([linejunk[, charjunk]])~
Optional keyword parameters {linejunk} and {charjunk} are for filter functions
(or ``None``):
{linejunk}: A function that accepts a single string argument, and returns true
if the string is junk. The default is ``None``, meaning that no line is
considered junk.
{charjunk}: A function that accepts a single character argument (a string of
length 1), and returns true if the character is junk. The default is ``None``,
meaning that no character is considered junk.
Differ objects are used (deltas generated) via a single method:
Differ.compare(a, b)~
Compare two sequences of lines, and generate the delta (a sequence of lines).
Each sequence must contain individual single-line strings ending with newlines.
Such sequences can be obtained from the readlines method of file-like
objects. The delta generated also consists of newline-terminated strings, ready
to be printed as-is via the writelines method of a file-like object.
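A small sketch of the constructor and compare together. The filter is_blank_line and the sample lines are invented for illustration; IS_CHARACTER_JUNK and restore are existing difflib helpers.

```python
import difflib

def is_blank_line(line):
    """Hypothetical linejunk filter: treat whitespace-only lines as junk."""
    return line.strip() == ""

a = ["one\n", "\n", "two\n", "three\n"]
b = ["one\n", "\n", "tree\n", "three\n", "four\n"]

d = difflib.Differ(linejunk=is_blank_line,
                   charjunk=difflib.IS_CHARACTER_JUNK)
delta = list(d.compare(a, b))

# Every delta line starts with a two-character code:
# "  " common, "- " only in a, "+ " only in b, "? " intraline hint.
codes = set(line[:2] for line in delta)
print(sorted(codes))

# The delta carries enough information to recover either input.
recovered_a = list(difflib.restore(delta, 1))
recovered_b = list(difflib.restore(delta, 2))
```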
Differ Example
--------------
This example compares two texts. First we set up the texts, sequences of
individual single-line strings ending with newlines (such sequences can also be
obtained from the readlines method of file-like objects):
>>> text1 = ''' 1. Beautiful is better than ugly.
... 2. Explicit is better than implicit.
... 3. Simple is better than complex.
... 4. Complex is better than complicated.
... '''.splitlines(1)
>>> len(text1)
4
>>> text1[0][-1]
'\n'
>>> text2 = ''' 1. Beautiful is better than ugly.
... 3. Simple is better than complex.
... 4. Complicated is better than complex.
... 5. Flat is better than nested.
... '''.splitlines(1)
Next we instantiate a Differ object:
>>> d = Differ()
Note that when instantiating a Differ object we may pass functions to
filter out line and character "junk." See the Differ constructor for
details.
Finally, we compare the two:
>>> result = list(d.compare(text1, text2))
``result`` is a list of strings, so let's pretty-print it:
>>> from pprint import pprint
>>> pprint(result)
[' 1. Beautiful is better than ugly.\n',
'- 2. Explicit is better than implicit.\n',
'- 3. Simple is better than complex.\n',
'+ 3. Simple is better than complex.\n',
'? ++\n',
'- 4. Complex is better than complicated.\n',
'? ^ ---- ^\n',
'+ 4. Complicated is better than complex.\n',
'? ++++ ^ ^\n',
'+ 5. Flat is better than nested.\n']
As a single multi-line string it looks like this:
>>> import sys
>>> sys.stdout.writelines(result)
1. Beautiful is better than ugly.
- 2. Explicit is better than implicit.
- 3. Simple is better than complex.
+ 3. Simple is better than complex.
? ++
- 4. Complex is better than complicated.
? ^ ---- ^
+ 4. Complicated is better than complex.
? ++++ ^ ^
+ 5. Flat is better than nested.
A command-line interface to difflib
-----------------------------------
This example shows how to use difflib to create a ``diff``-like utility.
It is also contained in the Python source distribution, as
Tools/scripts/diff.py.
""" Command line interface to difflib.py providing diffs in four formats:
* ndiff:    lists every line and highlights interline changes.
* context:  highlights clusters of changes in a before/after format.
* unified:  highlights clusters of changes in an inline format.
* html:     generates side by side comparison with change highlights.
"""

import sys, os, time, difflib, optparse

def main():
    # Configure the option parser
    usage = "usage: %prog [options] fromfile tofile"
    parser = optparse.OptionParser(usage)
    parser.add_option("-c", action="store_true", default=False,
                      help='Produce a context format diff (default)')
    parser.add_option("-u", action="store_true", default=False,
                      help='Produce a unified format diff')
    hlp = 'Produce HTML side by side diff (can use -c and -l in conjunction)'
    parser.add_option("-m", action="store_true", default=False, help=hlp)
    parser.add_option("-n", action="store_true", default=False,
                      help='Produce a ndiff format diff')
    parser.add_option("-l", "--lines", type="int", default=3,
                      help='Set number of context lines (default 3)')
    (options, args) = parser.parse_args()
    if len(args) == 0:
        parser.print_help()
        sys.exit(1)
    if len(args) != 2:
        parser.error("need to specify both a fromfile and tofile")
    n = options.lines
    fromfile, tofile = args # as specified in the usage string
    # we're passing these as arguments to the diff function
    fromdate = time.ctime(os.stat(fromfile).st_mtime)
    todate = time.ctime(os.stat(tofile).st_mtime)
    fromlines = open(fromfile, 'U').readlines()
    tolines = open(tofile, 'U').readlines()
    if options.u:
        diff = difflib.unified_diff(fromlines, tolines, fromfile, tofile,
                                    fromdate, todate, n=n)
    elif options.n:
        diff = difflib.ndiff(fromlines, tolines)
    elif options.m:
        diff = difflib.HtmlDiff().make_file(fromlines, tolines, fromfile,
                                            tofile, context=options.c,
                                            numlines=n)
    else:
        diff = difflib.context_diff(fromlines, tolines, fromfile, tofile,
                                    fromdate, todate, n=n)
    # we're using writelines because diff is a generator
    sys.stdout.writelines(diff)

if __name__ == '__main__':
    main()
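The unified branch of the script can be tried without any files. This is only a sketch: the in-memory lines and the names before.txt and after.txt are placeholders, and the dates are omitted.

```python
import difflib

fromlines = ["one\n", "two\n", "three\n"]
tolines = ["one\n", "2\n", "three\n"]

# Same call the script makes for -u, with one line of context.
diff = list(difflib.unified_diff(fromlines, tolines,
                                 "before.txt", "after.txt", n=1))
for line in diff:
    print(line.rstrip("\n"))
```

The result is collected into a list here only so it can be inspected; the script streams the generator straight to sys.stdout.writelines.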
==============================================================================
*py2stdlib-dircache*
dircache~
:synopsis: Return directory listing, with cache mechanism.
Deprecated since version 2.6: The dircache (|py2stdlib-dircache|) module has been removed
in Python 3.0.
The dircache (|py2stdlib-dircache|) module defines a function for reading directory listings
using a cache, with the cache invalidated based on the {mtime} of the directory.
Additionally, it defines a function to annotate directories by appending a
slash.
The dircache (|py2stdlib-dircache|) module defines the following functions:
reset()~
Resets the directory cache.
listdir(path)~
Return a directory listing of {path}, as obtained from os.listdir. Note
that unless {path} changes, further calls to listdir will not re-read the
directory structure.
Note that the list returned should be regarded as read-only. (Perhaps a future
version should change it to return a tuple?)
opendir(path)~
Same as listdir. Defined for backwards compatibility.
annotate(head, list)~
Assume {list} is a list of paths relative to {head}, and append, in place, a
``'/'`` to each path which points to a directory.
:: >
>>> import dircache
>>> a = dircache.listdir('/')
>>> a = a[:] # Copy the return value so we can change 'a'
>>> a
['bin', 'boot', 'cdrom', 'dev', 'etc', 'floppy', 'home', 'initrd', 'lib', 'lost+
found', 'mnt', 'proc', 'root', 'sbin', 'tmp', 'usr', 'var', 'vmlinuz']
>>> dircache.annotate('/', a)
>>> a
['bin/', 'boot/', 'cdrom/', 'dev/', 'etc/', 'floppy/', 'home/', 'initrd/', 'lib/
', 'lost+found/', 'mnt/', 'proc/', 'root/', 'sbin/', 'tmp/', 'usr/', 'var/', 'vm
linuz']
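Because dircache is gone in Python 3, the behaviour described above can only be approximated there. The following is a rough stand-in built on os.listdir, not the module's actual code; cached_listdir, annotate and _cache are names chosen for this sketch.

```python
import os
import shutil
import tempfile

_cache = {}  # path -> (mtime, sorted list of names)

def cached_listdir(path):
    """Return a sorted listing, re-reading only when the mtime changes."""
    mtime = os.stat(path).st_mtime
    cached = _cache.get(path)
    if cached is not None and cached[0] == mtime:
        return cached[1]
    names = sorted(os.listdir(path))
    _cache[path] = (mtime, names)
    return names

def annotate(head, names):
    """Append '/' in place to every entry that is a directory under head."""
    for i, name in enumerate(names):
        if os.path.isdir(os.path.join(head, name)):
            names[i] = name + "/"

root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "subdir"))
open(os.path.join(root, "file.txt"), "w").close()

first = cached_listdir(root)
second = cached_listdir(root)     # served from the cache
listing = list(first)             # copy before changing it
annotate(root, listing)
print(listing)

shutil.rmtree(root)
```

Copying before calling annotate mirrors the advice above to treat the returned list as read-only.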
==============================================================================
*py2stdlib-dis*
dis~
:synopsis: Disassembler for Python bytecode.
The dis (|py2stdlib-dis|) module supports the analysis of Python bytecode by disassembling
it. Since there is no Python assembler, this module defines the Python assembly
language. The Python bytecode which this module takes as an input is defined
in the file Include/opcode.h and used by the compiler and the
interpreter.
Example: Given the function myfunc:: >
def myfunc(alist):
    return len(alist)
<
the following command can be used to get the disassembly of myfunc::
>>> dis.dis(myfunc)
2 0 LOAD_GLOBAL 0 (len)
3 LOAD_FAST 0 (alist)
6 CALL_FUNCTION 1
9 RETURN_VALUE
(The "2" is a line number).
The dis (|py2stdlib-dis|) module defines the following functions and constants:
dis([bytesource])~
Disassemble the {bytesource} object. {bytesource} can denote either a module, a
class, a method, a function, or a code object. For a module, it disassembles
all functions. For a class, it disassembles all methods. For a single code
sequence, it prints one line per bytecode instruction. If no object is
provided, it disassembles the last traceback.
distb([tb])~
Disassembles the top-of-stack function of a traceback, using the last traceback
if none was passed. The instruction causing the exception is indicated.
disassemble(code[, lasti])~
Disassembles a code object, indicating the last instruction if {lasti} was
provided. The output is divided in the following columns:
#. the line number, for the first instruction of each line
#. the current instruction, indicated as ``-->``,
#. a labelled instruction, indicated with ``>>``,
#. the address of the instruction,
#. the operation code name,
#. operation parameters, and
#. interpretation of the parameters in parentheses.
The parameter interpretation recognizes local and global variable names,
constant values, branch targets, and compare operators.
disco(code[, lasti])~
A synonym for disassemble. It is more convenient to type, and kept
for compatibility with earlier Python releases.
findlinestarts(code)~
This generator function uses the ``co_firstlineno`` and ``co_lnotab``
attributes of the code object {code} to find the offsets which are starts of
lines in the source code. They are generated as ``(offset, lineno)`` pairs.
findlabels(code)~
Detect all offsets in the code object {code} which are jump targets, and
return a list of these offsets.
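A short sketch of findlinestarts on a throwaway function. The function sample is invented here, and the exact offsets and line numbers depend on the interpreter version, so none are shown.

```python
import dis

def sample(x):
    y = x + 1
    return y

# Each pair is (bytecode offset, source line number).
pairs = list(dis.findlinestarts(sample.__code__))
for offset, lineno in pairs:
    print("offset %s starts line %s" % (offset, lineno))
```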
opname~
Sequence of operation names, indexable using the bytecode.
opmap~
Dictionary mapping operation names to bytecodes.
cmp_op~
Sequence of all compare operation names.
hasconst~
Sequence of bytecodes that have a constant parameter.
hasfree~
Sequence of bytecodes that access a free variable.
hasname~
Sequence of bytecodes that access an attribute by name.
hasjrel~
Sequence of bytecodes that have a relative jump target.
hasjabs~
Sequence of bytecodes that have an absolute jump target.
haslocal~
Sequence of bytecodes that access a local variable.
hascompare~
Sequence of bytecodes of Boolean operations.
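The lookup tables can be combined as in the sketch below, which goes from an operation name to its number and back. NOP is used because it is a long-lived opcode; the numeric value differs between Python versions, so it is not shown.

```python
import dis

# Name -> number via opmap, number -> name via opname.
nop_code = dis.opmap["NOP"]
nop_name = dis.opname[nop_code]
print(nop_name)

# cmp_op holds the comparison operators as strings.
has_less_than = "<" in dis.cmp_op
```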
Python Bytecode Instructions
----------------------------
The Python compiler currently generates the following bytecode instructions.
STOP_CODE ()~
Indicates end-of-code to the compiler, not used by the interpreter.
NOP ()~
Do nothing code. Used as a placeholder by the bytecode optimizer.
POP_TOP ()~
Removes the top-of-stack (TOS) item.
ROT_TWO ()~
Swaps the two top-most stack items.
ROT_THREE ()~
Lifts second and third stack item one position up, moves top down to position
three.
ROT_FOUR ()~
Lifts second, third and fourth stack item one position up, moves top down to
position four.
DUP_TOP ()~
Duplicates the reference on top of the stack.
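The stack shuffles above can be modelled with a plain Python list whose last element plays the role of TOS. This is an illustrative model only, not interpreter code, and the helper names are hypothetical.

```python
def rot_two(stack):
    # Swap the two top-most items.
    stack[-1], stack[-2] = stack[-2], stack[-1]

def rot_three(stack):
    # Old TOS sinks to position three; the two items below it move up.
    stack[-3:] = [stack[-1], stack[-3], stack[-2]]

def rot_four(stack):
    # Old TOS sinks to position four; the three items below it move up.
    stack[-4:] = [stack[-1], stack[-4], stack[-3], stack[-2]]

def dup_top(stack):
    # Push a second reference to the object on top.
    stack.append(stack[-1])

s = [1, 2, 3]        # 3 is TOS
rot_three(s)
print(s)             # [3, 1, 2]
```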
Unary Operations take the top of the stack, apply the operation, and push the
result back on the stack.
UNARY_POSITIVE ()~
Implements ``TOS = +TOS``.
UNARY_NEGATIVE ()~
Implements ``TOS = -TOS``.
UNARY_NOT ()~
Implements ``TOS = not TOS``.
UNARY_CONVERT ()~
Implements ``TOS = `TOS```.
UNARY_INVERT ()~
Implements ``TOS = ~TOS``.
GET_ITER ()~
Implements ``TOS = iter(TOS)``.
Binary operations remove the top of the stack (TOS) and the second top-most
stack item (TOS1) from the stack. They perform the operation, and put the
result back on the stack.
BINARY_POWER ()~
Implements ``TOS = TOS1 ** TOS``.
BINARY_MULTIPLY ()~
Implements ``TOS = TOS1 * TOS``.
BINARY_DIVIDE ()~
Implements ``TOS = TOS1 / TOS`` when ``from __future__ import division`` is not
in effect.
BINARY_FLOOR_DIVIDE ()~
Implements ``TOS = TOS1 // TOS``.
BINARY_TRUE_DIVIDE ()~
Implements ``TOS = TOS1 / TOS`` when ``from __future__ import division`` is in
effect.
BINARY_MODULO ()~
Implements ``TOS = TOS1 % TOS``.
BINARY_ADD ()~
Implements ``TOS = TOS1 + TOS``.
BINARY_SUBTRACT ()~
Implements ``TOS = TOS1 - TOS``.
BINARY_SUBSCR ()~
Implements ``TOS = TOS1[TOS]``.
BINARY_LSHIFT ()~
Implements ``TOS = TOS1 << TOS``.
BINARY_RSHIFT ()~
Implements ``TOS = TOS1 >> TOS``.
BINARY_AND ()~
Implements ``TOS = TOS1 & TOS``.
BINARY_XOR ()~
Implements ``TOS = TOS1 ^ TOS``.
BINARY_OR ()~
Implements ``TOS = TOS1 | TOS``.
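The operand order matters for non-commutative operations: TOS is the right-hand operand and TOS1 the left-hand one. A minimal model, assuming a list-based stack and a hypothetical run_binary helper:

```python
import operator

# A few of the binary opcodes, mapped to the operation they implement.
BINARY_OPS = {
    "BINARY_POWER": operator.pow,
    "BINARY_MULTIPLY": operator.mul,
    "BINARY_ADD": operator.add,
    "BINARY_SUBTRACT": operator.sub,
    "BINARY_SUBSCR": operator.getitem,
}

def run_binary(stack, opname):
    tos = stack.pop()       # right-hand operand
    tos1 = stack.pop()      # left-hand operand
    stack.append(BINARY_OPS[opname](tos1, tos))

stack = [7, 3]              # 3 is TOS, 7 is TOS1
run_binary(stack, "BINARY_SUBTRACT")
print(stack)                # [4]
```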
In-place operations are like binary operations, in that they remove TOS and
TOS1, and push the result back on the stack, but the operation is done in-place
when TOS1 supports it, and the resulting TOS may be (but does not have to be)
the original TOS1.
INPLACE_POWER ()~
Implements in-place ``TOS = TOS1 ** TOS``.
INPLACE_MULTIPLY ()~
Implements in-place ``TOS = TOS1 * TOS``.
INPLACE_DIVIDE ()~
Implements in-place ``TOS = TOS1 / TOS`` when ``from __future__ import
division`` is not in effect.
INPLACE_FLOOR_DIVIDE ()~
Implements in-place ``TOS = TOS1 // TOS``.
INPLACE_TRUE_DIVIDE ()~
Implements in-place ``TOS = TOS1 / TOS`` when ``from __future__ import
division`` is in effect.
INPLACE_MODULO ()~
Implements in-place ``TOS = TOS1 % TOS``.
INPLACE_ADD ()~
Implements in-place ``TOS = TOS1 + TOS``.
INPLACE_SUBTRACT ()~
Implements in-place ``TOS = TOS1 - TOS``.
INPLACE_LSHIFT ()~
Implements in-place ``TOS = TOS1 << TOS``.
INPLACE_RSHIFT ()~
Implements in-place ``TOS = TOS1 >> TOS``.
INPLACE_AND ()~
Implements in-place ``TOS = TOS1 & TOS``.
INPLACE_XOR ()~
Implements in-place ``TOS = TOS1 ^ TOS``.
INPLACE_OR ()~
Implements in-place ``TOS = TOS1 | TOS``.
The slice opcodes take up to three parameters.
SLICE+0 ()~
Implements ``TOS = TOS[:]``.
SLICE+1 ()~
Implements ``TOS = TOS1[TOS:]``.
SLICE+2 ()~
Implements ``TOS = TOS1[:TOS]``.
SLICE+3 ()~
Implements ``TOS = TOS2[TOS1:TOS]``.
Slice assignment needs an additional parameter. Like any statement, it puts
nothing on the stack.
STORE_SLICE+0 ()~
Implements ``TOS[:] = TOS1``.
STORE_SLICE+1 ()~
Implements ``TOS1[TOS:] = TOS2``.
STORE_SLICE+2 ()~
Implements ``TOS1[:TOS] = TOS2``.
STORE_SLICE+3 ()~
Implements ``TOS2[TOS1:TOS] = TOS3``.
DELETE_SLICE+0 ()~
Implements ``del TOS[:]``.
DELETE_SLICE+1 ()~
Implements ``del TOS1[TOS:]``.
DELETE_SLICE+2 ()~
Implements ``del TOS1[:TOS]``.
DELETE_SLICE+3 ()~
Implements ``del TOS2[TOS1:TOS]``.
STORE_SUBSCR ()~
Implements ``TOS1[TOS] = TOS2``.
DELETE_SUBSCR ()~
Implements ``del TOS1[TOS]``.
Miscellaneous opcodes.
PRINT_EXPR ()~
Implements the expression statement for the interactive mode. TOS is removed
from the stack and printed. In non-interactive mode, an expression statement is
terminated with ``POP_TOP``.
PRINT_ITEM ()~
Prints TOS to the file-like object bound to ``sys.stdout``. There is one such
instruction for each item in the print statement.
PRINT_ITEM_TO ()~
Like ``PRINT_ITEM``, but prints the item second from TOS to the file-like object
at TOS. This is used by the extended print statement.
PRINT_NEWLINE ()~
Prints a new line on ``sys.stdout``. This is generated as the last operation of
a print statement, unless the statement ends with a comma.
PRINT_NEWLINE_TO ()~
Like ``PRINT_NEWLINE``, but prints the new line on the file-like object on the
TOS. This is used by the extended print statement.
BREAK_LOOP ()~
Terminates a loop due to a break statement.
CONTINUE_LOOP (target)~
Continues a loop due to a continue statement. {target} is the
address to jump to (which should be a ``FOR_ITER`` instruction).
LIST_APPEND (i)~
Calls ``list.append(TOS[-i], TOS)``. Used to implement list comprehensions.
While the appended value is popped off, the list object remains on the
stack so that it is available for further iterations of the loop.
LOAD_LOCALS ()~
Pushes a reference to the locals of the current scope on the stack. This is used
in the code for a class definition: After the class body is evaluated, the
locals are passed to the class definition.
RETURN_VALUE ()~
Returns with TOS to the caller of the function.
YIELD_VALUE ()~
Pops ``TOS`` and yields it from a generator.
IMPORT_STAR ()~
Loads all symbols not starting with ``'_'`` directly from the module TOS to the
local namespace. The module is popped after loading all names. This opcode
implements ``from module import *``.
EXEC_STMT ()~
Implements ``exec TOS2,TOS1,TOS``. The compiler fills missing optional
parameters with ``None``.
POP_BLOCK ()~
Removes one block from the block stack. Per frame, there is a stack of blocks,
denoting nested loops, try statements, and such.
END_FINALLY ()~
Terminates a finally clause. The interpreter recalls whether the
exception has to be re-raised, or whether the function returns, and continues
with the outer-next block.
BUILD_CLASS ()~
Creates a new class object. TOS is the methods dictionary, TOS1 the tuple of
the names of the base classes, and TOS2 the class name.
SETUP_WITH (delta)~
This opcode performs several operations before a with block starts. First,
it loads object.__exit__ from the context manager and pushes it onto
the stack for later use by WITH_CLEANUP. Then,
object.__enter__ is called, and a finally block pointing to {delta}
is pushed. Finally, the result of calling the enter method is pushed onto
the stack. The next opcode will either ignore it (POP_TOP), or
store it in (a) variable(s) (STORE_FAST, STORE_NAME, or
UNPACK_SEQUENCE).
WITH_CLEANUP ()~
Cleans up the stack when a with statement block exits. On top of
the stack are 1-3 values indicating how/why the finally clause was entered:
* TOP = ``None``
* (TOP, SECOND) = (``WHY_{RETURN,CONTINUE}``), retval
* TOP = ``WHY_*``; no retval below it
* (TOP, SECOND, THIRD) = exc_info()
Under them is EXIT, the context manager's __exit__ bound method.
In the last case, ``EXIT(TOP, SECOND, THIRD)`` is called, otherwise
``EXIT(None, None, None)``.
EXIT is removed from the stack, leaving the values above it in the same
order. In addition, if the stack represents an exception, {and} the function
call returns a 'true' value, this information is "zapped", to prevent
``END_FINALLY`` from re-raising the exception. (But non-local gotos should
still be resumed.)
All of the following opcodes expect arguments. An argument is two bytes, with
the more significant byte last.
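"More significant byte last" means the argument is stored little-endian. A small sketch of the decoding, using made-up byte values and a hypothetical helper name:

```python
import struct

def decode_arg(low, high):
    """Combine the two argument bytes that follow an opcode byte."""
    return low | (high << 8)

# Argument bytes 0x01 0x02 as they would appear in the bytecode string.
raw = bytes(bytearray([0x01, 0x02]))
arg = decode_arg(0x01, 0x02)
same = struct.unpack("<H", raw)[0]   # "<H": little-endian unsigned 16-bit
print(arg)                           # 513
```

This layout is the one described in this document; later Python versions use a different instruction encoding.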
STORE_NAME (namei)~
Implements ``name = TOS``. {namei} is the index of {name} in the attribute
co_names of the code object. The compiler tries to use ``STORE_FAST``
or ``STORE_GLOBAL`` if possible.
DELETE_NAME (namei)~
Implements ``del name``, where {namei} is the index into co_names
attribute of the code object.
UNPACK_SEQUENCE (count)~
Unpacks TOS into {count} individual values, which are put onto the stack
right-to-left.
DUP_TOPX (count)~
Duplicate {count} items, keeping them in the same order. Due to implementation
limits, {count} should be between 1 and 5 inclusive.
STORE_ATTR (namei)~
Implements ``TOS.name = TOS1``, where {namei} is the index of name in
co_names.
DELETE_ATTR (namei)~
Implements ``del TOS.name``, using {namei} as index into co_names.
STORE_GLOBAL (namei)~
Works as ``STORE_NAME``, but stores the name as a global.
DELETE_GLOBAL (namei)~
Works as ``DELETE_NAME``, but deletes a global name.
LOAD_CONST (consti)~
Pushes ``co_consts[consti]`` onto the stack.
LOAD_NAME (namei)~
Pushes the value associated with ``co_names[namei]`` onto the stack.
BUILD_TUPLE (count)~
Creates a tuple consuming {count} items from the stack, and pushes the resulting
tuple onto the stack.
BUILD_LIST (count)~
Works as ``BUILD_TUPLE``, but creates a list.
BUILD_MAP (count)~
Pushes a new dictionary object onto the stack. The dictionary is pre-sized
to hold {count} entries.
LOAD_ATTR (namei)~
Replaces TOS with ``getattr(TOS, co_names[namei])``.
COMPARE_OP (opname)~
Performs a Boolean operation. The operation name can be found in
``cmp_op[opname]``.
IMPORT_NAME (namei)~
Imports the module ``co_names[namei]``. TOS and TOS1 are popped and provide
the {fromlist} and {level} arguments of __import__. The module
object is pushed onto the stack. The current namespace is not affected:
for a proper import statement, a subsequent ``STORE_FAST`` instruction
modifies the namespace.
IMPORT_FROM (namei)~
Loads the attribute ``co_names[namei]`` from the module found in TOS. The
resulting object is pushed onto the stack, to be subsequently stored by a
``STORE_FAST`` instruction.
JUMP_FORWARD (delta)~
Increments bytecode counter by {delta}.
POP_JUMP_IF_TRUE (target)~
If TOS is true, sets the bytecode counter to {target}. TOS is popped.
POP_JUMP_IF_FALSE (target)~
If TOS is false, sets the bytecode counter to {target}. TOS is popped.
JUMP_IF_TRUE_OR_POP (target)~
If TOS is true, sets the bytecode counter to {target} and leaves TOS
on the stack. Otherwise (TOS is false), TOS is popped.
JUMP_IF_FALSE_OR_POP (target)~
If TOS is false, sets the bytecode counter to {target} and leaves
TOS on the stack. Otherwise (TOS is true), TOS is popped.
JUMP_ABSOLUTE (target)~
Set bytecode counter to {target}.
FOR_ITER (delta)~
``TOS`` is an iterator. Call its next() method. If this
yields a new value, push it on the stack (leaving the iterator below it). If
the iterator indicates it is exhausted ``TOS`` is popped, and the bytecode
counter is incremented by {delta}.
LOAD_GLOBAL (namei)~
Loads the global named ``co_names[namei]`` onto the stack.
SETUP_LOOP (delta)~
Pushes a block for a loop onto the block stack. The block spans from the
current instruction with a size of {delta} bytes.
SETUP_EXCEPT (delta)~
Pushes a try block from a try-except clause onto the block stack. {delta} points
to the first except block.
SETUP_FINALLY (delta)~
Pushes a try block from a try-finally clause onto the block stack. {delta} points
to the finally block.
STORE_MAP ()~
Store a key and value pair in a dictionary. Pops the key and value while leaving
the dictionary on the stack.
LOAD_FAST (var_num)~
Pushes a reference to the local ``co_varnames[var_num]`` onto the stack.
STORE_FAST (var_num)~
Stores TOS into the local ``co_varnames[var_num]``.
DELETE_FAST (var_num)~
Deletes local ``co_varnames[var_num]``.
LOAD_CLOSURE (i)~
Pushes a reference to the cell contained in slot {i} of the cell and free
variable storage. The name of the variable is ``co_cellvars[i]`` if {i} is
less than the length of {co_cellvars}. Otherwise it is ``co_freevars[i -
len(co_cellvars)]``.
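The slot-to-name rule for LOAD_CLOSURE can be written out directly. The helper closure_slot_name is hypothetical; co_cellvars and co_freevars are the real code-object attributes.

```python
def closure_slot_name(co_cellvars, co_freevars, i):
    """Name of the variable stored in slot i of the cell/free storage."""
    if i < len(co_cellvars):
        return co_cellvars[i]
    return co_freevars[i - len(co_cellvars)]

def outer():
    x = 1
    def inner():
        return x
    return inner

inner = outer()
# x is a cell variable in outer and a free variable in inner.
print(outer.__code__.co_cellvars, inner.__code__.co_freevars)
```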
LOAD_DEREF (i)~
Loads the cell contained in slot {i} of the cell and free variable storage.
Pushes a reference to the object the cell contains on the stack.
STORE_DEREF (i)~
Stores TOS into the cell contained in slot {i} of the cell and free variable
storage.
SET_LINENO (lineno)~
This opcode is obsolete.
RAISE_VARARGS (argc)~
Raises an exception. {argc} indicates the number of parameters to the raise
statement, ranging from 0 to 3. The handler will find the traceback as TOS2,
the parameter as TOS1, and the exception as TOS.
CALL_FUNCTION (argc)~
Calls a function. The low byte of {argc} indicates the number of positional
parameters, the high byte the number of keyword parameters. On the stack, the
opcode finds the keyword parameters first. For each keyword argument, the value
is on top of the key. Below the keyword parameters, the positional parameters
are on the stack, with the right-most parameter on top. Below the parameters,
the function object to call is on the stack. Pops all function arguments, and
the function itself off the stack, and pushes the return value.
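The packing of {argc} described above can be sketched as follows. split_argc is a hypothetical helper, and the encoding applies to the bytecode documented here rather than to later Python versions.

```python
def split_argc(argc):
    """Return (positional count, keyword count) packed into argc."""
    positional = argc & 0xFF          # low byte
    keyword = (argc >> 8) & 0xFF      # high byte
    return positional, keyword

# A call such as f(a, b, key=c) has two positional and one keyword argument.
print(split_argc(0x0102))             # (2, 1)
```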
MAKE_FUNCTION (argc)~
Pushes a new function object on the stack. TOS is the code associated with the
function. The function object is defined to have {argc} default parameters,
which are found below TOS.
MAKE_CLOSURE (argc)~
Creates a new function object, sets its {func_closure} slot, and pushes it on
the stack. TOS is the code associated with the function, TOS1 the tuple
containing cells for the closure's free variables. The function also has
{argc} default parameters, which are found below the cells.
BUILD_SLICE (argc)~
Pushes a slice object on the stack. {argc} must be 2 or 3. If it is 2,
``slice(TOS1, TOS)`` is pushed; if it is 3, ``slice(TOS2, TOS1, TOS)`` is
pushed. See the slice built-in function for more information.
EXTENDED_ARG (ext)~
Prefixes any opcode which has an argument too big to fit into the default two
bytes. {ext} holds two additional bytes which, taken together with the
subsequent opcode's argument, comprise a four-byte argument, {ext} being the two
most-significant bytes.
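Put as arithmetic, {ext} supplies the upper 16 bits and the following opcode's own argument the lower 16 bits. A minimal sketch with a hypothetical helper name:

```python
def combine_extended(ext, arg):
    """Build the four-byte argument from EXTENDED_ARG and the next opcode."""
    return (ext << 16) | arg

print(combine_extended(0x0001, 0x0000))   # 65536, one past the two-byte limit
```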
CALL_FUNCTION_VAR (argc)~
Calls a function. {argc} is interpreted as in ``CALL_FUNCTION``. The top element
on the stack contains the variable argument list, followed by keyword and
positional arguments.
CALL_FUNCTION_KW (argc)~
Calls a function. {argc} is interpreted as in ``CALL_FUNCTION``. The top element
on the stack contains the keyword arguments dictionary, followed by explicit
keyword and positional arguments.
CALL_FUNCTION_VAR_KW (argc)~
Calls a function. {argc} is interpreted as in ``CALL_FUNCTION``. The top
element on the stack contains the keyword arguments dictionary, followed by the
variable-arguments tuple, followed by explicit keyword and positional arguments.
HAVE_ARGUMENT ()~
This is not really an opcode. It identifies the dividing line between opcodes
which don't take arguments ``< HAVE_ARGUMENT`` and those which do ``>=
HAVE_ARGUMENT``.
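The dividing line can be used as a simple predicate. The sketch looks opcodes up by name because their numeric values, and the value of HAVE_ARGUMENT itself, vary between Python versions; takes_argument is a hypothetical helper.

```python
import dis

def takes_argument(op):
    """True if the numeric opcode op is at or above the dividing line."""
    return op >= dis.HAVE_ARGUMENT

print(takes_argument(dis.opmap["NOP"]))          # False
print(takes_argument(dis.opmap["LOAD_CONST"]))   # True
```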
==============================================================================
*py2stdlib-distutils*
distutils~
:synopsis: Support for building and installing Python modules into an existing Python
installation.
The distutils (|py2stdlib-distutils|) package provides support for building and installing
additional modules into a Python installation. The new modules may be either
100%-pure Python, or may be extension modules written in C, or may be
collections of Python packages which include modules coded in both Python and C.
This package is discussed in two separate chapters:
.. seealso::
distutils-index
The manual for developers and packagers of Python modules. This describes how
to prepare distutils (|py2stdlib-distutils|)-based packages so that they may be easily
installed into an existing Python installation.
install-index
An "administrators" manual which includes information on installing modules into
an existing Python installation. You do not need to be a Python programmer to
read this manual.
==============================================================================
*py2stdlib-dl*
dl~
:platform: Unix
:synopsis: Call C functions in shared objects.
Deprecated since version 2.6: The dl (|py2stdlib-dl|) module has been removed in Python 3.0.
Use the ctypes (|py2stdlib-ctypes|) module instead.
The dl (|py2stdlib-dl|) module defines an interface to the dlopen function, which
is the most common interface on Unix platforms for handling dynamically linked
libraries. It allows the program to call arbitrary functions in such a library.
.. warning::
The dl (|py2stdlib-dl|) module bypasses the Python type system and error handling. If
used incorrectly it may cause segmentation faults, crashes or other incorrect
behaviour.
.. note::
This module will not work unless ``sizeof(int) == sizeof(long) == sizeof(char *)``.
If this is not the case, SystemError will be raised on import.
The dl (|py2stdlib-dl|) module defines the following function:
open(name[, mode=RTLD_LAZY])~
Open a shared object file, and return a handle. Mode signifies late binding
(RTLD_LAZY) or immediate binding (RTLD_NOW). Default is
RTLD_LAZY. Note that some systems do not support RTLD_NOW.
Return value is a dlobject.
The dl (|py2stdlib-dl|) module defines the following constants:
RTLD_LAZY~
Useful as an argument to open.
RTLD_NOW~
Useful as an argument to open. Note that on systems which do not
support immediate binding, this constant will not appear in the module. For
maximum portability, use hasattr to determine if the system supports
immediate binding.
The dl (|py2stdlib-dl|) module defines the following exception:
error~
Exception raised when an error has occurred inside the dynamic loading and
linking routines.
Example:: >
>>> import dl, time
>>> a=dl.open('/lib/libc.so.6')
>>> a.call('time'), time.time()
(929723914, 929723914.498)
<
This example was tried on a Debian GNU/Linux system, and is a good example of
the fact that using this module is usually a bad alternative.
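Since the text recommends ctypes as the replacement, the same call can be sketched with it. This assumes a Unix-like system where ``ctypes.CDLL(None)`` exposes the C library already loaded into the process, and where ``time_t`` fits in a C long; neither holds everywhere.

```python
import ctypes
import time

# Handle to the symbols already loaded into this process (Unix only).
libc = ctypes.CDLL(None)

# Declare the C prototype: time_t time(time_t *tloc), called with NULL.
libc.time.restype = ctypes.c_long
libc.time.argtypes = [ctypes.c_void_p]

c_now = libc.time(None)
py_now = time.time()
print((c_now, py_now))
```

Unlike dl, ctypes lets the argument and return types be declared, which avoids the pointer-size restrictions noted above.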
Dl Objects
----------
Dl objects, as returned by open above, have the following methods:
dl.close()~
Free all resources, except the memory.
dl.sym(name)~
Return the pointer for the function named {name}, as a number, if it exists in
the referenced shared object, otherwise ``None``. This is useful in code like:: >
>>> if a.sym('time'):
...     a.call('time')
... else:
...     time.time()
<
(Note that this function will return a non-zero number, as zero is the {NULL}
pointer)
dl.call(name[, arg1[, arg2...]])~
Call the function named {name} in the referenced shared object. The arguments
must be either Python integers, which will be passed as is, Python strings, to
which a pointer will be passed, or ``None``, which will be passed as {NULL}.
Note that strings should only be passed to functions as const char*,
as Python will not like its string mutated.
There must be at most 10 arguments, and arguments not given will be treated as
``None``. The function's return value must be a C long, which is a
Python integer.
==============================================================================
*py2stdlib-doctest*
doctest~
:synopsis: Test pieces of code within docstrings.
The doctest (|py2stdlib-doctest|) module searches for pieces of text that look like interactive
Python sessions, and then executes those sessions to verify that they work
exactly as shown. There are several common ways to use doctest:
* To check that a module's docstrings are up-to-date by verifying that all
interactive examples still work as documented.
* To perform regression testing by verifying that interactive examples from a
test file or a test object work as expected.
* To write tutorial documentation for a package, liberally illustrated with
input-output examples. Depending on whether the examples or the expository text
are emphasized, this has the flavor of "literate testing" or "executable
documentation".
Here's a complete but small example module:: >
"""
This is the "example" module.

The example module supplies one function, factorial(). For example,

>>> factorial(5)
120
"""

def factorial(n):
    """Return the factorial of n, an exact integer >= 0.

    If the result is small enough to fit in an int, return an int.
    Else return a long.

    >>> [factorial(n) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> [factorial(long(n)) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> factorial(30)
    265252859812191058636308480000000L
    >>> factorial(30L)
    265252859812191058636308480000000L
    >>> factorial(-1)
    Traceback (most recent call last):
        ...
    ValueError: n must be >= 0

    Factorials of floats are OK, but the float must be an exact integer:
    >>> factorial(30.1)
    Traceback (most recent call last):
        ...
    ValueError: n must be exact integer
    >>> factorial(30.0)
    265252859812191058636308480000000L

    It must also not be ridiculously large:
    >>> factorial(1e100)
    Traceback (most recent call last):
        ...
    OverflowError: n too large
    """

    import math
    if not n >= 0:
        raise ValueError("n must be >= 0")
    if math.floor(n) != n:
        raise ValueError("n must be exact integer")
    if n+1 == n:  # catch a value like 1e300
        raise OverflowError("n too large")
    result = 1
    factor = 2
    while factor <= n:
        result *= factor
        factor += 1
    return result

if __name__ == "__main__":
    import doctest
    doctest.testmod()
<
If you run example.py directly from the command line, doctest (|py2stdlib-doctest|)
works its magic:: >
$ python example.py
$
<
There's no output! That's normal, and it means all the examples worked. Pass
-v to the script, and doctest (|py2stdlib-doctest|) prints a detailed log of what
it's trying, and prints a summary at the end:: >
    $ python example.py -v
    Trying:
        factorial(5)
    Expecting:
        120
    ok
    Trying:
        [factorial(n) for n in range(6)]
    Expecting:
        [1, 1, 2, 6, 24, 120]
    ok
    Trying:
        [factorial(long(n)) for n in range(6)]
    Expecting:
        [1, 1, 2, 6, 24, 120]
    ok
<
And so on, eventually ending with:: >
    Trying:
        factorial(1e100)
    Expecting:
        Traceback (most recent call last):
            ...
        OverflowError: n too large
    ok
    2 items passed all tests:
       1 tests in __main__
       8 tests in __main__.factorial
    9 tests in 2 items.
    9 passed and 0 failed.
    Test passed.
    $
<
That's all you need to know to start making productive use of doctest (|py2stdlib-doctest|)!
Jump in. The following sections provide full details. Note that there are many
examples of doctests in the standard Python test suite and libraries.
Especially useful examples can be found in the standard test file
Lib/test/test_doctest.py.
Simple Usage: Checking Examples in Docstrings
---------------------------------------------
The simplest way to start using doctest (but not necessarily the way you'll
continue to do it) is to end each module M with:: >
    if __name__ == "__main__":
        import doctest
        doctest.testmod()
<
doctest (|py2stdlib-doctest|) then examines docstrings in module M.
Running the module as a script causes the examples in the docstrings to get
executed and verified:: >
python M.py
<
This won't display anything unless an example fails, in which case the failing
example(s) and the cause(s) of the failure(s) are printed to stdout, and the
final line of output is ``***Test Failed*** N failures.``, where {N} is the
number of examples that failed.
Run it with the -v switch instead:: >
python M.py -v
<
and a detailed report of all examples tried is printed to standard output, along
with assorted summaries at the end.
You can force verbose mode by passing ``verbose=True`` to testmod, or
prohibit it by passing ``verbose=False``. In either of those cases,
``sys.argv`` is not examined by testmod (so passing -v or not
has no effect).
Since Python 2.6, there is also a command line shortcut for running
testmod. You can instruct the Python interpreter to run the doctest
module directly from the standard library and pass the module name(s) on the
command line:: >
python -m doctest -v example.py
<
This will import example.py as a standalone module and run
testmod on it. Note that this may not work correctly if the file is
part of a package and imports other submodules from that package.
For more information on testmod, see the section "Basic API" below.
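As a rough, self-contained sketch of the calls described in this section, the
code below builds a throwaway module object in memory and passes it to testmod.
The module name ``example_sketch`` is made up, and ``types.ModuleType`` plus
``exec`` are used only so that no file on disk is needed; the factorial body is
a shortened variant of the example module above, not a copy of it. The sketch
avoids Python 2-only syntax so that it should also run under Python 3.

```python
import doctest
import types

# Hypothetical module source; a shortened variant of the "example" module.
SOURCE = '''
def factorial(n):
    """Return the factorial of n, an exact integer >= 0.

    >>> [factorial(n) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> factorial(-1)
    Traceback (most recent call last):
        ...
    ValueError: n must be >= 0
    """
    if not n >= 0:
        raise ValueError("n must be >= 0")
    result = 1
    for factor in range(2, n + 1):
        result *= factor
    return result
'''

# Build an in-memory module so the sketch does not depend on a file.
example = types.ModuleType("example_sketch")
exec(SOURCE, example.__dict__)

# verbose=False forces quiet mode, so sys.argv is not examined for -v.
quiet = doctest.testmod(example, verbose=False)
```

Because no example fails, nothing is printed, and ``quiet`` should compare
equal to ``(0, 2)``: zero failures out of two examples.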
Simple Usage: Checking Examples in a Text File
----------------------------------------------
Another simple application of doctest is testing interactive examples in a text
file. This can be done with the testfile function:: >
import doctest
doctest.testfile("example.txt")
<
That short script executes and verifies any interactive Python examples
contained in the file example.txt. The file content is treated as if it
were a single giant docstring; the file doesn't need to contain a Python
program! For example, perhaps example.txt contains this:: >
    The ``example`` module
    ======================

    Using ``factorial``
    -------------------

    This is an example text file in reStructuredText format.  First import
    ``factorial`` from the ``example`` module:

        >>> from example import factorial

    Now use it:

        >>> factorial(6)
        120
<
Running ``doctest.testfile("example.txt")`` then finds the error in this
documentation:: >
    File "./example.txt", line 14, in example.txt
    Failed example:
        factorial(6)
    Expected:
        120
    Got:
        720
<
As with testmod, testfile won't display anything unless an
example fails. If an example does fail, then the failing example(s) and the
cause(s) of the failure(s) are printed to stdout, using the same format as
testmod.
By default, testfile looks for files in the calling module's directory.
See the section "Basic API" for a description of the optional arguments
that can be used to tell it to look for files in other locations.
Like testmod, testfile's verbosity can be set with the
-v command-line switch or with the optional keyword argument
{verbose}.
Since Python 2.6, there is also a command line shortcut for running
testfile. You can instruct the Python interpreter to run the doctest
module directly from the standard library and pass the file name(s) on the
command line:: >
python -m doctest -v example.txt
<
Because the file name does not end with .py, doctest (|py2stdlib-doctest|) infers that
it must be run with testfile, not testmod.
For more information on testfile, see the section "Basic API" below.
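A minimal sketch of testfile in action, assuming a temporary file rather than a
real example.txt: the file content below is hypothetical, and ``math.factorial``
stands in for the ``example`` module so that the sketch is self-contained.
``module_relative=False`` is passed because the temporary path is an ordinary
OS-specific path, not a path relative to the calling module's directory.

```python
import doctest
import os
import tempfile

# Hypothetical file content; math.factorial stands in for example.factorial.
TEXT = """\
Using ``factorial``
-------------------

    >>> from math import factorial
    >>> factorial(6)
    720
"""

handle, path = tempfile.mkstemp(suffix=".txt")
try:
    with os.fdopen(handle, "w") as stream:
        stream.write(TEXT)
    # module_relative=False makes testfile treat `path` as a plain OS path
    # instead of a path relative to the calling module's directory.
    outcome = doctest.testfile(path, module_relative=False, verbose=False)
finally:
    os.remove(path)
```

Both examples in the file pass, so ``outcome`` should compare equal to
``(0, 2)`` and nothing is printed.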
How It Works
------------
This section examines in detail how doctest works: which docstrings it looks at,
how it finds interactive examples, what execution context it uses, how it
handles exceptions, and how option flags can be used to control its behavior.
This is the information that you need to know to write doctest examples; for
information about actually running doctest on these examples, see the following
sections.
Which Docstrings Are Examined?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The module docstring, and all function, class and method docstrings are
searched. Objects imported into the module are not searched.
In addition, if ``M.__test__`` exists and "is true", it must be a dict, and each
entry maps a (string) name to a function object, class object, or string.
Function and class object docstrings found from ``M.__test__`` are searched, and
strings are treated as if they were docstrings. In output, a key ``K`` in
``M.__test__`` appears with name :: >
<name of M>.__test__.K
<
Any classes found are recursively searched similarly, to test docstrings in
their contained methods and nested classes.
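To make the naming rule for ``M.__test__`` concrete, here is a small sketch. It
assumes an in-memory module (the name ``demo_test_dict``, the function
``double`` and the key ``extra`` are all invented for illustration) and uses
DocTestFinder only to list which docstrings would be searched.

```python
import doctest
import types

# Hypothetical module source with one function and a __test__ dict.
SOURCE = '''
def double(x):
    """
    >>> double(2)
    4
    """
    return 2 * x

__test__ = {
    "extra": """
    >>> double(21)
    42
    """,
}
'''

demo = types.ModuleType("demo_test_dict")
exec(SOURCE, demo.__dict__)

# Collect the names of every test the finder extracts from the module.
names = sorted(test.name for test in doctest.DocTestFinder().find(demo))
```

With this setup ``names`` should be
``['demo_test_dict.__test__.extra', 'demo_test_dict.double']``: the string
stored under the key ``extra`` is treated like a docstring and reported under
the ``__test__`` name.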
.. versionchanged:: 2.4
A "private name" concept is deprecated and no longer documented.
How are Docstring Examples Recognized?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In most cases a copy-and-paste of an interactive console session works fine,
but doctest isn't trying to do an exact emulation of any specific Python shell.
:: >
    >>> # comments are ignored
    >>> x = 12
    >>> x
    12
    >>> if x == 13:
    ...     print "yes"
    ... else:
    ...     print "no"
    ...     print "NO"
    ...     print "NO!!!"
    ...
    no
    NO
    NO!!!
    >>>
<
Any expected output must immediately follow the final ``'>>> '`` or ``'... '``
line containing the code, and the expected output (if any) extends to the next
``'>>> '`` or all-whitespace line.
The fine print:
* Expected output cannot contain an all-whitespace line, since such a line is
taken to signal the end of expected output. If expected output does contain a
blank line, put ``<BLANKLINE>`` in your doctest example each place a blank line
is expected.
.. versionadded:: 2.4
``<BLANKLINE>`` was added; there was no way to use expected output containing
empty lines in previous versions.
* All hard tab characters are expanded to spaces, using 8-column tab stops.
Tabs in output generated by the tested code are not modified. Because any
hard tabs in the sample output {are} expanded, this means that if the code
output includes hard tabs, the only way the doctest can pass is if the
NORMALIZE_WHITESPACE option or directive is in effect.
Alternatively, the test can be rewritten to capture the output and compare it
to an expected value as part of the test. This handling of tabs in the
source was arrived at through trial and error, and has proven to be the least
error prone way of handling them. It is possible to use a different
algorithm for handling tabs by writing a custom DocTestParser class.
.. versionchanged:: 2.4
Expanding tabs to spaces is new; previous versions tried to preserve hard tabs,
with confusing results.
* Output to stdout is captured, but not output to stderr (exception tracebacks
are captured via a different means).
* If you continue a line via backslashing in an interactive session, or for any
other reason use a backslash, you should use a raw docstring, which will
preserve your backslashes exactly as you type them:: >
    >>> def f(x):
    ...     r'''Backslashes in a raw docstring: m\n'''
    >>> print f.__doc__
    Backslashes in a raw docstring: m\n
<
Otherwise, the backslash will be interpreted as part of the string. For example,
the "\n" above would be interpreted as a newline character. Alternatively, you
can double each backslash in the doctest version (and not use a raw string):: >
    >>> def f(x):
    ...     '''Backslashes in a raw docstring: m\\n'''
    >>> print f.__doc__
    Backslashes in a raw docstring: m\n
<
* The starting column doesn't matter:: >
      >>> assert "Easy!"
            >>> import math
                >>> math.floor(1.9)
                1.0
<
and as many leading whitespace characters are stripped from the expected output
as appeared in the initial ``'>>> '`` line that started the example.
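The recognition rules above can be observed with DocTestParser, which splits a
string into examples without running them. The following is only a sketch: the
text in ``TEXT`` is invented, and ``print()`` is written as a function so that
the same text is valid in Python 2.7 and Python 3.

```python
import doctest

# Hypothetical documentation text; prose and examples are mixed freely.
TEXT = '''
Some prose.

    >>> x = 12
    >>> if x == 13:
    ...     print("yes")
    ... else:
    ...     print("no")
    no
    >>> print("a\\n\\nb")
    a
    <BLANKLINE>
    b

More prose.
'''

examples = doctest.DocTestParser().get_examples(TEXT)
# Each example carries the source to run and the expected output ("want").
pairs = [(example.source, example.want) for example in examples]
```

Three examples are found. The first has an empty ``want``, the second expects
``'no\n'``, and the third expects ``'a\n<BLANKLINE>\nb\n'``; the expected
output stops at the blank line before "More prose.".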
What's the Execution Context?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
By default, each time doctest (|py2stdlib-doctest|) finds a docstring to test, it uses a
{shallow copy} of M's globals, so that running tests doesn't change the
module's real globals, and so that one test in M can't leave behind
crumbs that accidentally allow another test to work. This means examples can
freely use any names defined at top-level in M, and names defined earlier
in the docstring being run. Examples cannot see names defined in other
docstrings.
You can force use of your own dict as the execution context by passing
``globs=your_dict`` to testmod or testfile instead.
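A small sketch of the isolation described above, assuming an in-memory module
whose names (``demo_context``, ``TOP_LEVEL``, ``first``, ``second`` and
``crumb``) are all made up for illustration. One docstring defines a name; the
other checks that the name did not leak into its own copy of the globals.

```python
import doctest
import types

# Hypothetical module source with two independent docstrings.
SOURCE = '''
TOP_LEVEL = 10

def first():
    """
    >>> crumb = TOP_LEVEL + 1   # module-level names are visible
    >>> crumb
    11
    """

def second():
    """
    >>> "crumb" in globals()    # names from first() are not visible here
    False
    >>> TOP_LEVEL
    10
    """
'''

demo = types.ModuleType("demo_context")
exec(SOURCE, demo.__dict__)
result = doctest.testmod(demo, verbose=False)
```

All four examples should pass, and ``crumb`` never appears in
``demo.__dict__``, since each docstring runs against a shallow copy of the
module's globals.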
What About Exceptions?
^^^^^^^^^^^^^^^^^^^^^^
No problem, provided that the traceback is the only output produced by the
example: just paste in the traceback. [#]_ Since tracebacks contain details
that are likely to change rapidly (for example, exact file paths and line
numbers), this is one case where doctest works hard to be flexible in what it
accepts.
Simple example:: >
    >>> [1, 2, 3].remove(42)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: list.remove(x): x not in list
<
That doctest succeeds if ValueError is raised, with the ``list.remove(x):
x not in list`` detail as shown.
The expected output for an exception must start with a traceback header, which
may be either of the following two lines, indented the same as the first line of
the example:: >
Traceback (most recent call last):
Traceback (innermost last):
<
The traceback header is followed by an optional traceback stack, whose contents
are ignored by doctest. The traceback stack is typically omitted, or copied
verbatim from an interactive session.
The traceback stack is followed by the most interesting part: the line(s)
containing the exception type and detail. This is usually the last line of a
traceback, but can extend across multiple lines if the exception has a
multi-line detail:: >
    >>> raise ValueError('multi\n line\ndetail')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: multi
     line
    detail
<
The last three lines (starting with ValueError) are compared against the
exception's type and detail, and the rest are ignored.
.. versionchanged:: 2.4
Previous versions were unable to handle multi-line exception details.
Best practice is to omit the traceback stack, unless it adds significant
documentation value to the example. So the last example is probably better as:: >
    >>> raise ValueError('multi\n line\ndetail')
    Traceback (most recent call last):
        ...
    ValueError: multi
     line
    detail
<
Note that tracebacks are treated very specially. In particular, in the
rewritten example, the use of ``...`` is independent of doctest's
ELLIPSIS option. The ellipsis in that example could be left out, or
could just as well be three (or three hundred) commas or digits, or an indented
transcript of a Monty Python skit.
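That the traceback stack really is ignored can be checked with a short sketch.
It assumes the DocTestParser and DocTestRunner classes are used directly on an
invented string (the test name ``exception_demo`` is hypothetical), and it
discards any failure report instead of printing it.

```python
import doctest

# Hypothetical example: the indented middle line plays the role of the stack.
TEXT = '''
>>> raise ValueError("n must be >= 0")
Traceback (most recent call last):
    anything indented here is treated as the traceback stack and ignored
ValueError: n must be >= 0
'''

test = doctest.DocTestParser().get_doctest(TEXT, {}, "exception_demo", None, 0)
# Read the parsed exception message before running; only this part is compared.
expected_message = test.examples[0].exc_msg

runner = doctest.DocTestRunner(verbose=False)
result = runner.run(test, out=lambda text: None)  # discard any report
```

The example passes even though no ELLIPSIS option is in effect, and
``expected_message`` holds just ``'ValueError: n must be >= 0\n'``.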
Some details you should read once, but won't need to remember:
* Doctest can't guess whether your expected output came from an exception
traceback or from ordinary printing. So, e.g., an example that expects
``ValueError: 42 is prime`` will pass whether ValueError is actually
raised or if the example merely prints that traceback text. In practice,
ordinary output rarely begins with a traceback header line, so this doesn't
create real problems.
* Each line of the traceback stack (if present) must be indented further than
the first line of the example, {or} start with a non-alphanumeric character.
The first line following the traceback header indented the same and starting
with an alphanumeric is taken to be the start of the exception detail. Of
course this does the right thing for genuine tracebacks.
* When the IGNORE_EXCEPTION_DETAIL doctest option is specified,
everything following the leftmost colon and any module information in the
exception name is ignored.
* The interactive shell omits the traceback header line for some
SyntaxErrors. But doctest uses the traceback header line to
distinguish exceptions from non-exceptions. So in the rare case where you need
to test a SyntaxError that omits the traceback header, you will need to
manually add the traceback header line to your test example.
* For some SyntaxErrors, Python displays the character position of the
syntax error, using a ``^`` marker:: >
    >>> 1 1
      File "<stdin>", line 1
        1 1
          ^
    SyntaxError: invalid syntax
<
Since the lines showing the position of the error come before the exception type
and detail, they are not checked by doctest. For example, the following test
would pass, even though it puts the ``^`` marker in the wrong location:: >
    >>> 1 1
    Traceback (most recent call last):
      File "<stdin>", line 1
        1 1
        ^
    SyntaxError: invalid syntax
<
Option Flags and Directives
A number of option flags control various aspects of doctest's behavior.
Symbolic names for the flags are supplied as module constants, which can be
or'ed together and passed to various functions. The names can also be used in
doctest directives (see below).
The first group of options define test semantics, controlling aspects of how
doctest decides whether actual output matches an example's expected output:
DONT_ACCEPT_TRUE_FOR_1~
By default, if an expected output block contains just ``1``, an actual output
block containing just ``1`` or just ``True`` is considered to be a match, and
similarly for ``0`` versus ``False``. When DONT_ACCEPT_TRUE_FOR_1 is
specified, neither substitution is allowed. The default behavior accommodates the
fact that Python changed the return type of many functions from integer to boolean;
doctests expecting "little integer" output still work in these cases. This
option will probably go away, but not for several years.
DONT_ACCEPT_BLANKLINE~
By default, if an expected output block contains a line containing only the
string ``<BLANKLINE>``, then that line will match a blank line in the actual
output. Because a genuinely blank line delimits the expected output, this is
the only way to communicate that a blank line is expected. When
DONT_ACCEPT_BLANKLINE is specified, this substitution is not allowed.
NORMALIZE_WHITESPACE~
When specified, all sequences of whitespace (blanks and newlines) are treated as
equal. Any sequence of whitespace within the expected output will match any
sequence of whitespace within the actual output. By default, whitespace must
match exactly. NORMALIZE_WHITESPACE is especially useful when a line of
expected output is very long, and you want to wrap it across multiple lines in
your source.
ELLIPSIS~
When specified, an ellipsis marker (``...``) in the expected output can match
any substring in the actual output. This includes substrings that span line
boundaries, and empty substrings, so it's best to keep usage of this simple.
Complicated uses can lead to the same kinds of "oops, it matched too much!"
surprises that ``.*`` is prone to in regular expressions.
IGNORE_EXCEPTION_DETAIL~
When specified, an example that expects an exception passes if an exception of
the expected type is raised, even if the exception detail does not match. For
example, an example expecting ``ValueError: 42`` will pass if the actual
exception raised is ``ValueError: 3*14``, but will fail, e.g., if
TypeError is raised.
It will also ignore the module name used in Python 3 doctest reports. Hence
both these variations will work regardless of whether the test is run under
Python 2.7 or Python 3.2 (or later versions): >
    >>> raise CustomError('message') #doctest: +IGNORE_EXCEPTION_DETAIL
    Traceback (most recent call last):
    CustomError: message

    >>> raise CustomError('message') #doctest: +IGNORE_EXCEPTION_DETAIL
    Traceback (most recent call last):
    my_module.CustomError: message
<
Note that ELLIPSIS can also be used to ignore the
details of the exception message, but such a test may still fail based
on whether or not the module details are printed as part of the
exception name. Using IGNORE_EXCEPTION_DETAIL and the details
from Python 2.3 is also the only clear way to write a doctest that doesn't
care about the exception detail yet continues to pass under Python 2.3 or
earlier (those releases do not support doctest directives and ignore them
as irrelevant comments). For example, :: >
    >>> (1, 2)[3] = 'moo' #doctest: +IGNORE_EXCEPTION_DETAIL
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: object doesn't support item assignment
<
passes under Python 2.3 and later Python versions, even though the detail
changed in Python 2.4 to say "does not" instead of "doesn't".
.. versionchanged:: 2.7
IGNORE_EXCEPTION_DETAIL now also ignores any information
relating to the module containing the exception under test
SKIP~
When specified, do not run the example at all. This can be useful in contexts
where doctest examples serve as both documentation and test cases, and an
example should be included for documentation purposes, but should not be
checked. E.g., the example's output might be random; or the example might
depend on resources which would be unavailable to the test driver.
The SKIP flag can also be used for temporarily "commenting out" examples.
.. versionadded:: 2.5
COMPARISON_FLAGS~
A bitmask or'ing together all the comparison flags above.
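The effect of the comparison flags can be tried out in isolation with
OutputChecker.check_output, which takes the expected text, the actual text and
the option flags. This is a sketch with invented sample strings, not a
recommendation to call the checker directly in everyday tests.

```python
import doctest

checker = doctest.OutputChecker()

# ELLIPSIS: "..." in the expected text may stand for any substring.
want = "[0, 1, ..., 18, 19]\n"
got = str(list(range(20))) + "\n"
exact = checker.check_output(want, got, 0)
with_ellipsis = checker.check_output(want, got, doctest.ELLIPSIS)

# NORMALIZE_WHITESPACE: any run of whitespace matches any other run.
wrapped = "[0, 1, 2,\n 3, 4]\n"
flat = "[0, 1, 2, 3, 4]\n"
normalized = checker.check_output(wrapped, flat, doctest.NORMALIZE_WHITESPACE)

# DONT_ACCEPT_TRUE_FOR_1: by default an expected "1" accepts "True".
lenient = checker.check_output("1\n", "True\n", 0)
strict = checker.check_output("1\n", "True\n", doctest.DONT_ACCEPT_TRUE_FOR_1)
```

Here ``exact`` and ``strict`` are false, while ``with_ellipsis``,
``normalized`` and ``lenient`` are true.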
The second group of options controls how test failures are reported:
REPORT_UDIFF~
When specified, failures that involve multi-line expected and actual outputs are
displayed using a unified diff.
REPORT_CDIFF~
When specified, failures that involve multi-line expected and actual outputs
will be displayed using a context diff.
REPORT_NDIFF~
When specified, differences are computed by ``difflib.Differ``, using the same
algorithm as the popular ndiff.py utility. This is the only method that
marks differences within lines as well as across lines. For example, if a line
of expected output contains digit ``1`` where actual output contains letter
``l``, a line is inserted with a caret marking the mismatching column positions.
REPORT_ONLY_FIRST_FAILURE~
When specified, display the first failing example in each doctest, but suppress
output for all remaining examples. This will prevent doctest from reporting
correct examples that break because of earlier failures; but it might also hide
incorrect examples that fail independently of the first failure. When
REPORT_ONLY_FIRST_FAILURE is specified, the remaining examples are
still run, and still count towards the total number of failures reported; only
the output is suppressed.
REPORTING_FLAGS~
A bitmask or'ing together all the reporting flags above.
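As an illustration of what a reporting flag changes, the sketch below asks
OutputChecker.output_difference for the failure text with and without
REPORT_UDIFF. The example source ``show()`` and the three-line outputs are
invented; the example is never executed, only its expected output is compared
with the supplied ``got`` string.

```python
import doctest

checker = doctest.OutputChecker()

# Hypothetical example: source is never run here, only `want` is used.
example = doctest.Example("show()\n", "alpha\nbeta\ngamma\n")
got = "alpha\nBETA\ngamma\n"

# Default report: the expected and actual blocks are shown in full.
plain = checker.output_difference(example, got, 0)
# REPORT_UDIFF: multi-line outputs are shown as a unified diff instead.
unified = checker.output_difference(example, got, doctest.REPORT_UDIFF)
```

``plain`` starts with ``Expected:``, whereas ``unified`` announces a unified
diff and marks the changed line with ``-beta`` and ``+BETA``.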
"Doctest directives" may be used to modify the option flags for individual
examples. Doctest directives are expressed as a special Python comment
following an example's source code:
.. productionlist:: doctest
    directive:             "#" "doctest:" directive_options
    directive_options:     directive_option ("," directive_option)*
    directive_option:      on_or_off directive_option_name
    on_or_off:             "+" | "-"
    directive_option_name: "DONT_ACCEPT_BLANKLINE" | "NORMALIZE_WHITESPACE" | ...
Whitespace is not allowed between the ``+`` or ``-`` and the directive option
name. The directive option name can be any of the option flag names explained
above.
An example's doctest directives modify doctest's behavior for that single
example. Use ``+`` to enable the named behavior, or ``-`` to disable it.
For example, this test passes:: >
    >>> print range(20) #doctest: +NORMALIZE_WHITESPACE
    [0,   1,  2,  3,  4,  5,  6,  7,  8,  9,
    10,  11, 12, 13, 14, 15, 16, 17, 18, 19]
<
Without the directive it would fail, both because the actual output doesn't have
two blanks before the single-digit list elements, and because the actual output
is on a single line. This test also passes, and also requires a directive to do
so:: >
>>> print range(20) # doctest:+ELLIPSIS
[0, 1, ..., 18, 19]
<
Multiple directives can be used on a single physical line, separated by commas:: >
    >>> print range(20) # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE
    [0, 1, ..., 18, 19]
<
If multiple directive comments are used for a single example, then they are
combined:: >
>>> print range(20) # doctest: +ELLIPSIS
... # doctest: +NORMALIZE_WHITESPACE
[0, 1, ..., 18, 19]
<
As the previous example shows, you can add ``...`` lines to your example
containing only directives. This can be useful when an example is too long for
a directive to comfortably fit on the same line:: >
>>> print range(5) + range(10,20) + range(30,40) + range(50,60)
... # doctest: +ELLIPSIS
[0, ..., 4, 10, ..., 19, 30, ..., 39, 50, ..., 59]
<
Note that since all options are disabled by default, and directives apply only
to the example they appear in, enabling options (via ``+`` in a directive) is
usually the only meaningful choice. However, option flags can also be passed to
functions that run doctests, establishing different defaults. In such cases,
disabling an option via ``-`` in a directive can be useful.
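The interplay between directives and default option flags can be sketched as
follows. The text in ``TEXT``, the helper ``run`` and the test name
``directive_demo`` are assumptions made for this illustration; ``print()`` is
written as a function so the examples behave the same in Python 2.7 and 3.

```python
import doctest

# Hypothetical text: only the first example carries a directive.
TEXT = '''
>>> print(list(range(20)))  # doctest: +ELLIPSIS
[0, 1, ..., 18, 19]
>>> print(list(range(20)))
[0, 1, ..., 18, 19]
'''


def run(optionflags):
    """Parse TEXT afresh and run it with the given default option flags."""
    test = doctest.DocTestParser().get_doctest(TEXT, {}, "directive_demo", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    return runner.run(test, out=lambda text: None)  # discard failure reports


defaults_off = run(0)                 # directive applies to example 1 only
defaults_on = run(doctest.ELLIPSIS)   # ELLIPSIS is the default for both
options = doctest.DocTestParser().get_examples(TEXT)[0].options
```

With no default flags the second example fails, so ``defaults_off`` is
``(1, 2)``; with ELLIPSIS as a default both pass and ``defaults_on`` is
``(0, 2)``. The parsed directive is visible as ``{doctest.ELLIPSIS: True}`` in
``options``.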
.. versionadded:: 2.4
Doctest directives and the associated constants
DONT_ACCEPT_BLANKLINE, NORMALIZE_WHITESPACE,
ELLIPSIS, IGNORE_EXCEPTION_DETAIL, REPORT_UDIFF,
REPORT_CDIFF, REPORT_NDIFF,
REPORT_ONLY_FIRST_FAILURE, COMPARISON_FLAGS and
REPORTING_FLAGS were added.
There's also a way to register new option flag names, although this isn't useful
unless you intend to extend doctest (|py2stdlib-doctest|) internals via subclassing:
register_optionflag(name)~
Create a new option flag with a given name, and return the new flag's integer
value. register_optionflag can be used when subclassing
OutputChecker or DocTestRunner to create new options that are
supported by your subclasses. register_optionflag should always be
called using the following idiom:: >
MY_FLAG = register_optionflag('MY_FLAG')
<
.. versionadded:: 2.4
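A possible use of register_optionflag, shown only as a sketch: the flag name
``IGNORE_CASE`` and the class ``CaseInsensitiveChecker`` are hypothetical and
not part of doctest. The subclass lowercases both texts when the new flag is
set and otherwise defers to the standard comparison.

```python
import doctest

# Hypothetical flag name, registered with the idiom shown above.
IGNORE_CASE = doctest.register_optionflag("IGNORE_CASE")


class CaseInsensitiveChecker(doctest.OutputChecker):
    """Hypothetical checker that honors the IGNORE_CASE flag."""

    def check_output(self, want, got, optionflags):
        if optionflags & IGNORE_CASE:
            want, got = want.lower(), got.lower()
        return doctest.OutputChecker.check_output(self, want, got, optionflags)


# Once registered, the flag name can be used in a directive.
TEXT = '''
>>> print("Hello")  # doctest: +IGNORE_CASE
HELLO
'''
test = doctest.DocTestParser().get_doctest(TEXT, {}, "flag_demo", None, 0)
runner = doctest.DocTestRunner(checker=CaseInsensitiveChecker(), verbose=False)
result = runner.run(test, out=lambda text: None)  # discard any report
```

The single example passes, so ``result`` should be ``(0, 1)``. The flag has to
be registered before the text is parsed, because the parser rejects directive
names it does not know.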
Warnings
^^^^^^^^
doctest (|py2stdlib-doctest|) is serious about requiring exact matches in expected output. If
even a single character doesn't match, the test fails. This will probably
surprise you a few times, as you learn exactly what Python does and doesn't
guarantee about output. For example, when printing a dict, Python doesn't
guarantee that the key-value pairs will be printed in any particular order, so a
test like :: >
>>> foo()
{"Hermione": "hippogryph", "Harry": "broomstick"}
<
is vulnerable! One workaround is to do :: >
    >>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}
    True
<
instead. Another is to do :: >
>>> d = foo().items()
>>> d.sort()
>>> d
[('Harry', 'broomstick'), ('Hermione', 'hippogryph')]
<
There are others, but you get the idea.
Another bad idea is to print things that embed an object address, like :: >
>>> id(1.0) # certain to fail some of the time
7948648
>>> class C: pass
>>> C() # the default repr() for instances embeds an address
<__main__.C instance at 0x00AC18F0>
<
The ELLIPSIS directive gives a nice approach for the last example:: >
    >>> C() #doctest: +ELLIPSIS
    <__main__.C instance at 0x...>
<
Floating-point numbers are also subject to small output variations across
platforms, because Python defers to the platform C library for float formatting,
and C libraries vary widely in quality here. :: >
>>> 1./7 # risky
0.14285714285714285
>>> print 1./7 # safer
0.142857142857
>>> print round(1./7, 6) # much safer
0.142857
<
Numbers of the form ``I/2.**J`` are safe across all platforms, and I often
contrive doctest examples to produce numbers of that form:: >
>>> 3./4 # utterly safe
0.75
<
Simple fractions are also easier for people to understand, and that makes for
better documentation.
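The workarounds from this section can be collected into one runnable sketch.
The function ``foo`` and the class ``C`` are stand-ins defined inside the
hypothetical text itself, ``sorted()`` replaces the in-place ``sort()`` call,
and ``print()`` is written as a function so the text runs under Python 2.7 and
Python 3 alike.

```python
import doctest

# Hypothetical doctest text using the robust forms discussed above.
TEXT = '''
>>> def foo():
...     return {"Hermione": "hippogryph", "Harry": "broomstick"}
>>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}
True
>>> sorted(foo().items())
[('Harry', 'broomstick'), ('Hermione', 'hippogryph')]
>>> class C(object): pass
>>> C()  # doctest: +ELLIPSIS
<...C object at 0x...>
>>> print(round(1. / 7, 6))
0.142857
'''

test = doctest.DocTestParser().get_doctest(TEXT, {}, "warnings_demo", None, 0)
runner = doctest.DocTestRunner(verbose=False)
result = runner.run(test, out=lambda text: None)  # discard any report
```

None of the six examples depends on dict ordering, object addresses or
platform float formatting, so ``result`` should be ``(0, 6)``.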
Basic API
---------
The functions testmod and testfile provide a simple interface to
doctest that should be sufficient for most basic uses. For a less formal
introduction to these two functions, see the sections "Simple Usage: Checking
Examples in Docstrings" and "Simple Usage: Checking Examples in a Text File".
testfile(filename[, module_relative][, name][, package][, globs][, verbose][, report][, optionflags][, extraglobs][, raise_on_error][, parser][, encoding])~
All arguments except {filename} are optional, and should be specified in keyword
form.
Test examples in the file named {filename}. Return ``(failure_count,
test_count)``.
Optional argument {module_relative} specifies how the filename should be
interpreted:
* If {module_relative} is ``True`` (the default), then {filename} specifies an
OS-independent module-relative path. By default, this path is relative to the
calling module's directory; but if the {package} argument is specified, then it
is relative to that package. To ensure OS-independence, {filename} should use
``/`` characters to separate path segments, and may not be an absolute path
(i.e., it may not begin with ``/``).
* If {module_relative} is ``False``, then {filename} specifies an OS-specific
path. The path may be absolute or relative; relative paths are resolved with
respect to the current working directory.
Optional argument {name} gives the name of the test; by default, or if ``None``,
``os.path.basename(filename)`` is used.
Optional argument {package} is a Python package or the name of a Python package
whose directory should be used as the base directory for a module-relative
filename. If no package is specified, then the calling module's directory is
used as the base directory for module-relative filenames. It is an error to
specify {package} if {module_relative} is ``False``.
Optional argument {globs} gives a dict to be used as the globals when executing
examples. A new shallow copy of this dict is created for the doctest, so its
examples start with a clean slate. By default, or if ``None``, a new empty dict
is used.
Optional argument {extraglobs} gives a dict merged into the globals used to
execute examples. This works like dict.update: if {globs} and
{extraglobs} have a common key, the associated value in {extraglobs} appears in
the combined dict. By default, or if ``None``, no extra globals are used. This
is an advanced feature that allows parameterization of doctests. For example, a
doctest can be written for a base class, using a generic name for the class,
then reused to test any number of subclasses by passing an {extraglobs} dict
mapping the generic name to the subclass to be tested.
Optional argument {verbose} prints lots of stuff if true, and prints only
failures if false; by default, or if ``None``, it's true if and only if ``'-v'``
is in ``sys.argv``.
Optional argument {report} prints a summary at the end when true, else prints
nothing at the end. In verbose mode, the summary is detailed, else the summary
is very brief (in fact, empty if all tests passed).
Optional argument {optionflags} or's together option flags. See the section
"Option Flags and Directives".
Optional argument {raise_on_error} defaults to false. If true, an exception is
raised upon the first failure or unexpected exception in an example. This
allows failures to be post-mortem debugged. Default behavior is to continue
running examples.
Optional argument {parser} specifies a DocTestParser (or subclass) that
should be used to extract tests from the files. It defaults to a normal parser
(i.e., ``DocTestParser()``).
Optional argument {encoding} specifies an encoding that should be used to
convert the file to unicode.
.. versionadded:: 2.4
.. versionchanged:: 2.5
The parameter {encoding} was added.
testmod([m][, name][, globs][, verbose][, report][, optionflags][, extraglobs][, raise_on_error][, exclude_empty])~
All arguments are optional, and all except for {m} should be specified in
keyword form.
Test examples in docstrings in functions and classes reachable from module {m}
(or module __main__ (|py2stdlib-__main__|) if {m} is not supplied or is ``None``), starting with
``m.__doc__``.
Also test examples reachable from dict ``m.__test__``, if it exists and is not
``None``. ``m.__test__`` maps names (strings) to functions, classes and
strings; function and class docstrings are searched for examples; strings are
searched directly, as if they were docstrings.
Only docstrings attached to objects belonging to module {m} are searched.
Return ``(failure_count, test_count)``.
Optional argument {name} gives the name of the module; by default, or if
``None``, ``m.__name__`` is used.
Optional argument {exclude_empty} defaults to false. If true, objects for which
no doctests are found are excluded from consideration. The default is a backward
compatibility hack, so that code still using doctest.master.summarize in
conjunction with testmod continues to get output for objects with no
tests. The {exclude_empty} argument to the newer DocTestFinder
constructor defaults to true.
Optional arguments {extraglobs}, {verbose}, {report}, {optionflags},
{raise_on_error}, and {globs} are the same as for function testfile
above, except that {globs} defaults to ``m.__dict__``.
.. versionchanged:: 2.3
The parameter {optionflags} was added.
.. versionchanged:: 2.4
The parameters {extraglobs}, {raise_on_error} and {exclude_empty} were added.
.. versionchanged:: 2.5
The optional argument {isprivate}, deprecated in 2.4, was removed.
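The {extraglobs} parameterization described for testfile works the same way
with testmod. The sketch below assumes an in-memory module; its name
``demo_extraglobs``, the function ``describe`` and the generic name ``THING``
are invented for illustration.

```python
import doctest
import types

# Hypothetical module: the docstring refers to a generic name, THING,
# that the module itself does not define.
SOURCE = '''
def describe():
    """
    >>> THING.upper()
    'WIDGET'
    """
'''

demo = types.ModuleType("demo_extraglobs")
exec(SOURCE, demo.__dict__)

# extraglobs is merged into the copied globals before the examples run.
result = doctest.testmod(demo, extraglobs={"THING": "widget"}, verbose=False)
```

The example passes because ``THING`` is supplied through {extraglobs}, so
``result`` should be ``(0, 1)``; the module's own namespace is left untouched.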
There's also a function to run the doctests associated with a single object.
This function is provided for backward compatibility. There are no plans to
deprecate it, but it's rarely useful:
run_docstring_examples(f, globs[, verbose][, name][, compileflags][, optionflags])~
Test examples associated with object {f}; for example, {f} may be a module,
function, or class object.
A shallow copy of dictionary argument {globs} is used for the execution context.
Optional argument {name} is used in failure messages, and defaults to
``"NoName"``.
If optional argument {verbose} is true, output is generated even if there are no
failures. By default, output is generated only in case of an example failure.
Optional argument {compileflags} gives the set of flags that should be used by
the Python compiler when running the examples. By default, or if ``None``,
flags are deduced corresponding to the set of future features found in {globs}.
Optional argument {optionflags} works as for function testfile above.
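For completeness, a minimal sketch of run_docstring_examples. The function
``square`` and the dict ``namespace`` are made up for this illustration; the
dict supplies the names that the examples need.

```python
import doctest


def square(x):
    """Return x squared.

    >>> square(3)
    9
    """
    return x * x


# The globs dict supplies the names the examples use; a shallow copy is run.
namespace = {"square": square}
outcome = doctest.run_docstring_examples(
    square, namespace, verbose=False, name="square")
```

Nothing is printed because the example passes. The call returns ``None``
rather than a result pair, and ``namespace`` is not modified, since the
examples run against a copy.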
Unittest API
------------
As your collection of doctest'ed modules grows, you'll want a way to run all
their doctests systematically. Prior to Python 2.4, doctest (|py2stdlib-doctest|) had a barely
documented Tester class that supplied a rudimentary way to combine
doctests from multiple modules. Tester was feeble, and in practice most
serious Python testing frameworks build on the unittest (|py2stdlib-unittest|) module, which
supplies many flexible ways to combine tests from multiple sources. So, in
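Python 2.4, doctest (|py2stdlib-doctest|)'s Tester class is deprecated, and
doctest (|py2stdlib-doctest|) provides two functions that can be used to create unittest (|py2stdlib-unittest|)
test suites from modules and text files containing doctests. These test suites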
can then be run using unittest (|py2stdlib-unittest|) test runners:: >
import unittest
import doctest
import my_module_with_doctests, and_another
suite = unittest.TestSuite()
for mod in my_module_with_doctests, and_another:
    suite.addTest(doctest.DocTestSuite(mod))
runner = unittest.TextTestRunner()
runner.run(suite)
<
There are two main functions for creating unittest.TestSuite instances
from text files and modules with doctests:
DocFileSuite(*paths, [module_relative][, package][, setUp][, tearDown][, globs][, optionflags][, parser][, encoding])~
Convert doctest tests from one or more text files to a
unittest.TestSuite.
The returned unittest.TestSuite is to be run by the unittest framework
and runs the interactive examples in each file. If an example in any file
fails, then the synthesized unit test fails, and a failureException
exception is raised showing the name of the file containing the test and a
(sometimes approximate) line number.
Pass one or more paths (as strings) to text files to be examined.
Options may be provided as keyword arguments:
Optional argument {module_relative} specifies how the filenames in {paths}
should be interpreted:
* If {module_relative} is ``True`` (the default), then each filename in
{paths} specifies an OS-independent module-relative path. By default, this
path is relative to the calling module's directory; but if the {package}
argument is specified, then it is relative to that package. To ensure
OS-independence, each filename should use ``/`` characters to separate path
segments, and may not be an absolute path (i.e., it may not begin with
``/``).
* If {module_relative} is ``False``, then each filename in {paths} specifies
an OS-specific path. The path may be absolute or relative; relative paths
are resolved with respect to the current working directory.
Optional argument {package} is a Python package or the name of a Python
package whose directory should be used as the base directory for
module-relative filenames in {paths}. If no package is specified, then the
calling module's directory is used as the base directory for module-relative
filenames. It is an error to specify {package} if {module_relative} is
``False``.
Optional argument {setUp} specifies a set-up function for the test suite.
This is called before running the tests in each file. The {setUp} function
will be passed a DocTest object. The setUp function can access the
test globals as the {globs} attribute of the test passed.
Optional argument {tearDown} specifies a tear-down function for the test
suite. This is called after running the tests in each file. The {tearDown}
function will be passed a DocTest object. The tearDown function can
access the test globals as the {globs} attribute of the test passed.
Optional argument {globs} is a dictionary containing the initial global
variables for the tests. A new copy of this dictionary is created for each
test. By default, {globs} is a new empty dictionary.
Optional argument {optionflags} specifies the default doctest options for the
tests, created by or-ing together individual option flags. See section
doctest-options. See function set_unittest_reportflags below
for a better way to set reporting options.
Optional argument {parser} specifies a DocTestParser (or subclass)
that should be used to extract tests from the files. It defaults to a normal
parser (i.e., ``DocTestParser()``).
Optional argument {encoding} specifies an encoding that should be used to
convert the file to unicode.
.. versionadded:: 2.4
.. versionchanged:: 2.5
The global ``__file__`` was added to the globals provided to doctests
loaded from a text file using DocFileSuite.
.. versionchanged:: 2.5
The parameter {encoding} was added.
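As a sketch of how these arguments fit together, the following builds a suite
from one text file and runs it. The file name, its contents and the helper
``set_up`` are hypothetical; ``module_relative=False`` is used so that an
ordinary OS path can be passed.

```python
import doctest
import os
import shutil
import tempfile
import unittest

# Write a small doctest text file (made-up content) to a temporary directory.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "example.txt")
with open(path, "w") as handle:
    handle.write("The set-up function provides ``greeting``:\n"
                 "\n"
                 "    >>> greeting.upper()\n"
                 "    'HELLO'\n")

def set_up(test):
    # test is the DocTest object; test.globs is the namespace of the examples.
    test.globs["greeting"] = "hello"

# module_relative=False makes the path an OS-specific path.
suite = doctest.DocFileSuite(path, module_relative=False, setUp=set_up)
result = unittest.TestResult()
suite.run(result)

shutil.rmtree(tmpdir)  # clean up the temporary file
```

Each file becomes one synthesized test case, so ``result.testsRun`` counts
files, not individual examples.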
DocTestSuite([module][, globs][, extraglobs][, test_finder][, setUp][, tearDown][, checker])~
Convert doctest tests for a module to a unittest.TestSuite.
The returned unittest.TestSuite is to be run by the unittest framework
and runs each doctest in the module. If any of the doctests fail, then the
synthesized unit test fails, and a failureException exception is raised
showing the name of the file containing the test and a (sometimes approximate)
line number.
Optional argument {module} provides the module to be tested. It can be a module
object or a (possibly dotted) module name. If not specified, the module calling
this function is used.
Optional argument {globs} is a dictionary containing the initial global
variables for the tests. A new copy of this dictionary is created for each
test. By default, {globs} is a new empty dictionary.
Optional argument {extraglobs} specifies an extra set of global variables, which
is merged into {globs}. By default, no extra globals are used.
Optional argument {test_finder} is the DocTestFinder object (or a
drop-in replacement) that is used to extract doctests from the module.
Optional arguments {setUp}, {tearDown}, and {optionflags} are the same as for
function DocFileSuite above.
.. versionadded:: 2.3
.. versionchanged:: 2.4
The parameters {globs}, {extraglobs}, {test_finder}, {setUp}, {tearDown}, and
{optionflags} were added; this function now uses the same search technique as
testmod.
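A small sketch of DocTestSuite with {extraglobs}. To stay self-contained it
builds a stand-in module with ``types.ModuleType``; the module name ``demo``
and the variable ``limits`` are assumptions for illustration. Normally an
imported module, or its dotted name, would be passed.

```python
import doctest
import types
import unittest

# Stand-in for a real module whose docstring contains a doctest.
demo = types.ModuleType("demo")
demo.__doc__ = """
>>> sorted(limits)
['high', 'low']
"""

# extraglobs is merged into the globals used for each test.
suite = doctest.DocTestSuite(demo, extraglobs={"limits": {"low": 0, "high": 9}})
result = unittest.TestResult()
suite.run(result)
```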
Under the covers, DocTestSuite creates a unittest.TestSuite out
of doctest.DocTestCase instances, and DocTestCase is a
subclass of unittest.TestCase. DocTestCase isn't documented
here (it's an internal detail), but studying its code can answer questions about
the exact details of unittest (|py2stdlib-unittest|) integration.
Similarly, DocFileSuite creates a unittest.TestSuite out of
doctest.DocFileCase instances, and DocFileCase is a subclass
of DocTestCase.
So both ways of creating a unittest.TestSuite run instances of
DocTestCase. This is important for a subtle reason: when you run
doctest (|py2stdlib-doctest|) functions yourself, you can control the doctest (|py2stdlib-doctest|) options in
use directly, by passing option flags to doctest (|py2stdlib-doctest|) functions. However, if
you're writing a unittest (|py2stdlib-unittest|) framework, unittest (|py2stdlib-unittest|) ultimately controls
when and how tests get run. The framework author typically wants to control
doctest (|py2stdlib-doctest|) reporting options (perhaps, e.g., specified by command line
options), but there's no way to pass options through unittest (|py2stdlib-unittest|) to
doctest (|py2stdlib-doctest|) test runners.
For this reason, doctest (|py2stdlib-doctest|) also supports a notion of doctest (|py2stdlib-doctest|)
reporting flags specific to unittest (|py2stdlib-unittest|) support, via this function:
set_unittest_reportflags(flags)~
Set the doctest (|py2stdlib-doctest|) reporting flags to use.
Argument {flags} or's together option flags. See section
doctest-options. Only "reporting flags" can be used.
This is a module-global setting, and affects all future doctests run by module
unittest (|py2stdlib-unittest|): the runTest method of DocTestCase looks at
the option flags specified for the test case when the DocTestCase
instance was constructed. If no reporting flags were specified (which is the
typical and expected case), doctest (|py2stdlib-doctest|)'s unittest (|py2stdlib-unittest|) reporting flags are
or'ed into the option flags, and the option flags so augmented are passed to the
DocTestRunner instance created to run the doctest. If any reporting
flags were specified when the DocTestCase instance was constructed,
doctest (|py2stdlib-doctest|)'s unittest (|py2stdlib-unittest|) reporting flags are ignored.
The value of the unittest (|py2stdlib-unittest|) reporting flags in effect before the function
was called is returned by the function.
.. versionadded:: 2.4
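A short sketch of setting, and later restoring, the module-global reporting
flags. The variable names are illustrative only.

```python
import doctest

# Install reporting flags for doctests later run through unittest.
# The previously active flags are returned so they can be restored.
previous = doctest.set_unittest_reportflags(
    doctest.REPORT_NDIFF | doctest.REPORT_ONLY_FIRST_FAILURE)

# A flag that is not a reporting flag is rejected with ValueError.
try:
    doctest.set_unittest_reportflags(doctest.ELLIPSIS)
    rejected = False
except ValueError:
    rejected = True

# Put the earlier setting back, since the setting is module-global.
installed = doctest.set_unittest_reportflags(previous)
```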
Advanced API
------------
The basic API is a simple wrapper that's intended to make doctest easy to use.
It is fairly flexible, and should meet most users' needs; however, if you
require more fine-grained control over testing, or wish to extend doctest's
capabilities, then you should use the advanced API.
The advanced API revolves around two container classes, which are used to store
the interactive examples extracted from doctest cases:
* Example: A single Python statement, paired with its expected
output.
* DocTest: A collection of Examples, typically extracted
from a single docstring or text file.
Additional processing classes are defined to find, parse, run, and check
doctest examples:
* DocTestFinder: Finds all docstrings in a given module, and uses a
DocTestParser to create a DocTest from every docstring that
contains interactive examples.
* DocTestParser: Creates a DocTest object from a string (such
as an object's docstring).
* DocTestRunner: Executes the examples in a DocTest, and uses
an OutputChecker to verify their output.
* OutputChecker: Compares the actual output from a doctest example with
the expected output, and decides whether they match.
The relationships among these processing classes are summarized in the following
diagram:: >
                            list of:
+------+                   +---------+
|module| --DocTestFinder-> | DocTest | --DocTestRunner-> results
+------+    |        ^     +---------+     |       ^    (printed)
            |        |     | Example |     |       |
            v        |     |   ...   |     v       |
           DocTestParser   | Example |   OutputChecker
                           +---------+
<
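The flow in the diagram can be sketched by hand for a plain string, using
DocTestParser in place of DocTestFinder. The text and the test name
``snippet`` are made up for illustration.

```python
import doctest

text = """
A made-up snippet of documentation:

    >>> 6 * 7
    42
"""

# DocTestParser turns the string into a DocTest holding Example objects.
parser = doctest.DocTestParser()
test = parser.get_doctest(text, {}, "snippet", None, 0)

# DocTestRunner executes each Example; its OutputChecker compares the
# actual output with the expected output.
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
```

``results`` is the ``TestResults(failed, attempted)`` named tuple for this run.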
DocTest Objects
^^^^^^^^^^^^^^^
DocTest(examples, globs, name, filename, lineno, docstring)~
A collection of doctest examples that should be run in a single namespace. The
constructor arguments are used to initialize the member variables of the same
names.
.. versionadded:: 2.4
DocTest defines the following member variables. They are initialized by
the constructor, and should not be modified directly.
examples~
A list of Example objects encoding the individual interactive Python
examples that should be run by this test.
globs~
The namespace (aka globals) that the examples should be run in. This is a
dictionary mapping names to values. Any changes to the namespace made by the
examples (such as binding new variables) will be reflected in globs
after the test is run.
name~
A string name identifying the DocTest. Typically, this is the name
of the object or file that the test was extracted from.
filename~
The name of the file that this DocTest was extracted from; or
``None`` if the filename is unknown, or if the DocTest was not
extracted from a file.
lineno~
The line number within filename where this DocTest begins, or
``None`` if the line number is unavailable. This line number is zero-based
with respect to the beginning of the file.
docstring~
The string that the test was extracted from, or ``None`` if the string is
unavailable, or if the test was not extracted from a string.
Example Objects
^^^^^^^^^^^^^^^
Example(source, want[, exc_msg][, lineno][, indent][, options])~
A single interactive example, consisting of a Python statement and its expected
output. The constructor arguments are used to initialize the member variables
of the same names.
.. versionadded:: 2.4
Example defines the following member variables. They are initialized by
the constructor, and should not be modified directly.
source~
A string containing the example's source code. This source code consists of a
single Python statement, and always ends with a newline; the constructor adds
a newline when necessary.
want~
The expected output from running the example's source code (either from
stdout, or a traceback in case of exception). want ends with a
newline unless no output is expected, in which case it's an empty string. The
constructor adds a newline when necessary.
exc_msg~
The exception message generated by the example, if the example is expected to
generate an exception; or ``None`` if it is not expected to generate an
exception. This exception message is compared against the return value of
traceback.format_exception_only. exc_msg ends with a newline
unless it's ``None``. The constructor adds a newline if needed.
lineno~
The line number within the string containing this example where the example
begins. This line number is zero-based with respect to the beginning of the
containing string.
indent~
The example's indentation in the containing string, i.e., the number of space
characters that precede the example's first prompt.
options~
A dictionary mapping from option flags to ``True`` or ``False``, which is used
to override default options for this example. Any option flags not contained
in this dictionary are left at their default value (as specified by the
DocTestRunner's optionflags). By default, no options are set.
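To make these member variables concrete, the sketch below parses one example
out of a made-up string and also builds an Example directly. The unindented
first line of prose is there so that the example's indentation stays visible
to the parser.

```python
import doctest

text = """
Leading prose, not indented.
    >>> print("spam")  # doctest: +ELLIPSIS
    sp...
"""

# get_examples returns a list of Example objects found in the string.
example = doctest.DocTestParser().get_examples(text)[0]

want = example.want        # expected output, ending with a newline
lineno = example.lineno    # zero-based line of the example in the string
indent = example.indent    # spaces before the first prompt
options = example.options  # per-example overrides from the directive

# Built directly, the constructor adds the trailing newlines itself.
manual = doctest.Example("1 + 1", "2")
```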
DocTestFinder objects
^^^^^^^^^^^^^^^^^^^^^
DocTestFinder([verbose][, parser][, recurse][, exclude_empty])~
A processing class used to extract the DocTests that are relevant to
a given object, from its docstring and the docstrings of its contained objects.
DocTests can currently be extracted from the following object types:
modules, functions, classes, methods, staticmethods, classmethods, and
properties.
The optional argument {verbose} can be used to display the objects searched by
the finder. It defaults to ``False`` (no output).
The optional argument {parser} specifies the DocTestParser object (or a
drop-in replacement) that is used to extract doctests from docstrings.
If the optional argument {recurse} is false, then DocTestFinder.find
will only examine the given object, and not any contained objects.
If the optional argument {exclude_empty} is false, then
DocTestFinder.find will include tests for objects with empty docstrings.
.. versionadded:: 2.4
DocTestFinder defines the following method:
find(obj[, name][, module][, globs][, extraglobs])~
Return a list of the DocTests that are defined by {obj}'s
docstring, or by any of its contained objects' docstrings.
The optional argument {name} specifies the object's name; this name will be
used to construct names for the returned DocTests. If {name} is
not specified, then ``obj.__name__`` is used.
The optional parameter {module} is the module that contains the given object.
If the module is not specified or is None, then the test finder will attempt
to automatically determine the correct module. The object's module is used:
* As a default namespace, if {globs} is not specified.
* To prevent the DocTestFinder from extracting DocTests from objects that are
imported from other modules. (Contained objects with modules other than
{module} are ignored.)
* To find the name of the file containing the object.
* To help find the line number of the object within its file.
If {module} is ``False``, no attempt to find the module will be made. This is
obscure, of use mostly in testing doctest itself: if {module} is ``False``, or
is ``None`` but cannot be found automatically, then all objects are considered
to belong to the (non-existent) module, so all contained objects will
(recursively) be searched for doctests.
The globals for each DocTest are formed by combining {globs} and
{extraglobs} (bindings in {extraglobs} override bindings in {globs}). A new
shallow copy of the globals dictionary is created for each DocTest.
If {globs} is not specified, then it defaults to the module's {__dict__}, if
specified, or ``{}`` otherwise. If {extraglobs} is not specified, then it
defaults to ``{}``.
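A sketch of find with {globs} and {extraglobs}. The function ``double`` and
the variable ``base`` are hypothetical. ``module=False`` is passed only so
that the sketch does not depend on how the snippet itself was loaded.

```python
import doctest

def double(x):
    """Return twice x.

    >>> double(base)
    42
    """
    return 2 * x

finder = doctest.DocTestFinder()

# globs and extraglobs are merged into one namespace for the DocTest.
tests = finder.find(double, name="double", module=False,
                    globs={"double": double}, extraglobs={"base": 21})

runner = doctest.DocTestRunner(verbose=False)
results = runner.run(tests[0], clear_globs=False)
```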
DocTestParser objects
^^^^^^^^^^^^^^^^^^^^^
DocTestParser()~
A processing class used to extract interactive examples from a string, and use
them to create a DocTest object.
.. versionadded:: 2.4
DocTestParser defines the following methods:
get_doctest(string, globs, name, filename, lineno)~
Extract all doctest examples from the given string, and collect them into a
DocTest object.
{globs}, {name}, {filename}, and {lineno} are attributes for the new
DocTest object. See the documentation for DocTest for more
information.
get_examples(string[, name])~
Extract all doctest examples from the given string, and return them as a list
of Example objects. Line numbers are 0-based. The optional argument
{name} is a name identifying this string, and is only used for error messages.
parse(string[, name])~
Divide the given string into examples and intervening text, and return them as
a list of alternating Examples and strings. Line numbers for the
Examples are 0-based. The optional argument {name} is a name
identifying this string, and is only used for error messages.
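The difference between parse and get_examples can be seen on a short made-up
string. Note the blank line after the expected output: without it, the
following prose would be read as part of the expected output.

```python
import doctest

text = (
    "Intro text.\n"
    ">>> 1 + 1\n"
    "2\n"
    "\n"
    "Closing text.\n"
)

# parse keeps the prose between examples as plain strings.
pieces = doctest.DocTestParser().parse(text, "session")

examples = [p for p in pieces if isinstance(p, doctest.Example)]
prose = [p for p in pieces if not isinstance(p, doctest.Example)]
```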
DocTestRunner objects
^^^^^^^^^^^^^^^^^^^^^
DocTestRunner([checker][, verbose][, optionflags])~
A processing class used to execute and verify the interactive examples in a
DocTest.
The comparison between expected outputs and actual outputs is done by an
OutputChecker. This comparison may be customized with a number of
option flags; see section doctest-options for more information. If the
option flags are insufficient, then the comparison may also be customized by
passing a subclass of OutputChecker to the constructor.
The test runner's display output can be controlled in two ways. First, an output
function can be passed to DocTestRunner.run; this function will be called
with strings that should be displayed. It defaults to ``sys.stdout.write``. If
capturing the output is not sufficient, then the display output can also be
customized by subclassing DocTestRunner, and overriding the methods
report_start, report_success,
report_unexpected_exception, and report_failure.
The optional keyword argument {checker} specifies the OutputChecker
object (or drop-in replacement) that should be used to compare the expected
outputs to the actual outputs of doctest examples.
The optional keyword argument {verbose} controls the DocTestRunner's
verbosity. If {verbose} is ``True``, then information is printed about each
example, as it is run. If {verbose} is ``False``, then only failures are
printed. If {verbose} is unspecified, or ``None``, then verbose output is used
iff the command-line switch -v is used.
The optional keyword argument {optionflags} can be used to control how the test
runner compares expected output to actual output, and how it displays failures.
For more information, see section doctest-options.
.. versionadded:: 2.4
DocTestRunner defines the following methods:
report_start(out, test, example)~
Report that the test runner is about to process the given example. This method
is provided to allow subclasses of DocTestRunner to customize their
output; it should not be called directly.
{example} is the example about to be processed. {test} is the test
containing {example}. {out} is the output function that was passed to
DocTestRunner.run.
report_success(out, test, example, got)~
Report that the given example ran successfully. This method is provided to
allow subclasses of DocTestRunner to customize their output; it
should not be called directly.
{example} is the example about to be processed. {got} is the actual output
from the example. {test} is the test containing {example}. {out} is the
output function that was passed to DocTestRunner.run.
report_failure(out, test, example, got)~
Report that the given example failed. This method is provided to allow
subclasses of DocTestRunner to customize their output; it should not
be called directly.
{example} is the example about to be processed. {got} is the actual output
from the example. {test} is the test containing {example}. {out} is the
output function that was passed to DocTestRunner.run.
report_unexpected_exception(out, test, example, exc_info)~
Report that the given example raised an unexpected exception. This method is
provided to allow subclasses of DocTestRunner to customize their
output; it should not be called directly.
{example} is the example about to be processed. {exc_info} is a tuple
containing information about the unexpected exception (as returned by
sys.exc_info). {test} is the test containing {example}. {out} is the
output function that was passed to DocTestRunner.run.
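As an illustration of customizing these report methods, the following sketch
overrides report_failure to record failures instead of printing them. The
class ``CollectingRunner`` and its attribute ``failures_seen`` are
hypothetical names, not part of doctest.

```python
import doctest

class CollectingRunner(doctest.DocTestRunner):
    """Hypothetical runner that records failures instead of printing them."""

    def __init__(self, *args, **kwargs):
        doctest.DocTestRunner.__init__(self, *args, **kwargs)
        self.failures_seen = []

    def report_failure(self, out, test, example, got):
        # Called by run() for each example whose output did not match.
        self.failures_seen.append((test.name, example.source, got))

# A deliberately wrong expectation: 1 + 1 is not 3.
test = doctest.DocTestParser().get_doctest(
    ">>> 1 + 1\n3\n", {}, "bad_sum", None, 0)

runner = CollectingRunner(verbose=False)
results = runner.run(test)
```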
run(test[, compileflags][, out][, clear_globs])~
Run the examples in {test} (a DocTest object), and display the
results using the writer function {out}.
The examples are run in the namespace ``test.globs``. If {clear_globs} is
true (the default), then this namespace will be cleared after the test runs,
to help with garbage collection. If you would like to examine the namespace
after the test completes, then use {clear_globs=False}.
{compileflags} gives the set of flags that should be used by the Python
compiler when running the examples. If not specified, then it will default to
the set of future-import flags that apply to {globs}.
The output of each example is checked using the DocTestRunner's
output checker, and the results are formatted by the
DocTestRunner.report_* methods.
summarize([verbose])~
Print a summary of all the test cases that have been run by this DocTestRunner,
and return a named tuple ``TestResults(failed, attempted)``.
The optional {verbose} argument controls how detailed the summary is. If the
verbosity is not specified, then the DocTestRunner's verbosity is
used.
.. versionchanged:: 2.6
Use a named tuple.
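A sketch of run with an {out} function and ``clear_globs=False``, followed by
summarize. The test name ``arith`` and the variable names are made up; the
second example is deliberately wrong so that there is something to report.

```python
import doctest

test = doctest.DocTestParser().get_doctest(
    ">>> total = 2 + 2\n>>> total\n5\n", {}, "arith", None, 0)

captured = []  # failure reports are appended here instead of being printed
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test, out=captured.append, clear_globs=False)

# With clear_globs=False the namespace survives the run for inspection.
leftover = test.globs["total"]

# summarize prints its summary to standard output and returns the totals.
summary = runner.summarize(verbose=False)
```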
OutputChecker objects
^^^^^^^^^^^^^^^^^^^^^
OutputChecker()~
A class used to check whether the actual output from a doctest example
matches the expected output. OutputChecker defines two methods:
check_output, which compares a given pair of outputs, and returns true
if they match; and output_difference, which returns a string describing
the differences between two outputs.
.. versionadded:: 2.4
OutputChecker defines the following methods:
check_output(want, got, optionflags)~
Return ``True`` iff the actual output from an example ({got}) matches the
expected output ({want}). These strings are always considered to match if
they are identical; but depending on what option flags the test runner is
using, several non-exact match types are also possible. See section
doctest-options for more information about option flags.
output_difference(example, got, optionflags)~
Return a string describing the differences between the expected output for a
given example ({example}) and the actual output ({got}). {optionflags} is the
set of option flags used to compare {want} and {got}.
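A brief sketch of both methods called directly. The sample strings are made
up; ``0`` stands for "no option flags".

```python
import doctest

checker = doctest.OutputChecker()

# Identical strings always match.
exact = checker.check_output("4\n", "4\n", 0)

# "..." in the expected output matches only when ELLIPSIS is enabled.
strict = checker.check_output("[0, ..., 9]\n", "[0, 1, 2, 9]\n", 0)
relaxed = checker.check_output("[0, ..., 9]\n", "[0, 1, 2, 9]\n",
                               doctest.ELLIPSIS)

# output_difference takes the Example, not just the expected string.
example = doctest.Example("1 + 1", "3")
report = checker.output_difference(example, "2\n", 0)
```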
Debugging
---------
Doctest provides several mechanisms for debugging doctest examples:
* Several functions convert doctests to executable Python programs, which can be
run under the Python debugger, pdb (|py2stdlib-pdb|).
* The DebugRunner class is a subclass of DocTestRunner that
raises an exception for the first failing example, containing information about
that example. This information can be used to perform post-mortem debugging on
the example.
* The unittest (|py2stdlib-unittest|) cases generated by DocTestSuite support the
debug method defined by unittest.TestCase.
* You can add a call to pdb.set_trace in a doctest example, and you'll
drop into the Python debugger when that line is executed. Then you can inspect
current values of variables, and so on. For example, suppose a.py
contains just this module docstring:: >
"""
>>> def f(x):
...     g(x*2)
>>> def g(x):
...     print x+3
...     import pdb; pdb.set_trace()
>>> f(3)
9
"""
<
Then an interactive Python session may look like this:: >
>>> import a, doctest
>>> doctest.testmod(a)
--Return--
> <doctest a[1]>(3)g()->None
-> import pdb; pdb.set_trace()
(Pdb) list
  1     def g(x):
  2         print x+3
  3  ->     import pdb; pdb.set_trace()
[EOF]
(Pdb) print x
6
(Pdb) step
--Return--
> <doctest a[0]>(2)f()->None
-> g(x*2)
(Pdb) list
  1     def f(x):
  2  ->     g(x*2)
[EOF]
(Pdb) print x
3
(Pdb) step
--Return--
> <doctest a[2]>(1)?()->None
-> f(3)
(Pdb) cont
(0, 3)
>>>
<
.. versionchanged:: 2.4
The ability to use pdb.set_trace usefully inside doctests was added.
Functions that convert doctests to Python code, and possibly run the synthesized
code under the debugger:
script_from_examples(s)~
Convert text with examples to a script.
Argument {s} is a string containing doctest examples. The string is converted
to a Python script, where doctest examples in {s} are converted to regular code,
and everything else is converted to Python comments. The generated script is
returned as a string. For example, :: >
import doctest
print doctest.script_from_examples(r"""
Set x and y to 1 and 2.
>>> x, y = 1, 2

Print their sum:
>>> print x+y
3
""")
<
displays:: >
# Set x and y to 1 and 2.
x, y = 1, 2
#
# Print their sum:
print x+y
# Expected:
## 3
<
This function is used internally by other functions (see below), but can also be
useful when you want to transform an interactive Python session into a Python
script.
.. versionadded:: 2.4
testsource(module, name)~
Convert the doctest for an object to a script.
Argument {module} is a module object, or dotted name of a module, containing the
object whose doctests are of interest. Argument {name} is the name (within the
module) of the object with the doctests of interest. The result is a string,
containing the object's docstring converted to a Python script, as described for
script_from_examples above. For example, if module a.py
contains a top-level function f, then :: >
import a, doctest
print doctest.testsource(a, "a.f")
<
prints a script version of function f's docstring, with doctests
converted to code, and the rest placed in comments.
.. versionadded:: 2.3
debug(module, name[, pm])~
Debug the doctests for an object.
The {module} and {name} arguments are the same as for function
testsource above. The synthesized Python script for the named object's
docstring is written to a temporary file, and then that file is run under the
control of the Python debugger, pdb (|py2stdlib-pdb|).
A shallow copy of ``module.__dict__`` is used for both local and global
execution context.
Optional argument {pm} controls whether post-mortem debugging is used. If {pm}
has a true value, the script file is run directly, and the debugger gets
involved only if the script terminates via raising an unhandled exception. If
it does, then post-mortem debugging is invoked, via pdb.post_mortem,
passing the traceback object from the unhandled exception. If {pm} is not
specified, or is false, the script is run under the debugger from the start, via
passing an appropriate execfile call to pdb.run.
.. versionadded:: 2.3
.. versionchanged:: 2.4
The {pm} argument was added.
debug_src(src[, pm][, globs])~
Debug the doctests in a string.
This is like function debug above, except that a string containing
doctest examples is specified directly, via the {src} argument.
Optional argument {pm} has the same meaning as in function debug above.
Optional argument {globs} gives a dictionary to use as both local and global
execution context. If not specified, or ``None``, an empty dictionary is used.
If specified, a shallow copy of the dictionary is used.
.. versionadded:: 2.4
The DebugRunner class, and the special exceptions it may raise, are of
most interest to testing framework authors, and will only be sketched here. See
the source code, and especially DebugRunner's docstring (which is a
doctest!) for more details:
DebugRunner([checker][, verbose][, optionflags])~
A subclass of DocTestRunner that raises an exception as soon as a
failure is encountered. If an unexpected exception occurs, an
UnexpectedException exception is raised, containing the test, the
example, and the original exception. If the output doesn't match, then a
DocTestFailure exception is raised, containing the test, the example, and
the actual output.
For information about the constructor parameters and methods, see the
documentation for DocTestRunner in section doctest-advanced-api.
There are two exceptions that may be raised by DebugRunner instances:
DocTestFailure(test, example, got)~
An exception thrown by DocTestRunner to signal that a doctest example's
actual output did not match its expected output. The constructor arguments are
used to initialize the member variables of the same names.
DocTestFailure defines the following member variables:
DocTestFailure.test~
The DocTest object that was being run when the example failed.
DocTestFailure.example~
The Example that failed.
DocTestFailure.got~
The example's actual output.
UnexpectedException(test, example, exc_info)~
An exception thrown by DocTestRunner to signal that a doctest example
raised an unexpected exception. The constructor arguments are used to
initialize the member variables of the same names.
UnexpectedException defines the following member variables:
UnexpectedException.test~
The DocTest object that was being run when the example failed.
UnexpectedException.example~
The Example that failed.
UnexpectedException.exc_info~
A tuple containing information about the unexpected exception, as returned by
sys.exc_info.
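A sketch of catching both exceptions from a DebugRunner. The two one-line
tests and their names are made up for illustration.

```python
import doctest

parser = doctest.DocTestParser()
runner = doctest.DebugRunner(verbose=False)

# Output mismatch: 1 + 1 prints 2, but 3 is expected.
mismatch = parser.get_doctest(">>> 1 + 1\n3\n", {}, "mismatch", None, 0)
try:
    runner.run(mismatch)
except doctest.DocTestFailure as failure:
    failed_source = failure.example.source  # the example that failed
    actual_output = failure.got             # what it actually printed

# Unexpected exception: no traceback is expected, but one is raised.
crash = parser.get_doctest(">>> 1 // 0\n", {}, "crash", None, 0)
try:
    runner.run(crash)
except doctest.UnexpectedException as error:
    exc_type = error.exc_info[0]            # as returned by sys.exc_info()
```

The ``exc_info`` tuple can be handed to a post-mortem debugger to inspect the
failing example.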
Soapbox
-------
As mentioned in the introduction, doctest (|py2stdlib-doctest|) has grown to have three primary
uses:
1. Checking examples in docstrings.
2. Regression testing.
3. Executable documentation / literate testing.
These uses have different requirements, and it is important to distinguish them.
In particular, filling your docstrings with obscure test cases makes for bad
documentation.
When writing a docstring, choose docstring examples with care. There's an art to
this that needs to be learned; it may not be natural at first. Examples should
add genuine value to the documentation. A good example can often be worth many
words. If done with care, the examples will be invaluable for your users, and
will pay back the time it takes to collect them many times over as the years go
by and things change. I'm still amazed at how often one of my doctest (|py2stdlib-doctest|)
examples stops working after a "harmless" change.
Doctest also makes an excellent tool for regression testing, especially if you
don't skimp on explanatory text. By interleaving prose and examples, it becomes
much easier to keep track of what's actually being tested, and why. When a test
fails, good prose can make it much easier to figure out what the problem is, and
how it should be fixed. It's true that you could write extensive comments in
code-based testing, but few programmers do. Many have found that using doctest
approaches instead leads to much clearer tests. Perhaps this is simply because
doctest makes writing prose a little easier than writing code, while writing
comments in code is a little harder. I think it goes deeper than just that:
the natural attitude when writing a doctest-based test is that you want to
explain the fine points of your software, and illustrate them with examples.
This in turn naturally leads to test files that start with the simplest
features, and logically progress to complications and edge cases. A coherent
narrative is the result, instead of a collection of isolated functions that test
isolated bits of functionality seemingly at random. It's a different attitude,
and produces different results, blurring the distinction between testing and
explaining.
Regression testing is best confined to dedicated objects or files. There are
several options for organizing tests:
* Write text files containing test cases as interactive examples, and test the
files using testfile or DocFileSuite. This is recommended,
although it is easiest to do for new projects, designed from the start to use
doctest.
* Define functions named ``_regrtest_topic`` that consist of single docstrings,
containing test cases for the named topics. These functions can be included in
the same file as the module, or separated out into a separate test file.
* Define a ``__test__`` dictionary mapping from regression test topics to
docstrings containing test cases.
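The last option can be sketched as follows. To stay self-contained the sketch
uses a stand-in module built with ``types.ModuleType``. The module name
``regression``, the function ``add`` and the topic names are all hypothetical.

```python
import doctest
import types

def add(a, b):
    return a + b

# Stand-in module for illustration; in a real project __test__ is simply
# assigned at module level next to the code it exercises.
regression = types.ModuleType("regression")
regression.add = add
regression.__test__ = {
    "small_numbers": """
        >>> add(1, 2)
        3
    """,
    "negative_numbers": """
        >>> add(-1, -2)
        -3
    """,
}

# testmod picks up the docstrings stored in __test__.
results = doctest.testmod(regression, verbose=False)
```

Each topic is reported under a name of the form
``regression.__test__.small_numbers``, which keeps failures easy to locate.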
.. rubric:: Footnotes
.. [#] Examples containing both expected output and an exception are not supported.
Trying to guess where one ends and the other begins is too error-prone, and that
also makes for a confusing test.
==============================================================================
*py2stdlib-docxmlrpcserver*
DocXMLRPCServer~
:synopsis: Self-documenting XML-RPC server implementation.
.. note::
The DocXMLRPCServer (|py2stdlib-docxmlrpcserver|) module has been merged into xmlrpc.server
in Python 3.0. The 2to3 tool will automatically adapt imports when
converting your sources to 3.0.
.. versionadded:: 2.3
The DocXMLRPCServer (|py2stdlib-docxmlrpcserver|) module extends the classes found in
SimpleXMLRPCServer (|py2stdlib-simplexmlrpcserver|) to serve HTML documentation in response to HTTP GET
requests. Servers can either be free standing, using DocXMLRPCServer (|py2stdlib-docxmlrpcserver|),
or embedded in a CGI environment, using DocCGIXMLRPCRequestHandler.
DocXMLRPCServer(addr[, requestHandler[, logRequests[, allow_none[, encoding[, bind_and_activate]]]]])~
Create a new server instance. All parameters have the same meaning as for
SimpleXMLRPCServer.SimpleXMLRPCServer; {requestHandler} defaults to
DocXMLRPCRequestHandler.
DocCGIXMLRPCRequestHandler()~
Create a new instance to handle XML-RPC requests in a CGI environment.
DocXMLRPCRequestHandler()~
Create a new request handler instance. This request handler supports XML-RPC
POST requests, documentation GET requests, and modifies logging so that the
{logRequests} parameter to the DocXMLRPCServer (|py2stdlib-docxmlrpcserver|) constructor is
honored.
DocXMLRPCServer Objects
-----------------------
The DocXMLRPCServer (|py2stdlib-docxmlrpcserver|) class is derived from
SimpleXMLRPCServer.SimpleXMLRPCServer and provides a means of creating
self-documenting, stand alone XML-RPC servers. HTTP POST requests are handled as
XML-RPC method calls. HTTP GET requests are handled by generating pydoc-style
HTML documentation. This allows a server to provide its own web-based
documentation.
DocXMLRPCServer.set_server_title(server_title)~
Set the title used in the generated HTML documentation. This title will be used
inside the HTML "title" element.
DocXMLRPCServer.set_server_name(server_name)~
Set the name used in the generated HTML documentation. This name will appear at
the top of the generated documentation inside a "h1" element.
DocXMLRPCServer.set_server_documentation(server_documentation)~
Set the description used in the generated HTML documentation. This description
will appear as a paragraph, below the server name, in the documentation.
DocCGIXMLRPCRequestHandler
--------------------------
The DocCGIXMLRPCRequestHandler class is derived from
SimpleXMLRPCServer.CGIXMLRPCRequestHandler and provides a means of
creating self-documenting XML-RPC CGI scripts. HTTP POST requests are handled
as XML-RPC method calls. HTTP GET requests are handled by generating pydoc-style
HTML documentation. This allows a server to provide its own web-based
documentation.
DocCGIXMLRPCRequestHandler.set_server_title(server_title)~
Set the title used in the generated HTML documentation. This title will be used
inside the HTML "title" element.
DocCGIXMLRPCRequestHandler.set_server_name(server_name)~
Set the name used in the generated HTML documentation. This name will appear at
the top of the generated documentation inside a "h1" element.
DocCGIXMLRPCRequestHandler.set_server_documentation(server_documentation)~
Set the description used in the generated HTML documentation. This description
will appear as a paragraph, below the server name, in the documentation.
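The three ``set_server_*`` methods are shared by both classes. The following is
a minimal sketch using DocCGIXMLRPCRequestHandler, chosen only because it needs
no socket. The function ``add`` and all the strings are made up for the example,
and ``generate_html_documentation`` is a method of the underlying documentation
generator that is not described in this section.

```python
try:
    from DocXMLRPCServer import DocCGIXMLRPCRequestHandler  # Python 2
except ImportError:
    # Python 3 location, per the note at the top of this section.
    from xmlrpc.server import DocCGIXMLRPCRequestHandler


def add(x, y):
    """Return the sum of x and y."""
    return x + y


handler = DocCGIXMLRPCRequestHandler()
handler.set_server_title("Demo title")    # used inside the HTML "title" element
handler.set_server_name("Demo server")    # shown at the top of the page
handler.set_server_documentation("A tiny self-documenting service.")
handler.register_function(add)

# Build the page that a GET request would receive, without doing any I/O.
page = handler.generate_html_documentation()
print("Demo title" in page and "Demo server" in page)
```

In an actual CGI script the last two lines would be replaced by a call to
``handler.handle_request()``, which answers GET requests with this page and POST
requests with the XML-RPC result.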
==============================================================================
*py2stdlib-dumbdbm*
dumbdbm~
:synopsis: Portable implementation of the simple DBM interface.
.. note::
The dumbdbm (|py2stdlib-dumbdbm|) module has been renamed to dbm.dumb in Python 3.0.
The 2to3 tool will automatically adapt imports when converting your
sources to 3.0.
.. index:: single: databases
.. note::
The dumbdbm (|py2stdlib-dumbdbm|) module is intended as a last resort fallback for the
anydbm (|py2stdlib-anydbm|) module when no more robust module is available. The dumbdbm (|py2stdlib-dumbdbm|)
module is not written for speed and is not nearly as heavily used as the other
database modules.
The dumbdbm (|py2stdlib-dumbdbm|) module provides a persistent dictionary-like interface which
is written entirely in Python. Unlike other modules such as gdbm (|py2stdlib-gdbm|) and
bsddb (|py2stdlib-bsddb|), no external library is required. As with other persistent
mappings, the keys and values must always be strings.
The module defines the following:
error~
Raised on dumbdbm-specific errors, such as I/O errors. KeyError is
raised for general mapping errors like specifying an incorrect key.
open(filename[, flag[, mode]])~
Open a dumbdbm database and return a dumbdbm object. The {filename} argument is
the basename of the database file (without any specific extensions). When a
dumbdbm database is created, files with .dat and .dir extensions
are created.
The optional {flag} argument is currently ignored; the database is always opened
for update, and will be created if it does not exist.
The optional {mode} argument is the Unix mode of the file, used only when the
database has to be created. It defaults to octal ``0666`` (and will be modified
by the prevailing umask).
.. versionchanged:: 2.2
The {mode} argument was ignored in earlier versions.
.. seealso::
Module anydbm (|py2stdlib-anydbm|)
Generic interface to ``dbm``-style databases.
Module dbm (|py2stdlib-dbm|)
Similar interface to the DBM/NDBM library.
Module gdbm (|py2stdlib-gdbm|)
Similar interface to the GNU GDBM library.
Module shelve (|py2stdlib-shelve|)
Persistence module which stores non-string data.
Module whichdb (|py2stdlib-whichdb|)
Utility module used to determine the type of an existing database.
Dumbdbm Objects
---------------
In addition to the methods provided by the UserDict.DictMixin class,
dumbdbm (|py2stdlib-dumbdbm|) objects provide the following methods.
dumbdbm.sync()~
Synchronize the on-disk directory and data files. This method is called by the
sync method of Shelve objects.
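A short sketch of typical use, assuming a scratch directory is acceptable: it
stores one entry, calls sync, and lists the files created next to the basename.
The names ``example`` and ``greeting`` are arbitrary, and the fallback import
uses the Python 3 name mentioned in the note above.

```python
import os
import shutil
import tempfile

try:
    import dumbdbm                  # Python 2
except ImportError:
    import dbm.dumb as dumbdbm      # renamed in Python 3

workdir = tempfile.mkdtemp()
basename = os.path.join(workdir, "example")   # no extension: dumbdbm adds its own

db = dumbdbm.open(basename)
db["greeting"] = "hello"
db.sync()                           # write the directory and data files to disk
created = sorted(os.listdir(workdir))
db.close()

# Reopen the database to confirm that the entry survived.
db = dumbdbm.open(basename)
stored = db["greeting"]
db.close()
shutil.rmtree(workdir)

print(created)
```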
==============================================================================
*py2stdlib-dummy_thread*
dummy_thread~
:synopsis: Drop-in replacement for the thread module.
.. note::
The dummy_thread (|py2stdlib-dummy_thread|) module has been renamed to _dummy_thread in
Python 3.0. The 2to3 tool will automatically adapt imports when
converting your sources to 3.0; however, you should consider using the
high-level dummy_threading (|py2stdlib-dummy_threading|) module instead.
This module provides a duplicate interface to the thread (|py2stdlib-thread|) module. It is
meant to be imported when the thread (|py2stdlib-thread|) module is not provided on a
platform.
Suggested usage is: >
    try:
        import thread as _thread
    except ImportError:
        import dummy_thread as _thread
<
Be careful to not use this module where deadlock might occur from a thread
being created that blocks waiting for another thread to be created. This often
occurs with blocking I/O.
==============================================================================
*py2stdlib-dummy_threading*
dummy_threading~
:synopsis: Drop-in replacement for the threading module.
This module provides a duplicate interface to the threading (|py2stdlib-threading|) module. It
is meant to be imported when the thread (|py2stdlib-thread|) module is not provided on a
platform.
Suggested usage is: >
    try:
        import threading as _threading
    except ImportError:
        import dummy_threading as _threading
<
Be careful to not use this module where deadlock might occur from a thread
being created that blocks waiting for another thread to be created. This often
occurs with blocking I/O.
==============================================================================
*py2stdlib-device*
DEVICE~
:platform: IRIX
:synopsis: Constants used with the gl module.
:deprecated:
2.6~
The DEVICE (|py2stdlib-device|) module has been deprecated for removal in Python 3.0.
This module defines the constants used by the Silicon Graphics *Graphics
Library* that C programmers find in the header file ``<gl/device.h>``. Read the
module source file for details.
GL (|py2stdlib-gl^|) --- Constants used with the gl (|py2stdlib-gl|) module
======================================================
==============================================================================
*py2stdlib-encodings.idna*
encodings.idna~
:synopsis: Internationalized Domain Names implementation
.. versionadded:: 2.3
This module implements RFC 3490 (Internationalized Domain Names in
Applications) and RFC 3491 (Nameprep: A Stringprep Profile for
Internationalized Domain Names (IDN)). It builds upon the ``punycode`` encoding
and stringprep (|py2stdlib-stringprep|).
These RFCs together define a protocol to support non-ASCII characters in domain
names. A domain name containing non-ASCII characters (such as
``www.Alliancefrançaise.nu``) is converted into an ASCII-compatible encoding
(ACE, such as ``www.xn--alliancefranaise-npb.nu``). The ACE form of the domain
name is then used in all places where arbitrary characters are not allowed by
the protocol, such as DNS queries, HTTP Host fields, and so
on. This conversion is carried out in the application, if possible invisibly to
the user: the application should transparently convert Unicode domain labels to
IDNA on the wire, and convert ACE labels back to Unicode before presenting them
to the user.
Python supports this conversion in several ways: the ``idna`` codec converts
between Unicode and the ACE. Furthermore, the socket (|py2stdlib-socket|) module
transparently converts Unicode host names to ACE, so that applications need not
be concerned about converting host names themselves when they pass them to the
socket module. On top of that, modules that have host names as function
parameters, such as httplib (|py2stdlib-httplib|) and ftplib (|py2stdlib-ftplib|), accept Unicode host names
(httplib (|py2stdlib-httplib|) then also transparently sends an IDNA hostname in the
Host field if it sends that field at all).
When receiving host names from the wire (such as in reverse name lookup), no
automatic conversion to Unicode is performed: Applications wishing to present
such host names to the user should decode them to Unicode.
The module encodings.idna (|py2stdlib-encodings.idna|) also implements the nameprep procedure, which
performs certain normalizations on host names, to achieve case-insensitivity of
international domain names, and to unify similar characters. The nameprep
functions can be used directly if desired.
nameprep(label)~
Return the nameprepped version of {label}. The implementation currently assumes
query strings, so ``AllowUnassigned`` is true.
ToASCII(label)~
Convert a label to ASCII, as specified in RFC 3490. ``UseSTD3ASCIIRules`` is
assumed to be false.
ToUnicode(label)~
Convert a label to Unicode, as specified in RFC 3490.
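The conversions described above can be tried out as follows. This is a small
sketch that reuses the host name from the text; the variable names are
arbitrary. The codec works on whole host names, while the three module
functions work on a single label.

```python
import encodings.idna as idna

host = u"www.Alliancefran\xe7aise.nu"

# Whole host names: use the "idna" codec.
ace = host.encode("idna")
round_trip = ace.decode("idna")

# Single labels: use the module-level functions.
label = u"Alliancefran\xe7aise"
prepped = idna.nameprep(label)         # normalized (for example, case-folded)
ace_label = idna.ToASCII(label)        # ACE form of this one label
unicode_label = idna.ToUnicode(ace_label)

print(repr(ace))
print(repr(ace_label))
```

Note that the round trip does not restore the original capitalization, because
nameprep folds case before encoding.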
==============================================================================
*py2stdlib-encodings.utf_8_sig*
encodings.utf_8_sig~
:synopsis: UTF-8 codec with BOM signature
.. versionadded:: 2.5
This module implements a variant of the UTF-8 codec. On encoding, a UTF-8
encoded BOM is prepended to the UTF-8 encoded bytes. For the stateful encoder
this is only done once (on the first write to the byte stream). On decoding, an
optional UTF-8 encoded BOM at the start of the data is skipped.
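A brief sketch of this behavior, using the codec name ``utf-8-sig`` and an
incremental encoder to stand in for the stateful case:

```python
import codecs

text = u"abc"
encoded = text.encode("utf-8-sig")           # BOM followed by the UTF-8 bytes
with_bom = encoded.decode("utf-8-sig")       # the BOM is skipped
without_bom = b"abc".decode("utf-8-sig")     # the BOM is optional when decoding

# The stateful (incremental) encoder emits the BOM on the first write only.
encoder = codecs.getincrementalencoder("utf-8-sig")()
first_chunk = encoder.encode(u"ab")
second_chunk = encoder.encode(u"c")

print(repr(encoded))
```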
==============================================================================
*py2stdlib-easydialogs*
EasyDialogs~
:platform: Mac
:synopsis: Basic Macintosh dialogs.
:deprecated:
The EasyDialogs (|py2stdlib-easydialogs|) module contains some simple dialogs for the Macintosh.
The dialogs get launched in a separate application which appears in the dock and
must be clicked on for the dialogs to be displayed. All routines take an optional
resource ID parameter {id} with which one can override the DLOG
resource used for the dialog, provided that the dialog items correspond (both
type and item number) to those in the default DLOG resource. See source
code for details.
.. note::
This module has been removed in Python 3.x.
The EasyDialogs (|py2stdlib-easydialogs|) module defines the following functions:
Message(str[, id[, ok]])~
Displays a modal dialog with the message text {str}, which should be at most 255
characters long. The button text defaults to "OK", but is set to the string
argument {ok} if the latter is supplied. Control is returned when the user
clicks the "OK" button.
AskString(prompt[, default[, id[, ok[, cancel]]]])~
Asks the user to input a string value via a modal dialog. {prompt} is the prompt
message, and the optional {default} supplies the initial value for the string
(otherwise ``""`` is used). The text of the "OK" and "Cancel" buttons can be
changed with the {ok} and {cancel} arguments. All strings can be at most 255
bytes long. AskString returns the string entered or None in
case the user cancelled.
AskPassword(prompt[, default[, id[, ok[, cancel]]]])~
Asks the user to input a string value via a modal dialog. Like
AskString, but with the text shown as bullets. The arguments have the
same meaning as for AskString.
AskYesNoCancel(question[, default[, yes[, no[, cancel[, id]]]]])~
Presents a dialog with prompt {question} and three buttons labelled "Yes", "No",
and "Cancel". Returns ``1`` for "Yes", ``0`` for "No" and ``-1`` for "Cancel".
The value of {default} (or ``0`` if {default} is not supplied) is returned when
the RETURN key is pressed. The text of the buttons can be changed with
the {yes}, {no}, and {cancel} arguments; to prevent a button from appearing,
supply ``""`` for the corresponding argument.
ProgressBar([title[, maxval[, label[, id]]]])~
Displays a modeless progress-bar dialog. This is the constructor for the
ProgressBar class described below. {title} is the text string displayed
(default "Working..."), {maxval} is the value at which progress is complete
(default ``0``, indicating that an indeterminate amount of work remains to be
done), and {label} is the text that is displayed above the progress bar itself.
GetArgv([optionlist[, commandlist[, addoldfile[, addnewfile[, addfolder[, id]]]]]])~
Displays a dialog which aids the user in constructing a command-line argument
list. Returns the list in ``sys.argv`` format, suitable for passing as an
argument to getopt.getopt. {addoldfile}, {addnewfile}, and {addfolder}
are boolean arguments. When nonzero, they enable the user to insert into the
command line paths to an existing file, a (possibly) not-yet-existent file, and
a folder, respectively. (Note: Option arguments must appear in the command line
before file and folder arguments in order to be recognized by
getopt.getopt.) Arguments containing spaces can be specified by
enclosing them within single or double quotes. A SystemExit exception is
raised if the user presses the "Cancel" button.
{optionlist} is a list that determines a popup menu from which the allowed
options are selected. Its items can take one of two forms: {optstr} or
``(optstr, descr)``. When present, {descr} is a short descriptive string that
is displayed in the dialog while this option is selected in the popup menu. The
correspondence between {optstr}s and command-line arguments is:
+----------------------+------------------------------------------+
| {optstr} format | Command-line format |
+======================+==========================================+
| ``x`` | -x (short option) |
+----------------------+------------------------------------------+
| ``x:`` or ``x=`` | -x (short option with value) |
+----------------------+------------------------------------------+
| ``xyz`` | --xyz (long option) |
+----------------------+------------------------------------------+
| ``xyz:`` or ``xyz=`` | --xyz (long option with value) |
+----------------------+------------------------------------------+
{commandlist} is a list of items of the form {cmdstr} or ``(cmdstr, descr)``,
where {descr} is as above. The {cmdstr}s will appear in a popup menu. When
chosen, the text of {cmdstr} will be appended to the command line as is, except
that a trailing ``':'`` or ``'='`` (if present) will be trimmed off.
.. versionadded:: 2.0
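The {optstr} table and the {cmdstr} trimming rule can be read as two small
string transformations. The helpers below are not part of EasyDialogs;
``optstr_to_flag`` and ``strip_cmdstr`` are hypothetical names, used only to
restate the documented mapping in code.

```python
def optstr_to_flag(optstr):
    """Return (flag, takes_value) for one optstr, following the table above."""
    takes_value = optstr.endswith((":", "="))
    name = optstr[:-1] if takes_value else optstr
    # One-character names are short options, longer names are long options.
    prefix = "-" if len(name) == 1 else "--"
    return prefix + name, takes_value


def strip_cmdstr(cmdstr):
    """Return the text appended to the command line for one cmdstr."""
    return cmdstr[:-1] if cmdstr.endswith((":", "=")) else cmdstr


for example in ("x", "x:", "xyz", "xyz="):
    print(example, optstr_to_flag(example))
```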
AskFileForOpen( [message] [, typeList] [, defaultLocation] [, defaultOptionFlags] [, location] [, clientName] [, windowTitle] [, actionButtonLabel] [, cancelButtonLabel] [, preferenceKey] [, popupExtension] [, eventProc] [, previewProc] [, filterProc] [, wanted] )~
Post a dialog asking the user for a file to open, and return the file selected
or None if the user cancelled. {message} is a text message to display,
{typeList} is a list of 4-char filetypes allowable, {defaultLocation} is the
pathname, FSSpec or FSRef of the folder to show initially,
{location} is the ``(x, y)`` position on the screen where the dialog is shown,
{actionButtonLabel} is a string to show instead of "Open" in the OK button,
{cancelButtonLabel} is a string to show instead of "Cancel" in the cancel
button, {wanted} is the type of value wanted as a return: str,
unicode, FSSpec, FSRef and subtypes thereof are
acceptable.
.. index:: single: Navigation Services
For a description of the other arguments please see the Apple Navigation
Services documentation and the EasyDialogs (|py2stdlib-easydialogs|) source code.
AskFileForSave( [message] [, savedFileName] [, defaultLocation] [, defaultOptionFlags] [, location] [, clientName] [, windowTitle] [, actionButtonLabel] [, cancelButtonLabel] [, preferenceKey] [, popupExtension] [, fileType] [, fileCreator] [, eventProc] [, wanted] )~
Post a dialog asking the user for a file to save to, and return the file
selected or None if the user cancelled. {savedFileName} is the default
for the file name to save to (the return value). See AskFileForOpen for
a description of the other arguments.
AskFolder( [message] [, defaultLocation] [, defaultOptionFlags] [, location] [, clientName] [, windowTitle] [, actionButtonLabel] [, cancelButtonLabel] [, preferenceKey] [, popupExtension] [, eventProc] [, filterProc] [, wanted] )~
Post a dialog asking the user to select a folder, and return the folder selected
or None if the user cancelled. See AskFileForOpen for a
description of the arguments.
.. seealso::
`Navigation Services Reference <http://developer.apple.com/documentation/Carbon/Reference/Navigation_Services_Ref/>`_
Programmer's reference documentation for the Navigation Services, a part of the
Carbon framework.
ProgressBar Objects
-------------------
ProgressBar objects provide support for modeless progress-bar dialogs.
Both determinate (thermometer style) and indeterminate (barber-pole style)
progress bars are supported. The bar will be determinate if its maximum value
is greater than zero; otherwise it will be indeterminate.
.. versionchanged:: 2.2
Support for indeterminate-style progress bars was added.
The dialog is displayed immediately after creation. If the dialog's "Cancel"
button is pressed, or if Cmd-. or ESC is typed, the dialog window
is hidden and KeyboardInterrupt is raised (but note that this response
does not occur until the progress bar is next updated, typically via a call to
inc or set). Otherwise, the bar remains visible until the
ProgressBar object is discarded.
ProgressBar objects possess the following attributes and methods:
ProgressBar.curval~
The current value (of type integer or long integer) of the progress bar. The
normal access methods coerce curval between ``0`` and maxval.
This attribute should not be altered directly.
ProgressBar.maxval~
The maximum value (of type integer or long integer) of the progress bar; the
progress bar (thermometer style) is full when curval equals
maxval. If maxval is ``0``, the bar will be indeterminate
(barber-pole). This attribute should not be altered directly.
ProgressBar.title([newstr])~
Sets the text in the title bar of the progress dialog to {newstr}.
ProgressBar.label([newstr])~
Sets the text in the progress box of the progress dialog to {newstr}.
ProgressBar.set(value[, max])~
Sets the progress bar's curval to {value}, and also maxval to
{max} if the latter is provided. {value} is first coerced between 0 and
maxval. The thermometer bar is updated to reflect the changes,
including a change from indeterminate to determinate or vice versa.
ProgressBar.inc([n])~
Increments the progress bar's curval by {n}, or by ``1`` if {n} is not
provided. (Note that {n} may be negative, in which case the effect is a
decrement.) The progress bar is updated to reflect the change. If the bar is
indeterminate, this causes one "spin" of the barber pole. The resulting
curval is coerced between 0 and maxval if incrementing causes it
to fall outside this range.
==============================================================================
*py2stdlib-email.charset*
email.charset~
:synopsis: Character Sets
This module provides a class Charset for representing character sets
and character set conversions in email messages, as well as a character set
registry and several convenience methods for manipulating this registry.
Instances of Charset are used in several other modules within the
email (|py2stdlib-email|) package.
Import this class from the email.charset (|py2stdlib-email.charset|) module.
.. versionadded:: 2.2.2
Charset([input_charset])~
Map character sets to their email properties.
This class provides information about the requirements imposed on email for a
specific character set. It also provides convenience routines for converting
between character sets, given the availability of the applicable codecs. Given
a character set, it will do its best to provide information on how to use that
character set in an email message in an RFC-compliant way.
Certain character sets must be encoded with quoted-printable or base64 when used
in email headers or bodies. Certain character sets must be converted outright,
and are not allowed in email.
Optional {input_charset} is as described below; it is always coerced to lower
case. After being alias normalized it is also used as a lookup into the
registry of character sets to find out the header encoding, body encoding, and
output conversion codec to be used for the character set. For example, if
{input_charset} is ``iso-8859-1``, then headers and bodies will be encoded using
quoted-printable and no output conversion codec is necessary. If
{input_charset} is ``euc-jp``, then headers will be encoded with base64, bodies
will not be encoded, but output text will be converted from the ``euc-jp``
character set to the ``iso-2022-jp`` character set.
Charset instances have the following data attributes:
input_charset~
The initial character set specified. Common aliases are converted to
their {official} email names (e.g. ``latin_1`` is converted to
``iso-8859-1``). Defaults to 7-bit ``us-ascii``.
header_encoding~
If the character set must be encoded before it can be used in an email
header, this attribute will be set to ``Charset.QP`` (for
quoted-printable), ``Charset.BASE64`` (for base64 encoding), or
``Charset.SHORTEST`` for the shortest of QP or BASE64 encoding. Otherwise,
it will be ``None``.
body_encoding~
Same as {header_encoding}, but describes the encoding for the mail
message's body, which indeed may be different than the header encoding.
``Charset.SHORTEST`` is not allowed for {body_encoding}.
output_charset~
Some character sets must be converted before they can be used in email headers
or bodies. If the {input_charset} is one of them, this attribute will
contain the name of the character set output will be converted to. Otherwise, it will
be ``None``.
input_codec~
The name of the Python codec used to convert the {input_charset} to
Unicode. If no conversion codec is necessary, this attribute will be
``None``.
output_codec~
The name of the Python codec used to convert Unicode to the
{output_charset}. If no conversion codec is necessary, this attribute
will have the same value as the {input_codec}.
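The two character sets used as examples above can be inspected directly. This
is a small sketch; the variable names are arbitrary, and the encoding constants
are read from the email.charset module itself.

```python
from email import charset
from email.charset import Charset

latin = Charset("latin_1")       # an alias, normalized to the official name
japanese = Charset("euc-jp")

print(latin.input_charset)
print(latin.header_encoding == charset.QP)
print(latin.body_encoding == charset.QP)

print(japanese.header_encoding == charset.BASE64)
print(japanese.body_encoding)    # bodies are not encoded for euc-jp
print(japanese.output_charset)   # output text is converted to this charset
```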
Charset instances also have the following methods:
get_body_encoding()~
Return the content transfer encoding used for body encoding.
This is either the string ``quoted-printable`` or ``base64`` depending on
the encoding used, or it is a function, in which case you should call the
function with a single argument, the Message object being encoded. The
function should then set the Content-Transfer-Encoding
header itself to whatever is appropriate.
Returns the string ``quoted-printable`` if {body_encoding} is ``QP``,
returns the string ``base64`` if {body_encoding} is ``BASE64``, and
returns the string ``7bit`` otherwise.
convert(s)~
Convert the string {s} from the {input_codec} to the {output_codec}.
to_splittable(s)~
Convert a possibly multibyte string to a safely splittable format. {s} is
the string to split.
Uses the {input_codec} to try and convert the string to Unicode, so it can
be safely split on character boundaries (even for multibyte characters).
Returns the string as-is if it isn't known how to convert {s} to Unicode
with the {input_charset}.
Characters that could not be converted to Unicode will be replaced with
the Unicode replacement character ``'U+FFFD'``.
from_splittable(ustr[, to_output])~
Convert a splittable string back into an encoded string. {ustr} is a
Unicode string to "unsplit".
This method uses the proper codec to try and convert the string from
Unicode back into an encoded format. Return the string as-is if it is not
Unicode, or if it could not be converted from Unicode.
Characters that could not be converted from Unicode will be replaced with
an appropriate character (usually ``'?'``).
If {to_output} is ``True`` (the default), uses {output_codec} to convert
to an encoded format. If {to_output} is ``False``, it uses {input_codec}.
get_output_charset()~
Return the output character set.
This is the {output_charset} attribute if that is not ``None``, otherwise
it is {input_charset}.
encoded_header_len()~
Return the length of the encoded header string, properly calculating for
quoted-printable or base64 encoding.
header_encode(s[, convert])~
Header-encode the string {s}.
If {convert} is ``True``, the string will be converted from the input
charset to the output charset automatically. This is not useful for
multibyte character sets, which have line length issues (multibyte
characters must be split on a character, not a byte boundary); use the
higher-level email.header.Header class to deal with these issues
(see email.header (|py2stdlib-email.header|)). {convert} defaults to ``False``.
The type of encoding (base64 or quoted-printable) will be based on the
{header_encoding} attribute.
body_encode(s[, convert])~
Body-encode the string {s}.
If {convert} is ``True`` (the default), the string will be converted from
the input charset to output charset automatically. Unlike
header_encode, there are no issues with byte boundaries and
multibyte charsets in email bodies, so this is usually pretty safe.
The type of encoding (base64 or quoted-printable) will be based on the
{body_encoding} attribute.
The Charset class also provides a number of methods to support
standard operations and built-in functions.
__str__()~
Returns {input_charset} as a string coerced to lower
case. __repr__ is an alias for __str__.
__eq__(other)~
This method allows you to compare two Charset instances for
equality.
__ne__(other)~
This method allows you to compare two Charset instances for
inequality.
The email.charset (|py2stdlib-email.charset|) module also provides the following functions for adding
new entries to the global character set, alias, and codec registries:
add_charset(charset[, header_enc[, body_enc[, output_charset]]])~
Add character properties to the global registry.
{charset} is the input character set, and must be the canonical name of a
character set.
Optional {header_enc} and {body_enc} are each either ``Charset.QP`` for
quoted-printable, ``Charset.BASE64`` for base64 encoding,
``Charset.SHORTEST`` for the shortest of quoted-printable or base64 encoding,
or ``None`` for no encoding. ``SHORTEST`` is only valid for
{header_enc}. The default is ``None`` for no encoding.
Optional {output_charset} is the character set that the output should be in.
Conversions will proceed from input charset, to Unicode, to the output charset
when the method Charset.convert is called. The default is to output in
the same character set as the input.
Both {input_charset} and {output_charset} must have Unicode codec entries in the
module's character set-to-codec mapping; use add_codec to add codecs the
module does not know about. See the codecs (|py2stdlib-codecs|) module's documentation for
more information.
The global character set registry is kept in the module global dictionary
``CHARSETS``.
add_alias(alias, canonical)~
Add a character set alias. {alias} is the alias name, e.g. ``latin-1``.
{canonical} is the character set's canonical name, e.g. ``iso-8859-1``.
The global charset alias registry is kept in the module global dictionary
``ALIASES``.
add_codec(charset, codecname)~
Add a codec that maps characters in the given character set to and from Unicode.
{charset} is the canonical name of a character set. {codecname} is the name of a
Python codec, as appropriate for the second argument to the unicode
built-in, or to the encode method of a Unicode string.
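A sketch of the three registry functions working together. The names
``x-demo-charset`` and ``x-demo-alias`` are made up for the example, and
``CODEC_MAP`` is assumed to be the module-level dictionary behind add_codec.
The entries are removed again at the end so the global registries stay
unchanged.

```python
from email import charset
from email.charset import Charset

# Register a hypothetical character set, an alias for it, and its codec.
charset.add_charset("x-demo-charset", charset.QP, charset.BASE64, "utf-8")
charset.add_alias("x-demo-alias", "x-demo-charset")
charset.add_codec("x-demo-charset", "latin-1")

demo = Charset("x-demo-alias")   # the alias resolves to the canonical name
registry_entry = charset.CHARSETS["x-demo-charset"]
alias_target = charset.ALIASES["x-demo-alias"]

print(demo.input_charset, demo.output_charset, demo.input_codec)

# Clean up the demo entries.
del charset.CHARSETS["x-demo-charset"]
del charset.ALIASES["x-demo-alias"]
del charset.CODEC_MAP["x-demo-charset"]
```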
==============================================================================
*py2stdlib-email.encoders*
email.encoders~
:synopsis: Encoders for email message payloads.
When creating email.message.Message objects from scratch, you often
need to encode the payloads for transport through compliant mail servers. This
is especially true for image/* and text/* type messages
containing binary data.
The email (|py2stdlib-email|) package provides some convenient encodings in its
encoders module. These encoders are actually used by the
email.mime.audio.MIMEAudio and email.mime.image.MIMEImage
class constructors to provide default encodings. All encoder functions take
exactly one argument, the message object to encode. They usually extract the
payload, encode it, and reset the payload to this newly encoded value. They
should also set the Content-Transfer-Encoding header as appropriate.
Here are the encoding functions provided:
encode_quopri(msg)~
Encodes the payload into quoted-printable form and sets the
Content-Transfer-Encoding header to ``quoted-printable`` [#]_.
This is a good encoding to use when most of your payload is normal printable
data, but contains a few unprintable characters.
encode_base64(msg)~
Encodes the payload into base64 form and sets the
Content-Transfer-Encoding header to ``base64``. This is a good
encoding to use when most of your payload is unprintable data since it is a more
compact form than quoted-printable. The drawback of base64 encoding is that it
renders the text non-human readable.
encode_7or8bit(msg)~
This doesn't actually modify the message's payload, but it does set the
Content-Transfer-Encoding header to either ``7bit`` or ``8bit`` as
appropriate, based on the payload data.
encode_noop(msg)~
This does nothing; it doesn't even set the
Content-Transfer-Encoding header.
.. rubric:: Footnotes
.. [#] Note that encoding with encode_quopri also encodes all tabs and space
characters in the data.
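The encoders can be applied by hand to freshly created messages, as in this
sketch. The payloads are arbitrary sample data, and a separate Message object
is used for each encoder because each call adds its own
Content-Transfer-Encoding header.

```python
import base64
from email import encoders
from email.message import Message

binary = b"\x00\x01\x02 raw bytes"
attachment = Message()
attachment.set_payload(binary)
encoders.encode_base64(attachment)        # payload becomes base64 text

note = Message()
note.set_payload("hello world")
encoders.encode_quopri(note)              # spaces are encoded too (see footnote)

plain = Message()
plain.set_payload("plain ascii text")
encoders.encode_7or8bit(plain)            # payload is left as it is

for message in (attachment, note, plain):
    print(message["Content-Transfer-Encoding"])
```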
==============================================================================
*py2stdlib-email.errors*
email.errors~
:synopsis: The exception classes used by the email package.
The following exception classes are defined in the email.errors (|py2stdlib-email.errors|) module:
MessageError()~
This is the base class for all exceptions that the email (|py2stdlib-email|) package can
raise. It is derived from the standard Exception class and defines no
additional methods.
MessageParseError()~
This is the base class for exceptions thrown by the email.parser.Parser
class. It is derived from MessageError.
HeaderParseError()~
Raised under some error conditions when parsing the RFC 2822 headers of a
message, this class is derived from MessageParseError. It can be raised
from the Parser.parse or Parser.parsestr methods.
Situations where it can be raised include finding an envelope header after the
first RFC 2822 header of the message, finding a continuation line before the
first RFC 2822 header is found, or finding a line in the headers which is
neither a header nor a continuation line.
BoundaryError()~
Raised under some error conditions when parsing the RFC 2822 headers of a
message, this class is derived from MessageParseError. It can be raised
from the Parser.parse or Parser.parsestr methods.
Situations where it can be raised include not being able to find the starting or
terminating boundary in a multipart/\* message when strict parsing
is used.
MultipartConversionError()~
Raised when a payload is added to a Message object using
add_payload, but the payload is already a scalar and the message's
Content-Type main type is not either multipart or
missing. MultipartConversionError multiply inherits from
MessageError and the built-in TypeError.
Since Message.add_payload is deprecated, this exception is rarely raised
in practice. However the exception may also be raised if the attach
method is called on an instance of a class derived from
email.mime.nonmultipart.MIMENonMultipart (e.g.
email.mime.image.MIMEImage).
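The attach case described above can be sketched as follows. This is only an illustration using MIMEText as the non-multipart class; the variable names are not part of the package.

```python
from email.errors import MessageError, MultipartConversionError
from email.mime.text import MIMEText

text_part = MIMEText('hello')

try:
    # MIMEText derives from MIMENonMultipart, so it cannot hold subparts.
    text_part.attach(MIMEText('world'))
except MultipartConversionError as exc:
    caught = exc
else:
    caught = None
```

Because of the multiple inheritance, the same exception can be caught either as a MessageError or as a TypeError.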
Here's the list of the defects that the email.parser.FeedParser
can find while parsing messages. Note that the defects are added to the message
where the problem was found, so for example, if a message nested inside a
multipart/alternative had a malformed header, that nested message
object would have a defect, but the containing messages would not.
All defect classes are subclassed from email.errors.MessageDefect, but
this class is {not} an exception!
.. versionadded:: 2.4
All the defect classes were added.
* NoBoundaryInMultipartDefect -- A message claimed to be a multipart,
but had no boundary parameter.
* StartBoundaryNotFoundDefect -- The start boundary claimed in the
Content-Type header was never found.
* FirstHeaderLineIsContinuationDefect -- The message had a continuation
line as its first header line.
* MisplacedEnvelopeHeaderDefect -- A "Unix From" header was found in the
middle of a header block.
* MalformedHeaderDefect -- A header was found that was missing a colon,
or was otherwise malformed.
* MultipartInvariantViolationDefect -- A message claimed to be a
multipart, but no subparts were found. Note that when a message has
this defect, its is_multipart method may return false even though its
content type claims to be multipart.
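A short sketch of how defects surface in practice, assuming the FeedParser is imported from email.parser: the input below claims to be a multipart but supplies no boundary parameter. The sample text and variable names are made up for illustration.

```python
from email.errors import MessageDefect, NoBoundaryInMultipartDefect
from email.parser import FeedParser

# The header claims a multipart, but no boundary parameter is given.
raw = (
    'Content-Type: multipart/mixed\n'
    '\n'
    'This body cannot be split into subparts.\n'
)

parser = FeedParser()
parser.feed(raw)
msg = parser.close()

# Parsing does not raise; the problems are recorded on the message instead.
defect_names = [d.__class__.__name__ for d in msg.defects]
has_no_boundary_defect = any(
    isinstance(d, NoBoundaryInMultipartDefect) for d in msg.defects)
```

Inspecting ``msg.defects`` after parsing is a reasonable way to decide whether a message needs special handling, since is_multipart may report false for such input.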
==============================================================================
*py2stdlib-email.generator*
email.generator~
:synopsis: Generate flat text email messages from a message structure.
One of the most common tasks is to generate the flat text of the email message
represented by a message object structure. You will need to do this if you want
to send your message via the smtplib (|py2stdlib-smtplib|) module or the nntplib (|py2stdlib-nntplib|) module,
or print the message on the console. Taking a message object structure and
producing a flat text document is the job of the Generator class.
Again, as with the email.parser (|py2stdlib-email.parser|) module, you aren't limited to the
functionality of the bundled generator; you could write one from scratch
yourself. However the bundled generator knows how to generate most email in a
standards-compliant way, should handle MIME and non-MIME email messages just
fine, and is designed so that the transformation from flat text, to a message
structure via the email.parser.Parser class, and back to flat text,
is idempotent (the input is identical to the output). On the other hand, using
the Generator on an email.message.Message constructed by a program may
result in changes to the email.message.Message object as defaults are
filled in.
Here are the public methods of the Generator class, imported from the
email.generator (|py2stdlib-email.generator|) module:
Generator(outfp[, mangle_from_[, maxheaderlen]])~
The constructor for the Generator class takes a file-like object called
{outfp} for an argument. {outfp} must support the write method and be
usable as the output file in a Python extended print statement.
Optional {mangle_from_} is a flag that, when ``True``, puts a ``>`` character in
front of any line in the body that starts exactly as ``From``, i.e. ``From``
followed by a space at the beginning of the line. This is the only guaranteed
portable way to avoid having such lines be mistaken for a Unix mailbox format
envelope header separator (see "WHY THE CONTENT-LENGTH FORMAT IS BAD",
http://www.jwz.org/doc/content-length.html, for details). {mangle_from_}
defaults to ``True``, but you might want to set this to ``False`` if you are not
writing Unix mailbox format files.
Optional {maxheaderlen} specifies the longest length for a non-continued header.
When a header line is longer than {maxheaderlen} (in characters, with tabs
expanded to 8 spaces), the header will be split as defined in the
email.header.Header class. Set to zero to disable header wrapping.
The default is 78, as recommended (but not required) by RFC 2822.
The other public Generator methods are:
flatten(msg[, unixfrom])~
Print the textual representation of the message object structure rooted at
{msg} to the output file specified when the Generator instance
was created. Subparts are visited depth-first and the resulting text will
be properly MIME encoded.
Optional {unixfrom} is a flag that forces the printing of the envelope
header delimiter before the first RFC 2822 header of the root message
object. If the root object has no envelope header, a standard one is
crafted. By default, this is set to ``False`` to inhibit the printing of
the envelope delimiter.
Note that for subparts, no envelope header is ever printed.
.. versionadded:: 2.2.2
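The effect of {mangle_from_} and {unixfrom} can be sketched as below. The helper ``flatten_to_string`` is hypothetical and exists only for this illustration; the fallback import of ``io.StringIO`` is an assumption made so the snippet also runs on interpreters that lack cStringIO.

```python
try:
    from cStringIO import StringIO  # Python 2, as documented here
except ImportError:
    from io import StringIO  # assumption: fallback for newer interpreters

from email.generator import Generator
from email.message import Message

msg = Message()
msg['Subject'] = 'Greetings'
msg.set_payload('From the desk of the editor\nSecond line\n')


def flatten_to_string(message, mangle_from_, unixfrom):
    """Hypothetical helper: flatten message and return the text."""
    fp = StringIO()
    gen = Generator(fp, mangle_from_=mangle_from_)
    gen.flatten(message, unixfrom=unixfrom)
    return fp.getvalue()


mangled = flatten_to_string(msg, True, False)        # body line gets a '>'
plain = flatten_to_string(msg, False, False)         # body left untouched
with_envelope = flatten_to_string(msg, False, True)  # envelope line first
```

Since the message has no envelope header of its own, the last call starts with a crafted ``From nobody`` line followed by the current time.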
clone(fp)~
Return an independent clone of this Generator instance with the
exact same options.
.. versionadded:: 2.2.2
write(s)~
Write the string {s} to the underlying file object, i.e. {outfp} passed to
Generator's constructor. This provides just enough file-like API
for Generator instances to be used in extended print statements.
As a convenience, see the methods Message.as_string and
``str(aMessage)``, a.k.a. Message.__str__, which simplify the generation
of a formatted string representation of a message object. For more detail, see
email.message (|py2stdlib-email.message|).
The email.generator (|py2stdlib-email.generator|) module also provides a derived class, called
DecodedGenerator which is like the Generator base class,
except that non-text parts are substituted with a format string
representing the part.
DecodedGenerator(outfp[, mangle_from_[, maxheaderlen[, fmt]]])~
This class, derived from Generator, walks through all the subparts of a
message. If the subpart is of main type text, then it prints the
decoded payload of the subpart. Optional {mangle_from_} and {maxheaderlen} are
as with the Generator base class.
If the subpart is not of main type text, optional {fmt} is a format
string that is used instead of the message payload. {fmt} is expanded with the
following keywords, ``%(keyword)s`` format:
* ``type`` -- Full MIME type of the non-text part
* ``maintype`` -- Main MIME type of the non-text part
* ``subtype`` -- Sub-MIME type of the non-text part
* ``filename`` -- Filename of the non-text part
* ``description`` -- Description associated with the non-text part
* ``encoding`` -- Content transfer encoding of the non-text part
The default value for {fmt} is ``None``, meaning :: >
[Non-text (%(type)s) part of message omitted, filename %(filename)s]
<
.. versionadded:: 2.2.2
.. versionchanged:: 2.5
The previously deprecated method __call__ was removed.
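As a rough sketch of DecodedGenerator with a custom {fmt}, the snippet below flattens a two-part message. The message contents, the file name ``data.bin`` and the format string are invented for the example.

```python
try:
    from cStringIO import StringIO  # Python 2, as documented here
except ImportError:
    from io import StringIO  # assumption: fallback for newer interpreters

from email.generator import DecodedGenerator
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

msg = MIMEMultipart()
msg['Subject'] = 'Report'
msg.attach(MIMEText('See the attached data.'))

# A non-text part; its payload is replaced by the format string.
blob = MIMEApplication(b'\x00\x01\x02')
blob.add_header('Content-Disposition', 'attachment', filename='data.bin')
msg.attach(blob)

fp = StringIO()
gen = DecodedGenerator(fp, fmt='[%(type)s omitted: %(filename)s]')
gen.flatten(msg)
summary = fp.getvalue()
```

The text part appears in ``summary`` as is, while the binary part is reduced to a single descriptive line, which can be handy when logging messages.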
==============================================================================
*py2stdlib-email.header*
email.header~
:synopsis: Representing non-ASCII headers
RFC 2822 is the base standard that describes the format of email messages.
It derives from the older RFC 822 standard which came into widespread use at
a time when most email was composed of ASCII characters only. RFC 2822 is a
specification written assuming email contains only 7-bit ASCII characters.
Of course, as email has been deployed worldwide, it has become
internationalized, such that language specific character sets can now be used in
email messages. The base standard still requires email messages to be
transferred using only 7-bit ASCII characters, so a slew of RFCs have been
written describing how to encode email containing non-ASCII characters into
RFC 2822-compliant format. These RFCs include RFC 2045, RFC 2046,
RFC 2047, and RFC 2231. The email (|py2stdlib-email|) package supports these standards
in its email.header (|py2stdlib-email.header|) and email.charset (|py2stdlib-email.charset|) modules.
If you want to include non-ASCII characters in your email headers, say in the
Subject or To fields, you should use the
Header class and assign the field in the email.message.Message
object to an instance of Header instead of using a string for the header
value. Import the Header class from the email.header (|py2stdlib-email.header|) module.
For example:: >
>>> from email.message import Message
>>> from email.header import Header
>>> msg = Message()
>>> h = Header('p\xf6stal', 'iso-8859-1')
>>> msg['Subject'] = h
>>> print msg.as_string()
Subject: =?iso-8859-1?q?p=F6stal?=
<
Notice here how we wanted the Subject field to contain a non-ASCII
character? We did this by creating a Header instance and passing in
the character set that the byte string was encoded in. When the subsequent
email.message.Message instance was flattened, the Subject
field was properly RFC 2047 encoded. MIME-aware mail readers would show this
header using the embedded ISO-8859-1 character.
.. versionadded:: 2.2.2
Here is the Header class description:
Header([s[, charset[, maxlinelen[, header_name[, continuation_ws[, errors]]]]]])~
Create a MIME-compliant header that can contain strings in different character
sets.
Optional {s} is the initial header value. If ``None`` (the default), the
initial header value is not set. You can later append to the header with
append method calls. {s} may be a byte string or a Unicode string, but
see the append documentation for semantics.
Optional {charset} serves two purposes: it has the same meaning as the {charset}
argument to the append method. It also sets the default character set
for all subsequent append calls that omit the {charset} argument. If
{charset} is not provided in the constructor (the default), the ``us-ascii``
character set is used both as {s}'s initial charset and as the default for
subsequent append calls.
The maximum line length can be specified explicitly via {maxlinelen}. For
splitting the first line to a shorter value (to account for the field header
which isn't included in {s}, e.g. Subject) pass in the name of the
field in {header_name}. The default {maxlinelen} is 76, and the default value
for {header_name} is ``None``, meaning it is not taken into account for the
first line of a long, split header.
Optional {continuation_ws} must be RFC 2822-compliant folding whitespace,
and is usually either a space or a hard tab character. This character will be
prepended to continuation lines. {continuation_ws} defaults to a single
space character (" ").
Optional {errors} is passed straight through to the append method.
append(s[, charset[, errors]])~
Append the string {s} to the MIME header.
Optional {charset}, if given, should be an email.charset.Charset
instance (see email.charset (|py2stdlib-email.charset|)) or the name of a character set, which
will be converted to an email.charset.Charset instance. A value
of ``None`` (the default) means that the {charset} given in the constructor
is used.
{s} may be a byte string or a Unicode string. If it is a byte string
(i.e. ``isinstance(s, str)`` is true), then {charset} is the encoding of
that byte string, and a UnicodeError will be raised if the string
cannot be decoded with that character set.
If {s} is a Unicode string, then {charset} is a hint specifying the
character set of the characters in the string. In this case, when
producing an RFC 2822-compliant header using RFC 2047 rules, the
Unicode string will be encoded using the following charsets in order:
``us-ascii``, the {charset} hint, ``utf-8``. The first character set to
not provoke a UnicodeError is used.
Optional {errors} is passed through to any unicode or
ustr.encode call, and defaults to "strict".
encode([splitchars])~
Encode a message header into an RFC-compliant format, possibly wrapping
long lines and encapsulating non-ASCII parts in base64 or quoted-printable
encodings. Optional {splitchars} is a string containing characters to
split long ASCII lines on, in rough support of RFC 2822's *highest
level syntactic breaks*. This doesn't affect RFC 2047 encoded lines.
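To illustrate append and encode together, here is a small sketch that builds one header from two chunks in different character sets. The header text is made up, and the exact line wrapping of the result is not asserted here.

```python
from email.header import Header

# With no charset argument, the first chunk is treated as us-ascii.
h = Header('Hello')

# Append a Unicode string together with a charset for this chunk only.
h.append(u'p\xf6stal', 'iso-8859-1')

# Produce the wire format: ASCII text plus an encoded word.
encoded = h.encode()
```

The ASCII chunk is emitted unchanged, while the second chunk becomes an encoded word naming its character set.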
The Header class also provides a number of methods to support
standard operators and built-in functions.
__str__()~
A synonym for Header.encode. Useful for ``str(aHeader)``.
__unicode__()~
A helper for the built-in unicode function. Returns the header as
a Unicode string.
__eq__(other)~
This method allows you to compare two Header instances for
equality.
__ne__(other)~
This method allows you to compare two Header instances for
inequality.
The email.header (|py2stdlib-email.header|) module also provides the following convenient functions.
decode_header(header)~
Decode a message header value without converting the character set. The header
value is in {header}.
This function returns a list of ``(decoded_string, charset)`` pairs containing
each of the decoded parts of the header. {charset} is ``None`` for non-encoded
parts of the header, otherwise a lower case string containing the name of the
character set specified in the encoded string.
Here's an example:: >
>>> from email.header import decode_header
>>> decode_header('=?iso-8859-1?q?p=F6stal?=')
[('p\xf6stal', 'iso-8859-1')]
<
make_header(decoded_seq[, maxlinelen[, header_name[, continuation_ws]]])~
Create a Header instance from a sequence of pairs as returned by
decode_header.
decode_header takes a header value string and returns a sequence of
pairs of the format ``(decoded_string, charset)`` where {charset} is the name of
the character set.
This function takes one of those sequences of pairs and returns a Header
instance. Optional {maxlinelen}, {header_name}, and {continuation_ws} are as in
the Header constructor.
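The round trip through decode_header and make_header might look like the sketch below. The ``hasattr`` check is an assumption added so that the snippet also works on interpreters where Header has no __unicode__ method.

```python
from email.header import decode_header, make_header

encoded = '=?iso-8859-1?q?p=F6stal?='

# Step 1: split the header into (decoded_string, charset) pairs.
pairs = decode_header(encoded)

# Step 2: rebuild a Header instance from those pairs.
header = make_header(pairs)

# Step 3: obtain the Unicode text. Python 2 exposes __unicode__; the
# fallback to str() is an assumption for interpreters without it.
if hasattr(header, '__unicode__'):
    text = header.__unicode__()
else:
    text = str(header)

# Encoding the rebuilt header gives back the wire format.
reencoded = header.encode()
```

This pattern is a common way to turn a raw header value from a parsed message into displayable text.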
==============================================================================
*py2stdlib-email.iterators*
email.iterators~
:synopsis: Iterate over a message object tree.
Iterating over a message object tree is fairly easy with the
Message.walk method. The email.iterators (|py2stdlib-email.iterators|) module provides some
useful higher level iterations over message object trees.
body_line_iterator(msg[, decode])~
This iterates over all the payloads in all the subparts of {msg}, returning the
string payloads line-by-line. It skips over all the subpart headers, and it
skips over any subpart with a payload that isn't a Python string. This is
somewhat equivalent to reading the flat text representation of the message from
a file using the file object's readline method, skipping over all the intervening headers.
Optional {decode} is passed through to Message.get_payload.
typed_subpart_iterator(msg[, maintype[, subtype]])~
This iterates over all the subparts of {msg}, returning only those subparts that
match the MIME type specified by {maintype} and {subtype}.
Note that {subtype} is optional; if omitted, then subpart MIME type matching is
done only with the main type. {maintype} is optional too; it defaults to
text.
Thus, by default typed_subpart_iterator returns each subpart that has a
MIME type of text/\*.
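Both iterators can be tried on a small hand-built message, as in this sketch. The message content is invented for the example.

```python
from email.iterators import body_line_iterator, typed_subpart_iterator
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

msg = MIMEMultipart('alternative')
msg.attach(MIMEText('line one\nline two\n', 'plain'))
msg.attach(MIMEText('<p>line three</p>\n', 'html'))

# All string payload lines, with the subpart headers skipped.
lines = list(body_line_iterator(msg))

# With no type arguments, every text/* subpart matches.
all_text = [p.get_content_type() for p in typed_subpart_iterator(msg)]

# Restrict the match to a single subtype.
html_only = [p.get_content_type()
             for p in typed_subpart_iterator(msg, 'text', 'html')]
```

The multipart container itself is skipped by both calls: its payload is a list rather than a string, and its main type is not text.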
The following function has been added as a useful debugging tool. It should
{not} be considered part of the supported public interface for the package.
_structure(msg[, fp[, level]])~
Prints an indented representation of the content types of the message object
structure. For example:: >
>>> msg = email.message_from_file(somefile)
>>> _structure(msg)
multipart/mixed
text/plain
text/plain
multipart/digest
message/rfc822
text/plain
message/rfc822
text/plain
message/rfc822
text/plain
message/rfc822
text/plain
message/rfc822
text/plain
text/plain
<
Optional {fp} is a file-like object to print the output to. It must be suitable
for Python's extended print statement. {level} is used internally.
==============================================================================
*py2stdlib-email.message*
email.message~
:synopsis: The base class representing email messages.
The central class in the email (|py2stdlib-email|) package is the Message class,
imported from the email.message (|py2stdlib-email.message|) module. It is the base class for the
email (|py2stdlib-email|) object model. Message provides the core functionality for
setting and querying header fields, and for accessing message bodies.
Conceptually, a Message object consists of {headers} and {payloads}.
Headers are RFC 2822 style field names and values where the field name and
value are separated by a colon. The colon is not part of either the field name
or the field value.
Headers are stored and returned in case-preserving form but are matched
case-insensitively. There may also be a single envelope header, also known as
the {Unix-From} header or the ``From_`` header. The payload is either a string
in the case of simple message objects or a list of Message objects for
MIME container documents (e.g. multipart/\* and
message/rfc822).
Message objects provide a mapping style interface for accessing the
message headers, and an explicit interface for accessing both the headers and
the payload. It provides convenience methods for generating a flat text
representation of the message object tree, for accessing commonly used header
parameters, and for recursively walking over the object tree.
Here are the methods of the Message class:
Message()~
The constructor takes no arguments.
as_string([unixfrom])~
Return the entire message flattened as a string. When optional {unixfrom}
is ``True``, the envelope header is included in the returned string.
{unixfrom} defaults to ``False``. Flattening the message may trigger
changes to the Message if defaults need to be filled in to
complete the transformation to a string (for example, MIME boundaries may
be generated or modified).
Note that this method is provided as a convenience and may not always
format the message the way you want. For example, by default it mangles
lines that begin with ``From``. For more flexibility, instantiate a
email.generator.Generator instance and use its flatten
method directly. For example:: >
from cStringIO import StringIO
from email.generator import Generator
fp = StringIO()
g = Generator(fp, mangle_from_=False, maxheaderlen=60)
g.flatten(msg)
text = fp.getvalue()
<
__str__()~
Equivalent to ``as_string(unixfrom=True)``.
is_multipart()~
Return ``True`` if the message's payload is a list of sub-Message
objects, otherwise return ``False``. When
is_multipart returns False, the payload should be a string object.
set_unixfrom(unixfrom)~
Set the message's envelope header to {unixfrom}, which should be a string.
get_unixfrom()~
Return the message's envelope header. Defaults to ``None`` if the
envelope header was never set.
attach(payload)~
Add the given {payload} to the current payload, which must be ``None`` or
a list of Message objects before the call. After the call, the
payload will always be a list of Message objects. If you want to
set the payload to a scalar object (e.g. a string), use
set_payload instead.
get_payload([i[, decode]])~
Return the current payload, which will be a list of
Message objects when is_multipart is ``True``, or a
string when is_multipart is ``False``. If the payload is a list
and you mutate the list object, you modify the message's payload in place.
With optional argument {i}, get_payload will return the {i}-th
element of the payload, counting from zero, if is_multipart is
``True``. An IndexError will be raised if {i} is less than 0 or
greater than or equal to the number of items in the payload. If the
payload is a string (i.e. is_multipart is ``False``) and {i} is
given, a TypeError is raised.
Optional {decode} is a flag indicating whether the payload should be
decoded or not, according to the Content-Transfer-Encoding
header. When ``True`` and the message is not a multipart, the payload will
be decoded if this header's value is ``quoted-printable`` or ``base64``.
If some other encoding is used, or Content-Transfer-Encoding
header is missing, or if the payload has bogus base64 data, the payload is
returned as-is (undecoded). If the message is a multipart and the
{decode} flag is ``True``, then ``None`` is returned. The default for
{decode} is ``False``.
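The rules for {i} and {decode} can be sketched with one simple message and one container. The base64 text below encodes the string ``hello world``; all names are illustrative.

```python
from email.message import Message

# A simple (non-multipart) message with a base64-encoded body.
leaf = Message()
leaf['Content-Transfer-Encoding'] = 'base64'
leaf.set_payload('aGVsbG8gd29ybGQ=')

raw_body = leaf.get_payload()                 # returned as stored
decoded_body = leaf.get_payload(decode=True)  # transfer encoding undone

# A container built with attach(); its payload is a list of Messages.
container = Message()
container['Content-Type'] = 'multipart/mixed'
container.attach(leaf)

first_part = container.get_payload(0)
container_decoded = container.get_payload(decode=True)  # None for multiparts

# Indexing a string payload is an error.
try:
    leaf.get_payload(0)
except TypeError:
    string_index_rejected = True
else:
    string_index_rejected = False
```

Note that the decoded body is a byte string, so any character set conversion is left to the caller.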
set_payload(payload[, charset])~
Set the entire message object's payload to {payload}. It is the client's
responsibility to ensure the payload invariants. Optional {charset} sets
the message's default character set; see set_charset for details.
.. versionchanged:: 2.2.2
{charset} argument added.
set_charset(charset)~
Set the character set of the payload to {charset}, which can either be an
email.charset.Charset instance (see email.charset (|py2stdlib-email.charset|)), a
string naming a character set, or ``None``. If it is a string, it will
be converted to an email.charset.Charset instance. If {charset}
is ``None``, the ``charset`` parameter will be removed from the
Content-Type header. Anything else will generate a
TypeError.
The message will be assumed to be of type text/\*, with the
payload either in unicode or encoded with {charset.input_charset}.
It will be encoded or converted to {charset.output_charset}
and transfer encoded properly, if needed, when generating the plain text
representation of the message. MIME headers (MIME-Version,
Content-Type, Content-Transfer-Encoding) will
be added as needed.
.. versionadded:: 2.2.2
get_charset()~
Return the email.charset.Charset instance associated with the
message's payload.
.. versionadded:: 2.2.2
The following methods implement a mapping-like interface for accessing the
message's RFC 2822 headers. Note that there are some semantic differences
between these methods and a normal mapping (i.e. dictionary) interface. For
example, in a dictionary there are no duplicate keys, but here there may be
duplicate message headers. Also, in dictionaries there is no guaranteed
order to the keys returned by keys, but in a Message object,
headers are always returned in the order they appeared in the original
message, or were added to the message later. Any header deleted and then
re-added is always appended to the end of the header list.
These semantic differences are intentional and are biased toward maximal
convenience.
Note that in all cases, any envelope header present in the message is not
included in the mapping interface.
__len__()~
Return the total number of headers, including duplicates.
__contains__(name)~
Return true if the message object has a field named {name}. Matching is
done case-insensitively and {name} should not include the trailing colon.
Used for the ``in`` operator, e.g.:: >
if 'message-id' in myMessage:
print 'Message-ID:', myMessage['message-id']
<
__getitem__(name)~
Return the value of the named header field. {name} should not include the
colon field separator. If the header is missing, ``None`` is returned; a
KeyError is never raised.
Note that if the named field appears more than once in the message's
headers, exactly which of those field values will be returned is
undefined. Use the get_all method to get the values of all the
extant named headers.
__setitem__(name, val)~
Add a header to the message with field name {name} and value {val}. The
field is appended to the end of the message's existing fields.
Note that this does {not} overwrite or delete any existing header with the same
name. If you want to ensure that the new header is the only one present in the
message with field name {name}, delete the field first, e.g.:: >
del msg['subject']
msg['subject'] = 'Python roolz!'
<
__delitem__(name)~
Delete all occurrences of the field with name {name} from the message's
headers. No exception is raised if the named field isn't present in the headers.
has_key(name)~
Return true if the message contains a header field named {name}, otherwise
return false.
keys()~
Return a list of all the message's header field names.
values()~
Return a list of all the message's field values.
items()~
Return a list of 2-tuples containing all the message's field headers and
values.
get(name[, failobj])~
Return the value of the named header field. This is identical to
__getitem__ except that optional {failobj} is returned if the
named header is missing (defaults to ``None``).
Here are some additional useful methods:
get_all(name[, failobj])~
Return a list of all the values for the field named {name}. If there are
no such named headers in the message, {failobj} is returned (defaults to
``None``).
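The differences from a plain dictionary are easiest to see in a short session. The following is only a sketch; the header values are placeholders.

```python
from email.message import Message

msg = Message()
msg['Received'] = 'from a.example'
msg['Received'] = 'from b.example'   # appended, not overwritten
msg['Subject'] = 'first'

header_count = len(msg)                       # duplicates are counted
all_received = msg.get_all('received')        # case-insensitive lookup
missing = msg['X-No-Such-Header']             # None, no KeyError
fallback = msg.get('X-No-Such-Header', 'n/a')

# Delete-then-set keeps a single Subject header.
del msg['subject']
msg['subject'] = 'second'
subjects = msg.get_all('Subject')

# Deleting removes every occurrence of the field.
del msg['Received']
has_received = 'received' in msg
```

When a field may legitimately occur more than once, get_all is the safer choice, because item access returns only one of the values.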
add_header(_name, _value, **_params)~
Extended header setting. This method is similar to __setitem__
except that additional header parameters can be provided as keyword
arguments. {_name} is the header field to add and {_value} is the
{primary} value for the header.
For each item in the keyword argument dictionary {_params}, the key is
taken as the parameter name, with underscores converted to dashes (since
dashes are illegal in Python identifiers). Normally, the parameter will
be added as ``key="value"`` unless the value is ``None``, in which case
only the key will be added.
Here's an example:: >
msg.add_header('Content-Disposition', 'attachment', filename='bud.gif')
<
This will add a header that looks like:: >
Content-Disposition: attachment; filename="bud.gif"
<
replace_header(_name, _value)~
Replace a header. Replace the first header found in the message that
matches {_name}, retaining header order and field name case. If no
matching header was found, a KeyError is raised.
.. versionadded:: 2.2.2
get_content_type()~
Return the message's content type. The returned string is coerced to
lower case of the form maintype/subtype. If there was no
Content-Type header in the message the default type as given
by get_default_type will be returned. Since according to
RFC 2045, messages always have a default type, get_content_type
will always return a value.
RFC 2045 defines a message's default type to be text/plain
unless it appears inside a multipart/digest container, in
which case it would be message/rfc822. If the
Content-Type header has an invalid type specification,
RFC 2045 mandates that the default type be text/plain.
.. versionadded:: 2.2.2
get_content_maintype()~
Return the message's main content type. This is the maintype
part of the string returned by get_content_type.
.. versionadded:: 2.2.2
get_content_subtype()~
Return the message's sub-content type. This is the subtype
part of the string returned by get_content_type.
.. versionadded:: 2.2.2
get_default_type()~
Return the default content type. Most messages have a default content
type of text/plain, except for messages that are subparts of
multipart/digest containers. Such subparts have a default
content type of message/rfc822.
.. versionadded:: 2.2.2
set_default_type(ctype)~
Set the default content type. {ctype} should either be
text/plain or message/rfc822, although this is not
enforced. The default content type is not stored in the
Content-Type header.
.. versionadded:: 2.2.2
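The defaulting rules of the content type accessors can be checked with a few throwaway messages, as sketched here. The header values are arbitrary examples.

```python
from email.message import Message

bare = Message()
default_type = bare.get_content_type()        # no Content-Type header

# A subpart of a multipart/digest would carry this default instead.
bare.set_default_type('message/rfc822')
digest_default = bare.get_content_type()

typed = Message()
typed['Content-Type'] = 'Text/HTML; charset="utf-8"'
full_type = typed.get_content_type()          # lower-cased, no parameters
main_type = typed.get_content_maintype()
sub_type = typed.get_content_subtype()

broken = Message()
broken['Content-Type'] = 'not-a-valid-type'
broken_type = broken.get_content_type()       # falls back to text/plain
```

Changing the default type does not add a Content-Type header; ``bare`` still has none after the call to set_default_type.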
get_params([failobj[, header[, unquote]]])~
Return the message's Content-Type parameters, as a list.
The elements of the returned list are 2-tuples of key/value pairs, as
split on the ``'='`` sign. The left hand side of the ``'='`` is the key,
while the right hand side is the value. If there is no ``'='`` sign in
the parameter the value is the empty string, otherwise the value is as
described in get_param and is unquoted if optional {unquote} is
``True`` (the default).
Optional {failobj} is the object to return if there is no
Content-Type header. Optional {header} is the header to
search instead of Content-Type.
.. versionchanged:: 2.2.2
{unquote} argument added.
get_param(param[, failobj[, header[, unquote]]])~
Return the value of the Content-Type header's parameter
{param} as a string. If the message has no Content-Type
header or if there is no such parameter, then {failobj} is returned
(defaults to ``None``).
Optional {header} if given, specifies the message header to use instead of
Content-Type.
Parameter keys are always compared case insensitively. The return value
can either be a string, or a 3-tuple if the parameter was RFC 2231
encoded. When it's a 3-tuple, the elements of the value are of the form
``(CHARSET, LANGUAGE, VALUE)``. Note that both ``CHARSET`` and
``LANGUAGE`` can be ``None``, in which case you should consider ``VALUE``
to be encoded in the ``us-ascii`` charset. You can usually ignore
``LANGUAGE``.
If your application doesn't care whether the parameter was encoded as in
RFC 2231, you can collapse the parameter value by calling
email.utils.collapse_rfc2231_value, passing in the return value
from get_param. This will return a suitably decoded Unicode
string when the value is a tuple, or the original string unquoted if it
isn't. For example:: >
rawparam = msg.get_param('foo')
param = email.utils.collapse_rfc2231_value(rawparam)
<
In any case, the parameter value (either the returned string, or the
``VALUE`` item in the 3-tuple) is always unquoted, unless {unquote} is set
to ``False``.
.. versionchanged:: 2.2.2
{unquote} argument added, and 3-tuple return value possible.
set_param(param, value[, header[, requote[, charset[, language]]]])~
Set a parameter in the Content-Type header. If the
parameter already exists in the header, its value will be replaced with
{value}. If the Content-Type header has not yet been defined
for this message, it will be set to text/plain and the new
parameter value will be appended as per RFC 2045.
Optional {header} specifies an alternative header to
Content-Type, and all parameters will be quoted as necessary
unless optional {requote} is ``False`` (the default is ``True``).
If optional {charset} is specified, the parameter will be encoded
according to RFC 2231. Optional {language} specifies the RFC 2231
language, defaulting to the empty string. Both {charset} and {language}
should be strings.
.. versionadded:: 2.2.2
del_param(param[, header[, requote]])~
Remove the given parameter completely from the Content-Type
header. The header will be re-written in place without the parameter or
its value. All values will be quoted as necessary unless {requote} is
``False`` (the default is ``True``). Optional {header} specifies an
alternative to Content-Type.
.. versionadded:: 2.2.2
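The parameter methods described above (get_params, get_param, set_param and del_param) can be exercised together, as in this sketch. The Content-Type value is an arbitrary example.

```python
from email.message import Message

msg = Message()
msg['Content-Type'] = 'text/plain; charset="utf-8"; format=flowed'

# Reading parameters; values are unquoted unless unquote is False.
params = msg.get_params()
charset = msg.get_param('charset')
quoted_charset = msg.get_param('charset', unquote=False)
absent = msg.get_param('boundary', 'none')   # failobj for a missing key

# Replacing an existing parameter value in place.
msg.set_param('format', 'fixed')
new_format = msg.get_param('format')

# Removing the parameter again.
msg.del_param('format')
after_delete = msg.get_param('format')
```

The first item returned by get_params is the content type itself, paired with an empty string because it contains no ``'='`` sign.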
set_type(type[, header][, requote])~
Set the main type and subtype for the Content-Type
header. {type} must be a string in the form maintype/subtype,
otherwise a ValueError is raised.
This method replaces the Content-Type header, keeping all
the parameters in place. If {requote} is ``False``, this leaves the
existing header's quoting as is, otherwise the parameters will be quoted
(the default).
An alternative header can be specified in the {header} argument. When the
Content-Type header is set a MIME-Version
header is also added.
.. versionadded:: 2.2.2
get_filename([failobj])~
Return the value of the ``filename`` parameter of the
Content-Disposition header of the message. If the header
does not have a ``filename`` parameter, this method falls back to looking
for the ``name`` parameter on the Content-Type header. If
neither is found, or the header is missing, then {failobj} is returned.
The returned string will always be unquoted as per
email.utils.unquote.
get_boundary([failobj])~
Return the value of the ``boundary`` parameter of the
Content-Type header of the message, or {failobj} if either
the header is missing, or has no ``boundary`` parameter. The returned
string will always be unquoted as per email.utils.unquote.
set_boundary(boundary)~
Set the ``boundary`` parameter of the Content-Type header to
{boundary}. set_boundary will always quote {boundary} if
necessary. A HeaderParseError is raised if the message object has
no Content-Type header.
Note that using this method is subtly different from deleting the old
Content-Type header and adding a new one with the new
boundary via add_header, because set_boundary preserves
the order of the Content-Type header in the list of
headers. However, it does {not} preserve any continuation lines which may
have been present in the original Content-Type header.
get_content_charset([failobj])~
Return the ``charset`` parameter of the Content-Type header,
coerced to lower case. If there is no Content-Type header, or if
that header has no ``charset`` parameter, {failobj} is returned.
Note that this method differs from get_charset which returns the
email.charset.Charset instance for the default encoding of the message body.
.. versionadded:: 2.2.2
get_charsets([failobj])~
Return a list containing the character set names in the message. If the
message is a multipart, then the list will contain one element
for each subpart in the payload, otherwise, it will be a list of length 1.
Each item in the list will be a string which is the value of the
``charset`` parameter in the Content-Type header for the
represented subpart. However, if the subpart has no
Content-Type header, no ``charset`` parameter, or is not of
the text main MIME type, then that item in the returned list
will be {failobj}.
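The two charset accessors can be compared with a short sketch. The messages are built by hand purely for illustration, and the failobj values ``'unknown'`` and ``'none'`` are arbitrary:

```python
from email.message import Message
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# get_content_charset coerces the parameter value to lower case.
single = Message()
single['Content-Type'] = 'text/plain; charset="ISO-8859-1"'
lowered = single.get_content_charset()

# No Content-Type header at all: failobj is returned.
plain = Message()
missing = plain.get_content_charset('unknown')

# get_charsets collects one entry per part it visits; parts without
# a charset parameter contribute failobj.
outer = MIMEMultipart()
outer.attach(MIMEText('hello', 'plain', 'utf-8'))
charsets = outer.get_charsets('none')
```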
walk()~
The walk method is an all-purpose generator which can be used to
iterate over all the parts and subparts of a message object tree, in
depth-first traversal order. You will typically use walk as the
iterator in a ``for`` loop; each iteration returns the next subpart.
Here's an example that prints the MIME type of every part of a multipart
message structure:: >
>>> for part in msg.walk():
...     print part.get_content_type()
multipart/report
text/plain
message/delivery-status
text/plain
text/plain
message/rfc822
<
.. versionchanged:: 2.5
The previously deprecated methods get_type, get_main_type, and
get_subtype were removed.
Message objects can also optionally contain two instance attributes,
which can be used when generating the plain text of a MIME message.
preamble~
The format of a MIME document allows for some text between the blank line
following the headers, and the first multipart boundary string. Normally,
this text is never visible in a MIME-aware mail reader because it falls
outside the standard MIME armor. However, when viewing the raw text of
the message, or when viewing the message in a non-MIME aware reader, this
text can become visible.
The {preamble} attribute contains this leading extra-armor text for MIME
documents. When the email.parser.Parser discovers some text
after the headers but before the first boundary string, it assigns this
text to the message's {preamble} attribute. When the
email.generator.Generator is writing out the plain text
representation of a MIME message, and it finds the
message has a {preamble} attribute, it will write this text in the area
between the headers and the first boundary. See email.parser (|py2stdlib-email.parser|) and
email.generator (|py2stdlib-email.generator|) for details.
Note that if the message object has no preamble, the {preamble} attribute
will be ``None``.
epilogue~
The {epilogue} attribute acts the same way as the {preamble} attribute,
except that it contains text that appears between the last boundary and
the end of the message.
.. versionchanged:: 2.5
You do not need to set the epilogue to the empty string in order for the
Generator to print a newline at the end of the file.
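To see where the two attributes come from, here is a sketch that parses a hand-written multipart message. The message text and the boundary ``XX`` are invented for the example:

```python
import email

raw = ('MIME-Version: 1.0\n'
       'Content-Type: multipart/mixed; boundary="XX"\n'
       '\n'
       'This is the preamble.\n'
       '--XX\n'
       'Content-Type: text/plain\n'
       '\n'
       'body\n'
       '--XX--\n'
       'This is the epilogue.\n')

msg = email.message_from_string(raw)
preamble = msg.preamble     # text between the headers and the first boundary
epilogue = msg.epilogue     # text after the closing boundary
subparts = msg.get_payload()
```

Neither string is part of any subpart; both live only on the container message.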
defects~
The {defects} attribute contains a list of all the problems found when
parsing this message. See email.errors (|py2stdlib-email.errors|) for a detailed description
of the possible parsing defects.
.. versionadded:: 2.4
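A sketch of inspecting {defects}, assuming a deliberately broken message that claims to be multipart but carries no ``boundary`` parameter:

```python
import email

raw = ('Content-Type: multipart/mixed\n'   # no boundary parameter
       '\n'
       'not really multipart\n')

# Parsing does not raise; problems are recorded on the message instead.
msg = email.message_from_string(raw)
defect_names = [type(defect).__name__ for defect in msg.defects]
```

Checking the list after parsing is a cheap way to decide whether a message can be trusted to have the structure its headers advertise.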
==============================================================================
*py2stdlib-email.mime*
email.mime~
:synopsis: Build MIME messages.
Ordinarily, you get a message object structure by passing a file or some text to
a parser, which parses the text and returns the root message object. However
you can also build a complete message structure from scratch, or even individual
email.message.Message objects by hand. In fact, you can also take an
existing structure and add new email.message.Message objects, move them
around, etc. This makes a very convenient interface for slicing-and-dicing MIME
messages.
You can create a new object structure by creating email.message.Message
instances, adding attachments and all the appropriate headers manually. For MIME
messages though, the email (|py2stdlib-email|) package provides some convenient subclasses to
make things easier.
Here are the classes:
.. currentmodule:: email.mime.base
MIMEBase(_maintype, _subtype, **_params)~
Module: email.mime.base
This is the base class for all the MIME-specific subclasses of
email.message.Message. Ordinarily you won't create instances
specifically of MIMEBase, although you could. MIMEBase
is provided primarily as a convenient base class for more specific
MIME-aware subclasses.
{_maintype} is the Content-Type major type (e.g. text
or image), and {_subtype} is the Content-Type minor
type (e.g. plain or gif). {_params} is a parameter
key/value dictionary and is passed directly to Message.add_header.
The MIMEBase class always adds a Content-Type header
(based on {_maintype}, {_subtype}, and {_params}), and a
MIME-Version header (always set to ``1.0``).
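Although MIMEBase is rarely instantiated directly, doing so shows what the constructor sets up. The subtype ``x-example`` and the ``name`` parameter are illustrative assumptions:

```python
from email.mime.base import MIMEBase

# Keyword arguments become Content-Type parameters.
part = MIMEBase('application', 'x-example', name='data.bin')

content_type = part.get_content_type()
name_param = part.get_param('name')
mime_version = part['MIME-Version']
```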
.. currentmodule:: email.mime.nonmultipart
MIMENonMultipart()~
Module: email.mime.nonmultipart
A subclass of email.mime.base.MIMEBase, this is an intermediate base
class for MIME messages that are not multipart. The primary
purpose of this class is to prevent the use of the attach method,
which only makes sense for multipart messages. If attach
is called, an email.errors.MultipartConversionError exception is raised.
.. versionadded:: 2.2.2
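The guard can be observed on any non-multipart subclass; this sketch uses MIMEText with placeholder text:

```python
from email import errors
from email.mime.text import MIMEText

leaf = MIMEText('just text')

# Attaching a subpart to a non-multipart message is refused.
try:
    leaf.attach(MIMEText('another part'))
    refused = False
except errors.MultipartConversionError:
    refused = True
```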
.. currentmodule:: email.mime.multipart
MIMEMultipart([_subtype[, boundary[, _subparts[, _params]]]])~
Module: email.mime.multipart
A subclass of email.mime.base.MIMEBase, this is an intermediate base
class for MIME messages that are multipart. Optional {_subtype}
defaults to mixed, but can be used to specify the subtype of the
message. A Content-Type header of multipart/_subtype
will be added to the message object. A MIME-Version header will
also be added.
Optional {boundary} is the multipart boundary string. When ``None`` (the
default), the boundary is calculated when needed (for example, when the
message is serialized).
{_subparts} is a sequence of initial subparts for the payload. It must be
possible to convert this sequence to a list. You can always attach new subparts
to the message by using the Message.attach method.
Additional parameters for the Content-Type header are taken from
the keyword arguments, or passed into the {_params} argument, which is a keyword
dictionary.
.. versionadded:: 2.2.2
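A short sketch of building a multipart/alternative message from initial subparts. The subject and body text are placeholders:

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

parts = [MIMEText('Hello', 'plain'), MIMEText('<p>Hello</p>', 'html')]
msg = MIMEMultipart('alternative', _subparts=parts)
msg['Subject'] = 'Greeting'

# No boundary was given, so none exists until the message is serialized.
boundary_before = msg.get_boundary()
flat = msg.as_string()
boundary_after = msg.get_boundary()
```

Passing an explicit {boundary} is only needed when the exact string matters, for example in tests that compare serialized output.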
.. currentmodule:: email.mime.application
MIMEApplication(_data[, _subtype[, _encoder[, **_params]]])~
Module: email.mime.application
A subclass of email.mime.nonmultipart.MIMENonMultipart, the
MIMEApplication class is used to represent MIME message objects of
major type application. {_data} is a string containing the raw
byte data. Optional {_subtype} specifies the MIME subtype and defaults to
octet-stream.
Optional {_encoder} is a callable (i.e. function) which will perform the actual
encoding of the data for transport. This callable takes one argument, which is
the MIMEApplication instance. It should use get_payload and
set_payload to change the payload to encoded form. It should also add
any Content-Transfer-Encoding or other headers to the message
object as necessary. The default encoding is base64. See the
email.encoders (|py2stdlib-email.encoders|) module for a list of the built-in encoders.
{_params} are passed straight through to the base class constructor.
.. versionadded:: 2.5
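As a sketch of the default behaviour (default subtype and default base64 encoder), using a few arbitrary bytes as the payload:

```python
from email.mime.application import MIMEApplication

raw = b'\x00\x01\x02 binary payload'
part = MIMEApplication(raw)

content_type = part.get_content_type()
transfer_encoding = part['Content-Transfer-Encoding']

# decode=True reverses the transfer encoding and yields the raw bytes.
round_trip = part.get_payload(decode=True)
```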
.. currentmodule:: email.mime.audio
MIMEAudio(_audiodata[, _subtype[, _encoder[, **_params]]])~
Module: email.mime.audio
A subclass of email.mime.nonmultipart.MIMENonMultipart, the
MIMEAudio class is used to create MIME message objects of major type
audio. {_audiodata} is a string containing the raw audio data. If
this data can be decoded by the standard Python module sndhdr (|py2stdlib-sndhdr|), then the
subtype will be automatically included in the Content-Type header.
Otherwise you can explicitly specify the audio subtype via the {_subtype}
parameter. If the minor type could not be guessed and {_subtype} was not given,
then TypeError is raised.
Optional {_encoder} is a callable (i.e. function) which will perform the actual
encoding of the audio data for transport. This callable takes one argument,
which is the MIMEAudio instance. It should use get_payload and
set_payload to change the payload to encoded form. It should also add
any Content-Transfer-Encoding or other headers to the message
object as necessary. The default encoding is base64. See the
email.encoders (|py2stdlib-email.encoders|) module for a list of the built-in encoders.
{_params} are passed straight through to the base class constructor.
.. currentmodule:: email.mime.image
MIMEImage(_imagedata[, _subtype[, _encoder[, **_params]]])~
Module: email.mime.image
A subclass of email.mime.nonmultipart.MIMENonMultipart, the
MIMEImage class is used to create MIME message objects of major type
image. {_imagedata} is a string containing the raw image data. If
this data can be decoded by the standard Python module imghdr (|py2stdlib-imghdr|), then the
subtype will be automatically included in the Content-Type header.
Otherwise you can explicitly specify the image subtype via the {_subtype}
parameter. If the minor type could not be guessed and {_subtype} was not given,
then TypeError is raised.
Optional {_encoder} is a callable (i.e. function) which will perform the actual
encoding of the image data for transport. This callable takes one argument,
which is the MIMEImage instance. It should use get_payload and
set_payload to change the payload to encoded form. It should also add
any Content-Transfer-Encoding or other headers to the message
object as necessary. The default encoding is base64. See the
email.encoders (|py2stdlib-email.encoders|) module for a list of the built-in encoders.
{_params} are passed straight through to the email.mime.base.MIMEBase
constructor.
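The three outcomes for the image subtype can be sketched as follows. The byte strings are stand-ins, not valid images; the first one merely starts with a GIF signature, which is assumed to be enough for subtype detection:

```python
from email.mime.image import MIMEImage

# Subtype guessed from the leading bytes of the data.
gif_like = b'GIF89a' + b'\x00' * 16
guessed = MIMEImage(gif_like)

# Subtype given explicitly, so no guessing is attempted.
explicit = MIMEImage(b'opaque bytes', _subtype='png')

# Unrecognizable data and no _subtype: TypeError.
try:
    MIMEImage(b'opaque bytes')
    guess_failed = False
except TypeError:
    guess_failed = True
```

MIMEAudio follows the same pattern for audio data and the {_subtype} argument.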
.. currentmodule:: email.mime.message
MIMEMessage(_msg[, _subtype])~
Module: email.mime.message
A subclass of email.mime.nonmultipart.MIMENonMultipart, the
MIMEMessage class is used to create MIME objects of main type
message. {_msg} is used as the payload, and must be an instance
of class email.message.Message (or a subclass thereof), otherwise
a TypeError is raised.
Optional {_subtype} sets the subtype of the message; it defaults to
rfc822.
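A sketch of wrapping one message inside another, with placeholder header and body text:

```python
from email.message import Message
from email.mime.message import MIMEMessage

inner = Message()
inner['Subject'] = 'Original message'
inner.set_payload('Original body\n')

# The wrapped message becomes the single element of the payload list.
wrapper = MIMEMessage(inner)

# Anything that is not a Message instance is rejected.
try:
    MIMEMessage('not a Message instance')
    type_checked = False
except TypeError:
    type_checked = True
```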
.. currentmodule:: email.mime.text
MIMEText(_text[, _subtype[, _charset]])~
Module: email.mime.text
A subclass of email.mime.nonmultipart.MIMENonMultipart, the
MIMEText class is used to create MIME objects of major type
text. {_text} is the string for the payload. {_subtype} is the
minor type and defaults to plain. {_charset} is the character
set of the text and is passed as a parameter to the
email.mime.nonmultipart.MIMENonMultipart constructor; it defaults
to ``us-ascii``. If {_text} is unicode, it is encoded using the
{output_charset} of {_charset}, otherwise it is used as-is.
.. versionchanged:: 2.4
The previously deprecated {_encoding} argument has been removed. Content
Transfer Encoding now happens implicitly based on the {_charset}
argument.
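The effect of {_charset} on the headers can be sketched like this. The sample strings are arbitrary, and the ``u''`` literal keeps the snippet usable on both Python 2.7 and Python 3:

```python
from email.mime.text import MIMEText

# Default subtype and charset.
ascii_part = MIMEText('Hello, world')

# A non-ASCII unicode string with an explicit charset.
utf8_part = MIMEText(u'caf\xe9', 'plain', 'utf-8')

ascii_charset = ascii_part.get_content_charset()
utf8_encoding = utf8_part['Content-Transfer-Encoding']
decoded_text = utf8_part.get_payload(decode=True).decode('utf-8')
```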
==============================================================================
*py2stdlib-email.parser*
email.parser~
:synopsis: Parse flat text email messages to produce a message object structure.
Message object structures can be created in one of two ways: they can be created
from whole cloth by instantiating email.message.Message objects and
stringing them together via attach and set_payload calls, or they
can be created by parsing a flat text representation of the email message.
The email (|py2stdlib-email|) package provides a standard parser that understands most email
document structures, including MIME documents. You can pass the parser a string
or a file object, and the parser will return to you the root
email.message.Message instance of the object structure. For simple,
non-MIME messages the payload of this root object will likely be a string
containing the text of the message. For MIME messages, the root object will
return ``True`` from its is_multipart method, and the subparts can be
accessed via the get_payload and walk methods.
There are actually two parser interfaces available for use, the classic
Parser API and the incremental FeedParser API. The classic
Parser API is fine if you have the entire text of the message in memory
as a string, or if the entire message lives in a file on the file system.
FeedParser is more appropriate for when you're reading the message from
a stream which might block waiting for more input (e.g. reading an email message
from a socket). The FeedParser can consume and parse the message
incrementally, and only returns the root object when you close the parser [#]_.
Note that the parser can be extended in limited ways, and of course you can
implement your own parser completely from scratch. There is no magical
connection between the email (|py2stdlib-email|) package's bundled parser and the
email.message.Message class, so your custom parser can create message
object trees any way it finds necessary.
FeedParser API
^^^^^^^^^^^^^^
.. versionadded:: 2.4
The FeedParser, imported from the email.feedparser module,
provides an API that is conducive to incremental parsing of email messages, such
as would be necessary when reading the text of an email message from a source
that can block (e.g. a socket). The FeedParser can of course be used
to parse an email message fully contained in a string or a file, but the classic
Parser API may be more convenient for such use cases. The semantics
and results of the two parser APIs are identical.
The FeedParser's API is simple; you create an instance, feed it a bunch
of text until there's no more to feed it, then close the parser to retrieve the
root message object. The FeedParser is extremely accurate when parsing
standards-compliant messages, and it does a very good job of parsing
non-compliant messages, providing information about how a message was deemed
broken. It will populate a message object's {defects} attribute with a list of
any problems it found in a message. See the email.errors (|py2stdlib-email.errors|) module for the
list of defects that it can find.
Here is the API for the FeedParser:
FeedParser([_factory])~
Create a FeedParser instance. Optional {_factory} is a no-argument
callable that will be called whenever a new message object is needed. It
defaults to the email.message.Message class.
feed(data)~
Feed the FeedParser some more data. {data} should be a string
containing one or more lines. The lines can be partial and the
FeedParser will stitch such partial lines together properly. The
lines in the string can have any of the common three line endings,
carriage return, newline, or carriage return and newline (they can even be
mixed).
close()~
Closing a FeedParser completes the parsing of all previously fed
data, and returns the root message object. It is undefined what happens
if you feed more data to a closed FeedParser.
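The feed/close cycle can be sketched with a list of chunks standing in for data arriving from a socket. The chunk boundaries are chosen to fall in the middle of lines on purpose:

```python
from email.feedparser import FeedParser

chunks = ['Subject: Hel',
          'lo\nTo: someone@example.com\n',
          '\nBody line one\nBody ',
          'line two\n']

parser = FeedParser()
for chunk in chunks:
    parser.feed(chunk)      # partial lines are stitched together

# Only close() returns the root message object.
msg = parser.close()
subject = msg['Subject']
body = msg.get_payload()
```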
Parser class API
^^^^^^^^^^^^^^^^
The Parser class, imported from the email.parser (|py2stdlib-email.parser|) module,
provides an API that can be used to parse a message when the complete contents
of the message are available in a string or file. The email.parser (|py2stdlib-email.parser|)
module also provides a second class, called HeaderParser which can be
used if you're only interested in the headers of the message.
HeaderParser can be much faster in these situations, since it does not
attempt to parse the message body, instead setting the payload to the raw body
as a string. HeaderParser has the same API as the Parser
class.
Parser([_class])~
The constructor for the Parser class takes an optional argument
{_class}. This must be a callable factory (such as a function or a class), and
it is used whenever a sub-message object needs to be created. It defaults to
email.message.Message (see email.message (|py2stdlib-email.message|)). The factory will
be called without arguments.
The optional {strict} flag is ignored.
.. deprecated:: 2.4
Because the Parser class is a backward compatible API wrapper
around the new-in-Python 2.4 FeedParser, {all} parsing is
effectively non-strict. You should simply stop passing a {strict} flag to
the Parser constructor.
.. versionchanged:: 2.2.2
The {strict} flag was added.
.. versionchanged:: 2.4
The {strict} flag was deprecated.
The other public Parser methods are:
parse(fp[, headersonly])~
Read all the data from the file-like object {fp}, parse the resulting
text, and return the root message object. {fp} must support both the
readline and the read methods on file-like objects.
The text contained in {fp} must be formatted as a block of RFC 2822
style headers and header continuation lines, optionally preceded by an
envelope header. The header block is terminated either by the end of the
data or by a blank line. Following the header block is the body of the
message (which may contain MIME-encoded subparts).
Optional {headersonly} is as with the parsestr method.
.. versionchanged:: 2.2.2
The {headersonly} flag was added.
parsestr(text[, headersonly])~
Similar to the parse method, except it takes a string object
instead of a file-like object. Calling this method on a string is exactly
equivalent to wrapping {text} in a StringIO (|py2stdlib-stringio|) instance first and
calling parse.
Optional {headersonly} is a flag specifying whether to stop parsing after
reading the headers or not. The default is ``False``, meaning it parses
the entire contents of the file.
.. versionchanged:: 2.2.2
The {headersonly} flag was added.
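A sketch comparing a full parse with a headers-only parse of the same text. The message and its boundary string are invented for the example:

```python
from email.parser import HeaderParser, Parser

raw = ('MIME-Version: 1.0\n'
       'Content-Type: multipart/mixed; boundary="BOUNDARY"\n'
       'Subject: Report\n'
       '\n'
       '--BOUNDARY\n'
       'Content-Type: text/plain\n'
       '\n'
       'body text\n'
       '--BOUNDARY--\n')

# Full parse: the body is split into subparts.
full = Parser().parsestr(raw)

# Headers only: the body is kept as one raw string.
quick = Parser().parsestr(raw, headersonly=True)
also_quick = HeaderParser().parsestr(raw)
```

All three objects expose the same headers; they differ only in how the body is represented.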
Since creating a message object structure from a string or a file object is such
a common task, two functions are provided as a convenience. They are available
in the top-level email (|py2stdlib-email|) package namespace.
.. currentmodule:: email
message_from_string(s[, _class[, strict]])~
Return a message object structure from a string. This is exactly equivalent to
``Parser().parsestr(s)``. Optional {_class} and {strict} are interpreted as
with the Parser class constructor.
.. versionchanged:: 2.2.2
The {strict} flag was added.
message_from_file(fp[, _class[, strict]])~
Return a message object structure tree from an open file object. This is
exactly equivalent to ``Parser().parse(fp)``. Optional {_class} and {strict}
are interpreted as with the Parser class constructor.
.. versionchanged:: 2.2.2
The {strict} flag was added.
Here's an example of how you might use this at an interactive Python prompt:: >
>>> import email
>>> msg = email.message_from_string(myString)
<
Additional notes
^^^^^^^^^^^^^^^^
Here are some notes on the parsing semantics:
* Most non-multipart type messages are parsed as a single message
object with a string payload. These objects will return ``False`` for
is_multipart. Their get_payload method will return a string
object.
* All multipart type messages will be parsed as a container message
object with a list of sub-message objects for their payload. The outer
container message will return ``True`` for is_multipart and its
get_payload method will return the list of email.message.Message
subparts.
* Most messages with a content type of message/* (e.g.
message/delivery-status and message/rfc822) will also be
parsed as a container object containing a list payload of length 1. Their
is_multipart method will return ``True``. The single element in the
list payload will be a sub-message object.
* Some non-standards compliant messages may not be internally consistent about
their multipart-edness. Such messages may have a
Content-Type header of type multipart, but their
is_multipart method may return ``False``. If such messages were parsed
with the FeedParser, they will have an instance of the
MultipartInvariantViolationDefect class in their {defects} attribute
list. See email.errors (|py2stdlib-email.errors|) for details.
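The first three notes can be checked with a sketch that parses three small hand-written messages (all text here is invented for illustration):

```python
import email

# Non-multipart: string payload.
simple = email.message_from_string('Subject: hi\n\njust text\n')

# Multipart: list of sub-message objects.
container = email.message_from_string(
    'Content-Type: multipart/mixed; boundary="B"\n'
    '\n'
    '--B\n'
    'Content-Type: text/plain\n'
    '\n'
    'part one\n'
    '--B--\n')

# message/rfc822: container with a list payload of length 1.
wrapped = email.message_from_string(
    'Content-Type: message/rfc822\n'
    '\n'
    'Subject: inner\n'
    '\n'
    'inner body\n')
inner = wrapped.get_payload(0)
```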
.. rubric:: Footnotes
.. [#] As of email package version 3.0, introduced in Python 2.4, the classic
Parser was re-implemented in terms of the FeedParser, so the
semantics and results are identical between the two parsers.
==============================================================================
*py2stdlib-email*
email~
:synopsis: Package supporting the parsing, manipulating, and generating of email messages,
including MIME documents.
.. Copyright (C) 2001-2007 Python Software Foundation
.. versionadded:: 2.2
The email (|py2stdlib-email|) package is a library for managing email messages, including
MIME and other RFC 2822-based message documents. It subsumes most of the
functionality in several older standard modules such as rfc822 (|py2stdlib-rfc822|),
mimetools (|py2stdlib-mimetools|), multifile (|py2stdlib-multifile|), and other non-standard packages such as
mimecntl. It is specifically {not} designed to do any sending of email
messages to SMTP (RFC 2821), NNTP, or other servers; those are functions of
modules such as smtplib (|py2stdlib-smtplib|) and nntplib (|py2stdlib-nntplib|). The email (|py2stdlib-email|) package
attempts to be as RFC-compliant as possible, supporting in addition to
RFC 2822, such MIME-related RFCs as RFC 2045, RFC 2046, RFC 2047,
and RFC 2231.
The primary distinguishing feature of the email (|py2stdlib-email|) package is that it splits
the parsing and generating of email messages from the internal {object model}
representation of email. Applications using the email (|py2stdlib-email|) package deal
primarily with objects; you can add sub-objects to messages, remove sub-objects
from messages, completely re-arrange the contents, etc. There is a separate
parser and a separate generator which handles the transformation from flat text
to the object model, and then back to flat text again. There are also handy
subclasses for some common MIME object types, and a few miscellaneous utilities
that help with such common tasks as extracting and parsing message field values,
creating RFC-compliant dates, etc.
The following sections describe the functionality of the email (|py2stdlib-email|) package.
The ordering follows a progression that should be common in applications: an
email message is read as flat text from a file or other source, the text is
parsed to produce the object structure of the email message, this structure is
manipulated, and finally, the object tree is rendered back into flat text.
It is perfectly feasible to create the object structure out of whole cloth ---
i.e. completely from scratch. From there, a similar progression can be taken as
above.
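The progression described above (flat text, object model, manipulation, flat text again) can be sketched in a few lines. The addresses and subject values are placeholders:

```python
import email

flat_in = ('From: sender@example.com\n'
           'Subject: old subject\n'
           '\n'
           'Hello\n')

# 1. Parse the flat text into the object model.
msg = email.message_from_string(flat_in)

# 2. Manipulate the object; assigning to a header appends, so the
#    old value is deleted first.
del msg['Subject']
msg['Subject'] = 'new subject'

# 3. Render the object tree back into flat text.
flat_out = msg.as_string()
```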
Also included are detailed specifications of all the classes and modules that
the email (|py2stdlib-email|) package provides, the exception classes you might encounter
while using the email (|py2stdlib-email|) package, some auxiliary utilities, and a few
examples. For users of the older mimelib package, or previous versions
of the email (|py2stdlib-email|) package, a section on differences and porting is provided.
Contents of the email (|py2stdlib-email|) package documentation:
.. toctree::
email.message.rst
email.parser.rst
email.generator.rst
email.mime.rst
email.header.rst
email.charset.rst
email.encoders.rst
email.errors.rst
email.util.rst
email.iterators.rst
email-examples.rst
.. seealso::
Module smtplib (|py2stdlib-smtplib|)
SMTP protocol client
Module nntplib (|py2stdlib-nntplib|)
NNTP protocol client
Package History
---------------
This table describes the release history of the email package, corresponding to
the version of Python that the package was released with. For purposes of this
document, when you see a note about change or added versions, these refer to the
Python version the change was made in, {not} the email package version. This
table also describes the Python compatibility of each version of the package.
+---------------+------------------------------+-----------------------+
| email version | distributed with | compatible with |
+===============+==============================+=======================+
| 1.x | Python 2.2.0 to Python 2.2.1 | {no longer supported} |
+---------------+------------------------------+-----------------------+
| 2.5 | Python 2.2.2+ and Python 2.3 | Python 2.1 to 2.5 |
+---------------+------------------------------+-----------------------+
| 3.0 | Python 2.4 | Python 2.3 to 2.5 |
+---------------+------------------------------+-----------------------+
| 4.0 | Python 2.5 | Python 2.3 to 2.5 |
+---------------+------------------------------+-----------------------+
Here are the major differences between email (|py2stdlib-email|) version 4 and version 3:
* All modules have been renamed according to PEP 8 standards. For example,
the version 3 module email.Message was renamed to email.message (|py2stdlib-email.message|) in
version 4.
* A new subpackage email.mime (|py2stdlib-email.mime|) was added and all the version 3
email.MIME* modules were renamed and situated into the email.mime (|py2stdlib-email.mime|)
subpackage. For example, the version 3 module email.MIMEText was renamed
to email.mime.text.
{Note that the version 3 names will continue to work until Python 2.6}.
* The email.mime.application module was added, which contains the
MIMEApplication class.
* Methods that were deprecated in version 3 have been removed. These include
Generator.__call__, Message.get_type,
Message.get_main_type, Message.get_subtype.
* Fixes have been added for RFC 2231 support which can change some of the
return types for Message.get_param and friends. Under some
circumstances, values which used to return a 3-tuple now return simple strings
(specifically, if all extended parameter segments were unencoded, there is no
language and charset designation expected, so the return type is now a simple
string). Also, %-decoding used to be done for both encoded and unencoded
segments; this decoding is now done only for encoded segments.
Here are the major differences between email (|py2stdlib-email|) version 3 and version 2:
* The FeedParser class was introduced, and the Parser class
was implemented in terms of the FeedParser. All parsing therefore is
non-strict, and parsing will make a best effort never to raise an exception.
Problems found while parsing messages are stored in the message's {defects}
attribute.
* All aspects of the API which raised DeprecationWarnings in version 2
have been removed. These include the {_encoder} argument to the
MIMEText constructor, the Message.add_payload method, the
Utils.dump_address_pair function, and the functions Utils.decode
and Utils.encode.
* New DeprecationWarnings have been added to:
Generator.__call__, Message.get_type,
Message.get_main_type, Message.get_subtype, and the {strict}
argument to the Parser class. These are expected to be removed in
future versions.
* Support for Pythons earlier than 2.3 has been removed.
Here are the differences between email (|py2stdlib-email|) version 2 and version 1:
* The email.Header and email.Charset modules have been added.
* The pickle format for Message instances has changed. Since this was
never (and still isn't) formally defined, this isn't considered a backward
incompatibility. However if your application pickles and unpickles
Message instances, be aware that in email (|py2stdlib-email|) version 2,
Message instances now have private variables {_charset} and
{_default_type}.
* Several methods in the Message class have been deprecated, or their
signatures changed. Also, many new methods have been added. See the
documentation for the Message class for details. The changes should be
completely backward compatible.
* The object structure has changed in the face of message/rfc822
content types. In email (|py2stdlib-email|) version 1, such a type would be represented by a
scalar payload, i.e. the container message's is_multipart returned
false, get_payload was not a list object, but a single Message
instance.
This structure was inconsistent with the rest of the package, so the object
representation for message/rfc822 content types was changed. In
email (|py2stdlib-email|) version 2, the container {does} return ``True`` from
is_multipart, and get_payload returns a list containing a single
Message item.
Note that this is one place that backward compatibility could not be completely
maintained. However, if you're already testing the return type of
get_payload, you should be fine. You just need to make sure your code
doesn't do a set_payload with a Message instance on a container
with a content type of message/rfc822.
* The Parser constructor's {strict} argument was added, and its
parse and parsestr methods grew a {headersonly} argument. The
{strict} flag was also added to functions email.message_from_file and
email.message_from_string.
* Generator.__call__ is deprecated; use Generator.flatten
instead. The Generator class has also grown the clone method.
* The DecodedGenerator class in the email.Generator module was
added.
* The intermediate base classes MIMENonMultipart and
MIMEMultipart have been added, and interposed in the class hierarchy
for most of the other MIME-related derived classes.
* The {_encoder} argument to the MIMEText constructor has been
deprecated. Encoding now happens implicitly based on the {_charset} argument.
* The following functions in the email.Utils module have been deprecated:
dump_address_pair, decode, and encode. The following
functions have been added to the module: make_msgid,
decode_rfc2231, encode_rfc2231, and decode_params.
* The non-public function email.Iterators._structure was added.
Differences from mimelib
-------------------------------
The email (|py2stdlib-email|) package was originally prototyped as a separate library called
`mimelib <http://mimelib.sf.net/>`_. Changes have been made so that method names
are more consistent, and some methods or modules have either been added or
removed. The semantics of some of the methods have also changed. For the most
part, any functionality available in mimelib is still available in the
email (|py2stdlib-email|) package, albeit often in a different way. Backward compatibility
between the mimelib package and the email (|py2stdlib-email|) package was not a
priority.
Here is a brief description of the differences between the mimelib and
the email (|py2stdlib-email|) packages, along with hints on how to port your applications.
Of course, the most visible difference between the two packages is that the
package name has been changed to email (|py2stdlib-email|). In addition, the top-level
package has the following differences:
* messageFromString has been renamed to message_from_string.
* messageFromFile has been renamed to message_from_file.
The Message class has the following differences:
* The method asString was renamed to as_string.
* The method ismultipart was renamed to is_multipart.
* The get_payload method has grown a {decode} optional argument.
* The method getall was renamed to get_all.
* The method addheader was renamed to add_header.
* The method gettype was renamed to get_type.
* The method getmaintype was renamed to get_main_type.
* The method getsubtype was renamed to get_subtype.
* The method getparams was renamed to get_params. Also, whereas
getparams returned a list of strings, get_params returns a list
of 2-tuples, effectively the key/value pairs of the parameters, split on the
``'='`` sign.
* The method getparam was renamed to get_param.
* The method getcharsets was renamed to get_charsets.
* The method getfilename was renamed to get_filename.
* The method getboundary was renamed to get_boundary.
* The method setboundary was renamed to set_boundary.
* The method getdecodedpayload was removed. To get similar
functionality, pass the value 1 to the {decode} flag of the get_payload()
method.
* The method getpayloadastext was removed. Similar functionality is
supported by the DecodedGenerator class in the email.generator (|py2stdlib-email.generator|)
module.
* The method getbodyastext was removed. You can get similar
functionality by creating an iterator with typed_subpart_iterator in the
email.iterators (|py2stdlib-email.iterators|) module.
The Parser class has no differences in its public interface. It does
have some additional smarts to recognize message/delivery-status
type messages, which it represents as a Message instance containing
separate Message subparts for each header block in the delivery status
notification [#]_.
The Generator class has no differences in its public interface. There
is a new class in the email.generator (|py2stdlib-email.generator|) module though, called
DecodedGenerator which provides most of the functionality previously
available in the Message.getpayloadastext method.
The following modules and classes have been changed:
* The MIMEBase class constructor arguments {_major} and {_minor} have
changed to {_maintype} and {_subtype} respectively.
* The ``Image`` class/module has been renamed to ``MIMEImage``. The {_minor}
argument has been renamed to {_subtype}.
* The ``Text`` class/module has been renamed to ``MIMEText``. The {_minor}
argument has been renamed to {_subtype}.
* The ``MessageRFC822`` class/module has been renamed to ``MIMEMessage``. Note
that an earlier version of mimelib called this class/module ``RFC822``,
but that clashed with the Python standard library module rfc822 (|py2stdlib-rfc822|) on some
case-insensitive file systems.
Also, the MIMEMessage class now represents any kind of MIME message
with main type message. It takes an optional argument {_subtype}
which is used to set the MIME subtype. {_subtype} defaults to
``rfc822``.
mimelib provided some utility functions in its address and
date modules. All of these functions have been moved to the
email.utils (|py2stdlib-email.utils|) module.
The ``MsgReader`` class/module has been removed. Its functionality is most
closely supported in the body_line_iterator function in the
email.iterators (|py2stdlib-email.iterators|) module.
.. rubric:: Footnotes
.. [#] Delivery Status Notifications (DSN) are defined in RFC 1894.
==============================================================================
*py2stdlib-email.utils*
email.utils~
:synopsis: Miscellaneous email package utilities.
There are several useful utilities provided in the email.utils (|py2stdlib-email.utils|) module:
quote(str)~
Return a new string with backslashes in {str} replaced by two backslashes, and
double quotes replaced by backslash-double quote.
unquote(str)~
Return a new string which is an {unquoted} version of {str}. If {str} ends and
begins with double quotes, they are stripped off. Likewise if {str} ends and
begins with angle brackets, they are stripped off.
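A short sketch of the two helpers above, using made-up sample strings. It only
illustrates the escaping and stripping rules just described.

```python
from email.utils import quote, unquote

# quote() escapes backslashes and double quotes so the text can be
# embedded inside a quoted string.
raw = 'Alice "Al" Smith'
escaped = quote(raw)

# unquote() strips one pair of surrounding double quotes or angle brackets.
bare_name = unquote('"Alice Smith"')
bare_addr = unquote('<alice@example.com>')
```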
parseaddr(address)~
Parse address -- which should be the value of some address-containing field such
as To or Cc -- into its constituent {realname} and
{email address} parts. Returns a tuple of that information, unless the parse
fails, in which case a 2-tuple of ``('', '')`` is returned.
formataddr(pair)~
The inverse of parseaddr, this takes a 2-tuple of the form ``(realname,
email_address)`` and returns the string value suitable for a To or
Cc header. If the first element of {pair} is false, then the
second element is returned unmodified.
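As a minimal illustration of parseaddr and formataddr working as inverses, the
sketch below uses an invented address. The names ``realname`` and ``address``
are just local variables chosen for this example.

```python
from email.utils import parseaddr, formataddr

# Split a header value into its display name and address parts.
realname, address = parseaddr("Jane Doe <jane@example.com>")

# Rebuild a header value from the pair.
rebuilt = formataddr((realname, address))

# A false first element means the address is returned unmodified.
address_only = formataddr(("", "jane@example.com"))

# A failed parse is reported as a pair of empty strings.
failed = parseaddr("")
```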
getaddresses(fieldvalues)~
This method returns a list of 2-tuples of the form returned by ``parseaddr()``.
{fieldvalues} is a sequence of header field values as might be returned by
Message.get_all. Here's a simple example that gets all the recipients
of a message:: >
from email.utils import getaddresses
tos = msg.get_all('to', [])
ccs = msg.get_all('cc', [])
resent_tos = msg.get_all('resent-to', [])
resent_ccs = msg.get_all('resent-cc', [])
all_recipients = getaddresses(tos + ccs + resent_tos + resent_ccs)
<
parsedate(date)~
Attempts to parse a date according to the rules in RFC 2822. However, some
mailers don't follow that format as specified, so parsedate tries to
guess correctly in such cases. {date} is a string containing an RFC 2822
date, such as ``"Mon, 20 Nov 1995 19:12:08 -0500"``. If it succeeds in parsing
the date, parsedate returns a 9-tuple that can be passed directly to
time.mktime; otherwise ``None`` will be returned. Note that indexes 6,
7, and 8 of the result tuple are not usable.
parsedate_tz(date)~
Performs the same function as parsedate, but returns either ``None`` or
a 10-tuple; the first 9 elements make up a tuple that can be passed directly to
time.mktime, and the tenth is the offset of the date's timezone from UTC
(which is the official term for Greenwich Mean Time) [#]_. If the input string
has no timezone, the last element of the tuple returned is ``None``. Note that
indexes 6, 7, and 8 of the result tuple are not usable.
mktime_tz(tuple)~
Turn a 10-tuple as returned by parsedate_tz into a UTC timestamp. If
the timezone item in the tuple is ``None``, assume local time. Minor
deficiency: mktime_tz interprets the first 8 elements of {tuple} as a
local time and then compensates for the timezone difference. This may yield a
slight error around changes in daylight savings time, though not worth worrying
about for common use.
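The following sketch ties parsedate_tz and mktime_tz together, using the sample
date string quoted above. The variable names are illustrative only.

```python
from email.utils import parsedate_tz, mktime_tz

parsed = parsedate_tz("Mon, 20 Nov 1995 19:12:08 -0500")

# parsed[:6] holds year, month, day, hour, minute, second;
# parsed[9] is the offset from UTC in seconds (negative west of UTC).
offset_seconds = parsed[9]

# mktime_tz() applies the offset and returns seconds since the epoch in UTC.
timestamp = mktime_tz(parsed)

# Input that cannot be parsed yields None instead of raising.
unparsed = parsedate_tz("not a date")
```

Because the result is a UTC timestamp, it can be handed to time.gmtime to
recover the UTC wall-clock time (here 21 Nov 1995, 00:12:08).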
formatdate([timeval[, localtime][, usegmt]])~
Returns a date string as per RFC 2822, e.g.:: >
Fri, 09 Nov 2001 01:08:47 -0000
<
Optional {timeval} if given is a floating point time value as accepted by
time.gmtime and time.localtime, otherwise the current time is
used.
Optional {localtime} is a flag that when ``True``, interprets {timeval}, and
returns a date relative to the local timezone instead of UTC, properly taking
daylight savings time into account. The default is ``False`` meaning UTC is
used.
Optional {usegmt} is a flag that when ``True``, outputs a date string with the
timezone as an ASCII string ``GMT``, rather than a numeric ``-0000``. This is
needed for some protocols (such as HTTP). This only applies when {localtime} is
``False``. The default is ``False``.
.. versionadded:: 2.4
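A small sketch of the three flags, formatting the Unix epoch (timestamp 0) so
that the UTC results are predictable. The variable names are arbitrary.

```python
from email.utils import formatdate

# Default: UTC with a numeric -0000 zone.
epoch_numeric = formatdate(0)

# usegmt=True: UTC with the literal string GMT, as HTTP expects.
epoch_gmt = formatdate(0, usegmt=True)

# localtime=True with no timeval: current time in the local timezone.
now_local = formatdate(localtime=True)
```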
make_msgid([idstring])~
Returns a string suitable for an RFC 2822-compliant
Message-ID header. Optional {idstring} if given, is a string used
to strengthen the uniqueness of the message id.
decode_rfc2231(s)~
Decode the string {s} according to RFC 2231.
encode_rfc2231(s[, charset[, language]])~
Encode the string {s} according to RFC 2231. Optional {charset} and
{language}, if given, are the character set name and language name to use. If
neither is given, {s} is returned as-is. If {charset} is given but {language}
is not, the string is encoded using the empty string for {language}.
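A brief sketch of an encode/decode round trip with an invented file name. As
observed in CPython, decode_rfc2231 only splits the string into its three
parts; the value part is still percent-encoded afterwards.

```python
from email.utils import encode_rfc2231, decode_rfc2231

# Produces charset'language'percent-encoded-value.
encoded = encode_rfc2231("my file.txt", "us-ascii", "en")

# Splits the encoded string back into charset, language and value.
charset, language, value = decode_rfc2231(encoded)
```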
collapse_rfc2231_value(value[, errors[, fallback_charset]])~
When a header parameter is encoded in RFC 2231 format,
Message.get_param may return a 3-tuple containing the character set,
language, and value. collapse_rfc2231_value turns this into a unicode
string. Optional {errors} is passed to the {errors} argument of the built-in
unicode function; it defaults to ``replace``. Optional
{fallback_charset} specifies the character set to use if the one in the
RFC 2231 header is not known by Python; it defaults to ``us-ascii``.
For convenience, if the {value} passed to collapse_rfc2231_value is not
a tuple, it should be a string and it is returned unquoted.
decode_params(params)~
Decode parameters list according to RFC 2231. {params} is a sequence of
2-tuples containing elements of the form ``(content-type, string-value)``.
.. versionchanged:: 2.4
The dump_address_pair function has been removed; use formataddr
instead.
.. versionchanged:: 2.4
The decode function has been removed; use the
Header.decode_header method instead.
.. versionchanged:: 2.4
The encode function has been removed; use the Header.encode
method instead.
.. rubric:: Footnotes
.. [#] Note that the sign of the timezone offset is the opposite of the sign of the
``time.timezone`` variable for the same timezone; the latter variable follows
the POSIX standard while this module follows RFC 2822.
==============================================================================
*py2stdlib-errno*
errno~
:synopsis: Standard errno system symbols.
This module makes available standard ``errno`` system symbols. The value of each
symbol is the corresponding integer value. The names and descriptions are
borrowed from linux/include/errno.h, which should be pretty
all-inclusive.
errorcode~
Dictionary providing a mapping from the errno value to the string name in the
underlying system. For instance, ``errno.errorcode[errno.EPERM]`` maps to
``'EPERM'``.
To translate a numeric error code to an error message, use os.strerror.
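The sketch below shows one common way these pieces are combined: looking up a
symbolic name, translating a code to a message, and comparing an exception's
errno attribute against a symbol. The temporary directory and the file name
``missing.txt`` are assumptions made for the example.

```python
import errno
import os
import tempfile

# Map a numeric code to its symbolic name and to a readable message.
symbolic_name = errno.errorcode[errno.ENOENT]
message = os.strerror(errno.ENOENT)

# Open a file that is known not to exist and inspect the error code.
workdir = tempfile.mkdtemp()
missing = os.path.join(workdir, "missing.txt")
try:
    open(missing)
except EnvironmentError as exc:
    file_was_missing = (exc.errno == errno.ENOENT)
    reported_path = exc.filename
finally:
    os.rmdir(workdir)
```

Comparing against ``errno.ENOENT`` rather than a literal number keeps the check
readable and independent of the platform's numbering.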
Of the following list, symbols that are not used on the current platform are not
defined by the module. The specific list of defined symbols is available as
``errno.errorcode.keys()``. Symbols available can include:
EPERM~
Operation not permitted
ENOENT~
No such file or directory
ESRCH~
No such process
EINTR~
Interrupted system call
EIO~
I/O error
ENXIO~
No such device or address
E2BIG~
Arg list too long
ENOEXEC~
Exec format error
EBADF~
Bad file number
ECHILD~
No child processes
EAGAIN~
Try again
ENOMEM~
Out of memory
EACCES~
Permission denied
EFAULT~
Bad address
ENOTBLK~
Block device required
EBUSY~
Device or resource busy
EEXIST~
File exists
EXDEV~
Cross-device link
ENODEV~
No such device
ENOTDIR~
Not a directory
EISDIR~
Is a directory
EINVAL~
Invalid argument
ENFILE~
File table overflow
EMFILE~
Too many open files
ENOTTY~
Not a typewriter
ETXTBSY~
Text file busy
EFBIG~
File too large
ENOSPC~
No space left on device
ESPIPE~
Illegal seek
EROFS~
Read-only file system
EMLINK~
Too many links
EPIPE~
Broken pipe
EDOM~
Math argument out of domain of func
ERANGE~
Math result not representable
EDEADLK~
Resource deadlock would occur
ENAMETOOLONG~
File name too long
ENOLCK~
No record locks available
ENOSYS~
Function not implemented
ENOTEMPTY~
Directory not empty
ELOOP~
Too many symbolic links encountered
EWOULDBLOCK~
Operation would block
ENOMSG~
No message of desired type
EIDRM~
Identifier removed
ECHRNG~
Channel number out of range
EL2NSYNC~
Level 2 not synchronized
EL3HLT~
Level 3 halted
EL3RST~
Level 3 reset
ELNRNG~
Link number out of range
EUNATCH~
Protocol driver not attached
ENOCSI~
No CSI structure available
EL2HLT~
Level 2 halted
EBADE~
Invalid exchange
EBADR~
Invalid request descriptor
EXFULL~
Exchange full
ENOANO~
No anode
EBADRQC~
Invalid request code
EBADSLT~
Invalid slot
EDEADLOCK~
File locking deadlock error
EBFONT~
Bad font file format
ENOSTR~
Device not a stream
ENODATA~
No data available
ETIME~
Timer expired
ENOSR~
Out of streams resources
ENONET~
Machine is not on the network
ENOPKG~
Package not installed
EREMOTE~
Object is remote
ENOLINK~
Link has been severed
EADV~
Advertise error
ESRMNT~
Srmount error
ECOMM~
Communication error on send
EPROTO~
Protocol error
EMULTIHOP~
Multihop attempted
EDOTDOT~
RFS specific error
EBADMSG~
Not a data message
EOVERFLOW~
Value too large for defined data type
ENOTUNIQ~
Name not unique on network
EBADFD~
File descriptor in bad state
EREMCHG~
Remote address changed
ELIBACC~
Can not access a needed shared library
ELIBBAD~
Accessing a corrupted shared library
ELIBSCN~
.lib section in a.out corrupted
ELIBMAX~
Attempting to link in too many shared libraries
ELIBEXEC~
Cannot exec a shared library directly
EILSEQ~
Illegal byte sequence
ERESTART~
Interrupted system call should be restarted
ESTRPIPE~
Streams pipe error
EUSERS~
Too many users
ENOTSOCK~
Socket operation on non-socket
EDESTADDRREQ~
Destination address required
EMSGSIZE~
Message too long
EPROTOTYPE~
Protocol wrong type for socket
ENOPROTOOPT~
Protocol not available
EPROTONOSUPPORT~
Protocol not supported
ESOCKTNOSUPPORT~
Socket type not supported
EOPNOTSUPP~
Operation not supported on transport endpoint
EPFNOSUPPORT~
Protocol family not supported
EAFNOSUPPORT~
Address family not supported by protocol
EADDRINUSE~
Address already in use
EADDRNOTAVAIL~
Cannot assign requested address
ENETDOWN~
Network is down
ENETUNREACH~
Network is unreachable
ENETRESET~
Network dropped connection because of reset
ECONNABORTED~
Software caused connection abort
ECONNRESET~
Connection reset by peer
ENOBUFS~
No buffer space available
EISCONN~
Transport endpoint is already connected
ENOTCONN~
Transport endpoint is not connected
ESHUTDOWN~
Cannot send after transport endpoint shutdown
ETOOMANYREFS~
Too many references: cannot splice
ETIMEDOUT~
Connection timed out
ECONNREFUSED~
Connection refused
EHOSTDOWN~
Host is down
EHOSTUNREACH~
No route to host
EALREADY~
Operation already in progress
EINPROGRESS~
Operation now in progress
ESTALE~
Stale NFS file handle
EUCLEAN~
Structure needs cleaning
ENOTNAM~
Not a XENIX named type file
ENAVAIL~
No XENIX semaphores available
EISNAM~
Is a named type file
EREMOTEIO~
Remote I/O error
EDQUOT~
Quota exceeded
==============================================================================
*py2stdlib-exceptions*
exceptions~
:synopsis: Standard exception classes.
Exceptions should be class objects. The exceptions are defined in the module
exceptions (|py2stdlib-exceptions|). This module never needs to be imported explicitly: the
exceptions are provided in the built-in namespace as well as the
exceptions (|py2stdlib-exceptions|) module.
For class exceptions, in a try statement with an except
clause that mentions a particular class, that clause also handles any exception
classes derived from that class (but not exception classes from which {it} is
derived). Two exception classes that are not related via subclassing are never
equivalent, even if they have the same name.
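To illustrate how an except clause matches derived classes, here is a minimal
sketch. ``TransportError``, ``TimeoutExceeded`` and ``classify`` are
hypothetical names invented for this example.

```python
class TransportError(Exception):
    """Hypothetical base class for this example."""


class TimeoutExceeded(TransportError):
    """Hypothetical subclass of TransportError."""


def classify(exc_class):
    """Raise exc_class and report which except clause handled it."""
    try:
        raise exc_class("demo")
    except TransportError:
        # Matches TransportError and every class derived from it.
        return "handled as TransportError"
    except Exception:
        # A base class of TransportError is not matched by the clause above.
        return "handled as Exception"


subclass_result = classify(TimeoutExceeded)
base_result = classify(Exception)
```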
The built-in exceptions listed below can be generated by the interpreter or
built-in functions. Except where mentioned, they have an "associated value"
indicating the detailed cause of the error. This may be a string or a tuple
containing several items of information (e.g., an error code and a string
explaining the code). The associated value is the second argument to the
raise statement. If the exception class is derived from the standard
root class BaseException, the associated value is present as the
exception instance's args attribute.
User code can raise built-in exceptions. This can be used to test an exception
handler or to report an error condition "just like" the situation in which the
interpreter raises the same exception; but beware that there is nothing to
prevent user code from raising an inappropriate error.
The built-in exception classes can be sub-classed to define new exceptions;
programmers are encouraged to at least derive new exceptions from the
Exception class and not BaseException. More information on
defining exceptions is available in the Python Tutorial under
"User-defined Exceptions".
The following exceptions are only used as base classes for other exceptions.
BaseException~
The base class for all built-in exceptions. It is not meant to be directly
inherited by user-defined classes (for that use Exception). If
str or unicode is called on an instance of this class, the
representation of the argument(s) to the instance are returned or the empty
string when there were no arguments. All arguments are stored in args
as a tuple.
.. versionadded:: 2.5
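A quick sketch of the args and str behaviour described above, using Exception
(a subclass of BaseException) with made-up arguments.

```python
with_args = Exception("disk full", 28)
without_args = Exception()

# All constructor arguments are kept as a tuple.
stored = with_args.args

# With no arguments, str() gives the empty string.
empty_text = str(without_args)

# With a single argument, str() gives that argument's representation.
single_text = str(Exception("disk full"))
```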
Exception~
All built-in, non-system-exiting exceptions are derived from this class. All
user-defined exceptions should also be derived from this class.
.. versionchanged:: 2.5
Changed to inherit from BaseException.
StandardError~
The base class for all built-in exceptions except StopIteration,
GeneratorExit, KeyboardInterrupt and SystemExit.
StandardError itself is derived from Exception.
ArithmeticError~
The base class for those built-in exceptions that are raised for various
arithmetic errors: OverflowError, ZeroDivisionError,
FloatingPointError.
LookupError~
The base class for the exceptions that are raised when a key or index used on
a mapping or sequence is invalid: IndexError, KeyError. This
can be raised directly by codecs.lookup.
EnvironmentError~
The base class for exceptions that can occur outside the Python system:
IOError, OSError. When exceptions of this type are created with a
2-tuple, the first item is available on the instance's errno (|py2stdlib-errno|) attribute
(it is assumed to be an error number), and the second item is available on the
strerror attribute (it is usually the associated error message). The
tuple itself is also available on the args attribute.
.. versionadded:: 1.5.2
When an EnvironmentError exception is instantiated with a 3-tuple, the
first two items are available as above, while the third item is available on the
filename attribute. However, for backwards compatibility, the
args attribute contains only a 2-tuple of the first two constructor
arguments.
The filename attribute is ``None`` when this exception is created with
other than 3 arguments. The errno (|py2stdlib-errno|) and strerror attributes are
also ``None`` when the instance was created with other than 2 or 3 arguments.
In this last case, args contains the verbatim constructor arguments as a
tuple.
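The attribute rules above can be checked directly by constructing instances by
hand. This is only a sketch; the message text and the file name ``data.txt``
are invented.

```python
import errno

# Three arguments: errno, strerror and filename are all populated.
three_args = EnvironmentError(errno.ENOENT, "No such file or directory",
                              "data.txt")

# One argument: errno, strerror and filename stay None.
one_arg = EnvironmentError("something went wrong")

# For backwards compatibility, args omits the filename.
compat_args = three_args.args
```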
The following exceptions are the exceptions that are actually raised.
AssertionError~
Raised when an assert statement fails.
AttributeError~
Raised when an attribute reference (see attribute-references) or
assignment fails. (When an object does not support attribute references or
attribute assignments at all, TypeError is raised.)
EOFError~
Raised when one of the built-in functions (input or raw_input)
hits an end-of-file condition (EOF) without reading any data. (N.B.: the
file.read and file.readline methods return an empty string
when they hit EOF.)
FloatingPointError~
Raised when a floating point operation fails. This exception is always defined,
but can only be raised when Python is configured with the
--with-fpectl option, or the WANT_SIGFPE_HANDLER symbol is
defined in the pyconfig.h file.
GeneratorExit~
Raised when a generator's close method is called. It
directly inherits from BaseException instead of StandardError since
it is technically not an error.
.. versionadded:: 2.5
.. versionchanged:: 2.6
Changed to inherit from BaseException.
IOError~
Raised when an I/O operation (such as a print statement, the built-in
open function or a method of a file object) fails for an I/O-related
reason, e.g., "file not found" or "disk full".
This class is derived from EnvironmentError. See the discussion above
for more information on exception instance attributes.
.. versionchanged:: 2.6
Changed socket.error to use this as a base class.
ImportError~
Raised when an import statement fails to find the module definition
or when a ``from ... import`` fails to find a name that is to be imported.
IndexError~
Raised when a sequence subscript is out of range. (Slice indices are silently
truncated to fall in the allowed range; if an index is not a plain integer,
TypeError is raised.)
KeyError~
Raised when a mapping (dictionary) key is not found in the set of existing keys.
KeyboardInterrupt~
Raised when the user hits the interrupt key (normally Control-C or
Delete). During execution, a check for interrupts is made regularly.
Interrupts typed when a built-in function input or raw_input is
waiting for input also raise this exception. The exception inherits from
BaseException so as to not be accidentally caught by code that catches
Exception and thus prevent the interpreter from exiting.
.. versionchanged:: 2.5
Changed to inherit from BaseException.
MemoryError~
Raised when an operation runs out of memory but the situation may still be
rescued (by deleting some objects). The associated value is a string indicating
what kind of (internal) operation ran out of memory. Note that because of the
underlying memory management architecture (C's malloc function), the
interpreter may not always be able to completely recover from this situation; it
nevertheless raises an exception so that a stack traceback can be printed, in
case a run-away program was the cause.
NameError~
Raised when a local or global name is not found. This applies only to
unqualified names. The associated value is an error message that includes the
name that could not be found.
NotImplementedError~
This exception is derived from RuntimeError. In user defined base
classes, abstract methods should raise this exception when they require derived
classes to override the method.
.. versionadded:: 1.5.2
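A minimal sketch of the abstract-method pattern described above. ``Shape`` and
``Square`` are hypothetical classes used only for illustration.

```python
class Shape(object):
    """Hypothetical base class with one abstract method."""

    def area(self):
        # Derived classes are required to override this method.
        raise NotImplementedError("subclasses must override area()")


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


square_area = Square(3).area()

try:
    Shape().area()
except NotImplementedError as exc:
    base_error = str(exc)
```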
OSError~
This exception is derived from EnvironmentError. It is raised when a
function returns a system-related error (not for illegal argument types or
other incidental errors). The errno (|py2stdlib-errno|) attribute is a numeric error
code from errno (|py2stdlib-errno|), and the strerror attribute is the
corresponding string, as would be printed by the C function perror.
See the module errno (|py2stdlib-errno|), which contains names for the error codes defined
by the underlying operating system.
For exceptions that involve a file system path (such as chdir or
unlink), the exception instance will contain a third attribute,
filename, which is the file name passed to the function.
.. versionadded:: 1.5.2
OverflowError~
Raised when the result of an arithmetic operation is too large to be
represented. This cannot occur for long integers (which would rather raise
MemoryError than give up) and for most operations with plain integers,
which return a long integer instead. Because of the lack of standardization
of floating point exception handling in C, most floating point operations
also aren't checked.
ReferenceError~
This exception is raised when a weak reference proxy, created by the
weakref.proxy function, is used to access an attribute of the referent
after it has been garbage collected. For more information on weak references,
see the weakref (|py2stdlib-weakref|) module.
.. versionadded:: 2.2
Previously known as the weakref.ReferenceError exception.
RuntimeError~
Raised when an error is detected that doesn't fall in any of the other
categories. The associated value is a string indicating what precisely went
wrong. (This exception is mostly a relic from a previous version of the
interpreter; it is not used very much any more.)
StopIteration~
Raised by an iterator's next method to signal that
there are no further values. This is derived from Exception rather
than StandardError, since this is not considered an error in its
normal application.
.. versionadded:: 2.2
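As a sketch of the protocol, the hypothetical ``Countdown`` iterator below
raises StopIteration when it is exhausted. The ``__next__`` alias is an
addition so the same class also works on Python 3, where the method has that
name.

```python
class Countdown(object):
    """Hypothetical iterator yielding start, start - 1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def next(self):
        if self.current <= 0:
            # Signals that there are no further values.
            raise StopIteration
        self.current -= 1
        return self.current + 1

    __next__ = next  # Python 3 spelling of the same method


# list() consumes the iterator and stops cleanly on StopIteration.
values = list(Countdown(3))
```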
SyntaxError~
Raised when the parser encounters a syntax error. This may occur in an
import statement, in an exec statement, in a call to the
built-in function eval or input, or when reading the initial
script or standard input (also interactively).
Instances of this class have attributes filename, lineno,
offset and text for easier access to the details. str
of the exception instance returns only the message.
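The attributes can be inspected by compiling a deliberately broken source
string. This is a sketch; the pseudo file name ``<demo>`` is an arbitrary
label passed to the built-in compile function.

```python
try:
    compile("x = = 1\n", "<demo>", "exec")
except SyntaxError as exc:
    # filename and lineno locate the error; str() gives only the message.
    location = (exc.filename, exc.lineno)
    message = str(exc)
```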
SystemError~
Raised when the interpreter finds an internal error, but the situation does not
look so serious as to cause it to abandon all hope. The associated value is a
string indicating what went wrong (in low-level terms).
You should report this to the author or maintainer of your Python interpreter.
Be sure to report the version of the Python interpreter (``sys.version``; it is
also printed at the start of an interactive Python session), the exact error
message (the exception's associated value) and if possible the source of the
program that triggered the error.
SystemExit~
This exception is raised by the sys.exit function. When it is not
handled, the Python interpreter exits; no stack traceback is printed. If the
associated value is a plain integer, it specifies the system exit status (passed
to C's exit function); if it is ``None``, the exit status is zero; if
it has another type (such as a string), the object's value is printed and the
exit status is one.
Instances have an attribute code which is set to the proposed exit
status or error message (defaulting to ``None``). Also, this exception derives
directly from BaseException and not StandardError, since it is not
technically an error.
A call to sys.exit is translated into an exception so that clean-up
handlers (finally clauses of try statements) can be
executed, and so that a debugger can execute a script without running the risk
of losing control. The os._exit function can be used if it is
absolutely positively necessary to exit immediately (for example, in the child
process after a call to fork).
The exception inherits from BaseException instead of StandardError
or Exception so that it is not accidentally caught by code that catches
Exception. This allows the exception to properly propagate up and cause
the interpreter to exit.
.. versionchanged:: 2.5
Changed to inherit from BaseException.
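The sketch below shows the points made above: a finally clause still runs when
sys.exit is called, the exit status is available as the code attribute, and a
handler for Exception does not see SystemExit. The variable names are
illustrative.

```python
import sys

cleanup_ran = []
try:
    try:
        sys.exit(3)
    finally:
        # Clean-up handlers run before the exception propagates.
        cleanup_ran.append(True)
except SystemExit as exc:
    exit_code = exc.code

# SystemExit is not derived from Exception, so "except Exception" skips it.
skips_exception_handlers = not issubclass(SystemExit, Exception)
```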
TypeError~
Raised when an operation or function is applied to an object of inappropriate
type. The associated value is a string giving details about the type mismatch.
UnboundLocalError~
Raised when a reference is made to a local variable in a function or method, but
no value has been bound to that variable. This is a subclass of
NameError.
.. versionadded:: 2.0
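A typical way to trigger this exception is to assign to a name inside a
function before reading it, as in the following sketch. ``counter`` and
``increment`` are made-up names.

```python
counter = 0


def increment():
    # The augmented assignment makes 'counter' local to this function,
    # so it is read before any value has been bound to it.
    counter += 1
    return counter


try:
    increment()
except NameError as exc:
    # UnboundLocalError is caught here because it subclasses NameError.
    error_type = type(exc).__name__
```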
UnicodeError~
Raised when a Unicode-related encoding or decoding error occurs. It is a
subclass of ValueError.
.. versionadded:: 2.0
UnicodeEncodeError~
Raised when a Unicode-related error occurs during encoding. It is a subclass of
UnicodeError.
.. versionadded:: 2.3
UnicodeDecodeError~
Raised when a Unicode-related error occurs during decoding. It is a subclass of
UnicodeError.
.. versionadded:: 2.3
UnicodeTranslateError~
Raised when a Unicode-related error occurs during translating. It is a subclass
of UnicodeError.
.. versionadded:: 2.3
ValueError~
Raised when a built-in operation or function receives an argument that has the
right type but an inappropriate value, and the situation is not described by a
more precise exception such as IndexError.
VMSError~
Only available on VMS. Raised when a VMS-specific error occurs.
WindowsError~
Raised when a Windows-specific error occurs or when the error number does not
correspond to an errno (|py2stdlib-errno|) value. The winerror and
strerror values are created from the return values of the
GetLastError and FormatMessage functions from the Windows
Platform API. The errno (|py2stdlib-errno|) value maps the winerror value to
corresponding ``errno.h`` values. This is a subclass of OSError.
.. versionadded:: 2.0
.. versionchanged:: 2.5
Previous versions put the GetLastError codes into errno (|py2stdlib-errno|).
ZeroDivisionError~
Raised when the second argument of a division or modulo operation is zero. The
associated value is a string indicating the type of the operands and the
operation.
The following exceptions are used as warning categories; see the warnings (|py2stdlib-warnings|)
module for more information.
Warning~
Base class for warning categories.
UserWarning~
Base class for warnings generated by user code.
DeprecationWarning~
Base class for warnings about deprecated features.
PendingDeprecationWarning~
Base class for warnings about features which will be deprecated in the future.
SyntaxWarning~
Base class for warnings about dubious syntax.
RuntimeWarning~
Base class for warnings about dubious runtime behavior.
FutureWarning~
Base class for warnings about constructs that will change semantically in the
future.
ImportWarning~
Base class for warnings about probable mistakes in module imports.
.. versionadded:: 2.5
UnicodeWarning~
Base class for warnings related to Unicode.
.. versionadded:: 2.5
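As a sketch of how a category is used, the hypothetical function ``old_api``
below issues a DeprecationWarning through the warnings module, and the caller
records it with warnings.catch_warnings (available from Python 2.6).

```python
import warnings


def old_api():
    """Hypothetical deprecated function."""
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


with warnings.catch_warnings(record=True) as caught:
    # Make sure the warning is not suppressed by the default filters.
    warnings.simplefilter("always")
    result = old_api()

categories = [w.category for w in caught]
```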
Exception hierarchy
-------------------
The class hierarchy for built-in exceptions is:
.. literalinclude:: ../../Lib/test/exception_hierarchy.txt
==============================================================================
*py2stdlib-fcntl*
fcntl~
:platform: Unix
:synopsis: The fcntl() and ioctl() system calls.
This module performs file control and I/O control on file descriptors. It is an
interface to the fcntl (|py2stdlib-fcntl|) and ioctl Unix routines.
All functions in this module take a file descriptor {fd} as their first
argument. This can be an integer file descriptor, such as returned by
``sys.stdin.fileno()``, or a file object, such as ``sys.stdin`` itself, which
provides a fileno method which returns a genuine file descriptor.
The module defines the following functions:
fcntl(fd, op[, arg])~
Perform the requested operation on file descriptor {fd} (file objects providing
a fileno method are accepted as well). The operation is defined by {op}
and is operating system dependent. These codes are also found in the
fcntl (|py2stdlib-fcntl|) module. The argument {arg} is optional, and defaults to the integer
value ``0``. When present, it can either be an integer value, or a string.
With the argument missing or an integer value, the return value of this function
is the integer return value of the C fcntl (|py2stdlib-fcntl|) call. When the argument is
a string it represents a binary structure, e.g. created by struct.pack.
The binary data is copied to a buffer whose address is passed to the C
fcntl (|py2stdlib-fcntl|) call. The return value after a successful call is the contents
of the buffer, converted to a string object. The length of the returned string
will be the same as the length of the {arg} argument. This is limited to 1024
bytes. If the information returned in the buffer by the operating system is
larger than 1024 bytes, this is most likely to result in a segmentation
violation or a more subtle data corruption.
If the fcntl (|py2stdlib-fcntl|) fails, an IOError is raised.
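A small sketch of the integer-argument form: reading the status flags of a
pipe and switching its read end to non-blocking mode. It assumes a Unix
platform, and the variable names are arbitrary.

```python
import fcntl
import os

read_fd, write_fd = os.pipe()
try:
    # With no arg, the integer result of the C call is returned.
    flags = fcntl.fcntl(read_fd, fcntl.F_GETFL)

    # With an integer arg, the new flag set is passed to the C call.
    fcntl.fcntl(read_fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

    new_flags = fcntl.fcntl(read_fd, fcntl.F_GETFL)
    is_nonblocking = bool(new_flags & os.O_NONBLOCK)
finally:
    os.close(read_fd)
    os.close(write_fd)
```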
ioctl(fd, op[, arg[, mutate_flag]])~
This function is identical to the fcntl (|py2stdlib-fcntl|) function, except that the
operations are typically defined in the library module termios (|py2stdlib-termios|) and the
argument handling is even more complicated.
The op parameter is limited to values that can fit in 32 bits.
The parameter {arg} can be one of an integer, absent (treated identically to the
integer ``0``), an object supporting the read-only buffer interface (most likely
a plain Python string) or an object supporting the read-write buffer interface.
In all but the last case, behaviour is as for the fcntl (|py2stdlib-fcntl|) function.
If a mutable buffer is passed, then the behaviour is determined by the value of
the {mutate_flag} parameter.
If it is false, the buffer's mutability is ignored and behaviour is as for a
read-only buffer, except that the 1024 byte limit mentioned above is avoided --
so long as the buffer you pass is at least as long as what the operating system
wants to put there, things should work.
If {mutate_flag} is true, then the buffer is (in effect) passed to the
underlying ioctl system call, the latter's return code is passed back to
the calling Python, and the buffer's new contents reflect the action of the
ioctl. This is a slight simplification, because if the supplied buffer
is less than 1024 bytes long it is first copied into a static buffer 1024 bytes
long which is then passed to ioctl and copied back into the supplied
buffer.
If {mutate_flag} is not supplied, then from Python 2.5 it defaults to true,
which is a change from versions 2.3 and 2.4. Supply the argument explicitly if
version portability is a priority.
An example:: >
>>> import array, fcntl, struct, termios, os
>>> os.getpgrp()
13341
>>> struct.unpack('h', fcntl.ioctl(0, termios.TIOCGPGRP, " "))[0]
13341
>>> buf = array.array('h', [0])
>>> fcntl.ioctl(0, termios.TIOCGPGRP, buf, 1)
0
>>> buf
array('h', [13341])
<
flock(fd, op)~
Perform the lock operation {op} on file descriptor {fd} (file objects providing
a fileno method are accepted as well). See the Unix manual
flock(2) for details. (On some systems, this function is emulated
using fcntl (|py2stdlib-fcntl|).)
lockf(fd, operation, [length, [start, [whence]]])~
This is essentially a wrapper around the fcntl (|py2stdlib-fcntl|) locking calls. {fd} is
the file descriptor of the file to lock or unlock, and {operation} is one of the
following values:
* LOCK_UN -- unlock
* LOCK_SH -- acquire a shared lock
* LOCK_EX -- acquire an exclusive lock
When {operation} is LOCK_SH or LOCK_EX, it can also be
bitwise ORed with LOCK_NB to avoid blocking on lock acquisition.
If LOCK_NB is used and the lock cannot be acquired, an
IOError will be raised and the exception will have an {errno}
attribute set to EACCES or EAGAIN (depending on the
operating system; for portability, check for both values). On at least some
systems, LOCK_EX can only be used if the file descriptor refers to a
file opened for writing.
{length} is the number of bytes to lock, {start} is the byte offset at which the
lock starts, relative to {whence}, and {whence} is as with fileobj.seek,
specifically:
* 0 -- relative to the start of the file (SEEK_SET)
* 1 -- relative to the current buffer position (SEEK_CUR)
* 2 -- relative to the end of the file (SEEK_END)
The default for {start} is 0, which means to start at the beginning of the file.
The default for {length} is 0 which means to lock to the end of the file. The
default for {whence} is also 0.
Examples (all on a SVR4 compliant system):: >
import struct, fcntl, os
f = open(...)
rv = fcntl.fcntl(f, fcntl.F_SETFL, os.O_NDELAY)
lockdata = struct.pack('hhllhh', fcntl.F_WRLCK, 0, 0, 0, 0, 0)
rv = fcntl.fcntl(f, fcntl.F_SETLKW, lockdata)
<
Note that in the first example the return value variable {rv} will hold an
integer value; in the second example it will hold a string value. The structure
layout for the {lockdata} variable is system dependent, so using the
flock call may be better.
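The non-blocking behaviour described for lockf above can be sketched as
follows. This is a minimal illustration, not a complete locking scheme: the
helper name try_lock and the temporary file are hypothetical, and the code
assumes a Unix system. It checks for both EACCES and EAGAIN, as recommended
above.

```python
import errno
import fcntl
import os
import tempfile


def try_lock(fileobj):
    """Try to take an exclusive lock without blocking.

    Returns True if the lock was acquired and False if another process
    already holds a conflicting lock.  (Hypothetical helper.)
    """
    try:
        fcntl.lockf(fileobj, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except IOError as exc:
        # The errno value depends on the operating system; check both.
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return False
        raise
    return True


fd, path = tempfile.mkstemp()
os.close(fd)
f = open(path, 'r+')    # opened for writing, which LOCK_EX may require
try:
    acquired = try_lock(f)
    if acquired:
        # ... work with the file here ...
        fcntl.lockf(f, fcntl.LOCK_UN)
finally:
    f.close()
    os.remove(path)
```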
.. seealso::
Module os (|py2stdlib-os|)
If the locking flags O_SHLOCK and O_EXLOCK are present
in the os (|py2stdlib-os|) module (on BSD only), the os.open function
provides an alternative to the lockf and flock functions.
==============================================================================
*py2stdlib-filecmp*
filecmp~
:synopsis: Compare files efficiently.
The filecmp (|py2stdlib-filecmp|) module defines functions to compare files and directories,
with various optional time/correctness trade-offs. For comparing files,
see also the difflib (|py2stdlib-difflib|) module.
The filecmp (|py2stdlib-filecmp|) module defines the following functions:
cmp(f1, f2[, shallow])~
Compare the files named {f1} and {f2}, returning ``True`` if they seem equal,
``False`` otherwise.
Unless {shallow} is given and is false, files with identical os.stat
signatures are taken to be equal.
Files that were compared using this function will not be compared again unless
their os.stat signature changes.
Note that no external programs are called from this function, giving it
portability and efficiency.
cmpfiles(dir1, dir2, common[, shallow])~
Compare the files in the two directories {dir1} and {dir2} whose names are
given by {common}.
Returns three lists of file names: {match}, {mismatch},
{errors}. {match} contains the list of files that match, {mismatch} contains
the names of those that don't, and {errors} lists the names of files which
could not be compared. Files are listed in {errors} if they don't exist in
one of the directories, the user lacks permission to read them or if the
comparison could not be done for some other reason.
The {shallow} parameter has the same meaning and default value as for
filecmp.cmp.
For example, ``cmpfiles('a', 'b', ['c', 'd/e'])`` will compare ``a/c`` with
``b/c`` and ``a/d/e`` with ``b/d/e``. ``'c'`` and ``'d/e'`` will each be in
one of the three returned lists.
Example:: >
>>> import filecmp
>>> filecmp.cmp('undoc.rst', 'undoc.rst')
True
>>> filecmp.cmp('undoc.rst', 'index.rst')
False
<
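A slightly longer sketch, using throwaway files so that it can be run
anywhere. The directory and file names are made up for illustration, and
{shallow} is passed as false so that the result depends on file contents
rather than on os.stat signatures.

```python
import filecmp
import os
import shutil
import tempfile


def write(path, text):
    with open(path, 'w') as f:
        f.write(text)


root = tempfile.mkdtemp()
try:
    dir1 = os.path.join(root, 'a')
    dir2 = os.path.join(root, 'b')
    os.mkdir(dir1)
    os.mkdir(dir2)
    write(os.path.join(dir1, 'same.txt'), 'hello\n')
    write(os.path.join(dir2, 'same.txt'), 'hello\n')
    write(os.path.join(dir1, 'diff.txt'), 'one\n')
    write(os.path.join(dir2, 'diff.txt'), 'another\n')
    write(os.path.join(dir1, 'only_a.txt'), 'x\n')   # missing from dir2

    # Compare one pair of files by content.
    equal = filecmp.cmp(os.path.join(dir1, 'same.txt'),
                        os.path.join(dir2, 'same.txt'), shallow=False)

    # Compare several names at once; every name ends up in exactly one list.
    match, mismatch, errors = filecmp.cmpfiles(
        dir1, dir2, ['same.txt', 'diff.txt', 'only_a.txt'], shallow=False)
finally:
    shutil.rmtree(root)
```

Here only_a.txt lands in {errors} because it does not exist in the second
directory.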
The dircmp class
dircmp instances are built using this constructor:
dircmp(a, b[, ignore[, hide]])~
Construct a new directory comparison object, to compare the directories {a} and
{b}. {ignore} is a list of names to ignore, and defaults to ``['RCS', 'CVS',
'tags']``. {hide} is a list of names to hide, and defaults to ``[os.curdir,
os.pardir]``.
The dircmp class provides the following methods:
report()~
Print (to ``sys.stdout``) a comparison between {a} and {b}.
report_partial_closure()~
Print a comparison between {a} and {b} and common immediate
subdirectories.
report_full_closure()~
Print a comparison between {a} and {b} and common subdirectories
(recursively).
The dircmp class offers a number of interesting attributes that may be
used to get various bits of information about the directory trees being
compared.
Note that via __getattr__ hooks, all attributes are computed lazily,
so there is no speed penalty if only those attributes which are lightweight
to compute are used.
left_list~
Files and subdirectories in {a}, filtered by {hide} and {ignore}.
right_list~
Files and subdirectories in {b}, filtered by {hide} and {ignore}.
common~
Files and subdirectories in both {a} and {b}.
left_only~
Files and subdirectories only in {a}.
right_only~
Files and subdirectories only in {b}.
common_dirs~
Subdirectories in both {a} and {b}.
common_files~
Files in both {a} and {b}.
common_funny~
Names in both {a} and {b}, such that the type differs between the
directories, or names for which os.stat reports an error.
same_files~
Files which are identical in both {a} and {b}.
diff_files~
Files which are in both {a} and {b}, whose contents differ.
funny_files~
Files which are in both {a} and {b}, but could not be compared.
subdirs~
A dictionary mapping names in common_dirs to dircmp objects.
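The attributes above can be explored with a small sketch. The directory
layout is invented for illustration. Because the attributes are computed
lazily, they are read here before the temporary tree is removed.

```python
import filecmp
import os
import shutil
import tempfile


def write(path, text):
    with open(path, 'w') as f:
        f.write(text)


root = tempfile.mkdtemp()
try:
    left = os.path.join(root, 'left')
    right = os.path.join(root, 'right')
    for d in (left, right):
        os.mkdir(d)
        os.mkdir(os.path.join(d, 'sub'))         # common subdirectory
        write(os.path.join(d, 'same.txt'), 'hello\n')
    write(os.path.join(left, 'diff.txt'), 'one\n')
    write(os.path.join(right, 'diff.txt'), 'another\n')
    write(os.path.join(left, 'only_left.txt'), 'x\n')
    write(os.path.join(right, 'only_right.txt'), 'y\n')

    dc = filecmp.dircmp(left, right)
    # Read the lazily computed attributes while the directories still exist.
    left_only = sorted(dc.left_only)
    right_only = sorted(dc.right_only)
    common_dirs = sorted(dc.common_dirs)
    common_files = sorted(dc.common_files)
    same_files = sorted(dc.same_files)
    diff_files = sorted(dc.diff_files)
    subdir_names = sorted(dc.subdirs.keys())
finally:
    shutil.rmtree(root)
```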
==============================================================================
*py2stdlib-fileinput*
fileinput~
:synopsis: Loop over standard input or a list of files.
This module implements a helper class and functions to quickly write a
loop over standard input or a list of files. If you just want to read or
write one file, see open.
The typical use is:: >
import fileinput
for line in fileinput.input():
    process(line)
<
This iterates over the lines of all files listed in ``sys.argv[1:]``, defaulting
to ``sys.stdin`` if the list is empty. If a filename is ``'-'``, it is also
replaced by ``sys.stdin``. To specify an alternative list of filenames, pass it
as the first argument to fileinput.input. A single file name is also allowed.
All files are opened in text mode by default, but you can override this by
specifying the {mode} parameter in the call to fileinput.input or
FileInput(). If an I/O error occurs during opening or reading a file,
IOError is raised.
If ``sys.stdin`` is used more than once, the second and later uses will return
no lines, except perhaps for interactive use, or if it has been explicitly reset
(e.g. using ``sys.stdin.seek(0)``).
Empty files are opened and immediately closed; the only time their presence in
the list of filenames is noticeable at all is when the last file opened is
empty.
Lines are returned with any newlines intact, which means that the last line in
a file may not have one.
You can control how files are opened by providing an opening hook via the
{openhook} parameter to fileinput.input or FileInput(). The
hook must be a function that takes two arguments, {filename} and {mode}, and
returns an accordingly opened file-like object. Two useful hooks are already
provided by this module.
The following function is the primary interface of this module:
input([files[, inplace[, backup[, mode[, openhook]]]]])~
Create an instance of the FileInput class. The instance will be used
as global state for the functions of this module, and is also returned to use
during iteration. The parameters to this function will be passed along to the
constructor of the FileInput class.
.. versionchanged:: 2.5
Added the {mode} and {openhook} parameters.
The following functions use the global state created by fileinput.input;
if there is no active state, RuntimeError is raised.
filename()~
Return the name of the file currently being read. Before the first line has
been read, returns ``None``.
fileno()~
Return the integer "file descriptor" for the current file. When no file is
opened (before the first line and between files), returns ``-1``.
.. versionadded:: 2.5
lineno()~
Return the cumulative line number of the line that has just been read. Before
the first line has been read, returns ``0``. After the last line of the last
file has been read, returns the line number of that line.
filelineno()~
Return the line number in the current file. Before the first line has been
read, returns ``0``. After the last line of the last file has been read,
returns the line number of that line within the file.
isfirstline()~
Returns true if the line just read is the first line of its file, otherwise
returns false.
isstdin()~
Returns true if the last line was read from ``sys.stdin``, otherwise returns
false.
nextfile()~
Close the current file so that the next iteration will read the first line from
the next file (if any); lines not read from the file will not count towards the
cumulative line count. The filename is not changed until after the first line
of the next file has been read. Before the first line has been read, this
function has no effect; it cannot be used to skip the first file. After the
last line of the last file has been read, this function has no effect.
close()~
Close the sequence.
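As a rough sketch of how the state functions above behave during iteration,
the following reads two small temporary files (the file names and contents
are invented for the example) and records what each function reports after
every line.

```python
import fileinput
import os
import shutil
import tempfile

root = tempfile.mkdtemp()
try:
    first = os.path.join(root, 'a.txt')
    second = os.path.join(root, 'b.txt')
    with open(first, 'w') as f:
        f.write('a1\na2\n')
    with open(second, 'w') as f:
        f.write('b1\n')

    records = []
    try:
        for line in fileinput.input([first, second]):
            records.append((
                os.path.basename(fileinput.filename()),
                fileinput.lineno(),        # cumulative across all files
                fileinput.filelineno(),    # restarts at 1 for each file
                fileinput.isfirstline(),
                line,                      # trailing newline is kept
            ))
    finally:
        fileinput.close()                  # release the global state
finally:
    shutil.rmtree(root)
```

After the loop, records holds one tuple per line; the cumulative line number
keeps counting across files while the per-file number starts over.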
The class which implements the sequence behavior provided by the module is
available for subclassing as well:
FileInput([files[, inplace[, backup[, mode[, openhook]]]]])~
Class FileInput is the implementation; its methods filename,
fileno, lineno, filelineno, isfirstline,
isstdin, nextfile and close correspond to the functions
of the same name in the module. In addition it has a readline (|py2stdlib-readline|) method
which returns the next input line, and a __getitem__ method which
implements the sequence behavior. The sequence must be accessed in strictly
sequential order; random access and readline (|py2stdlib-readline|) cannot be mixed.
With {mode} you can specify which file mode will be passed to open. It
must be one of ``'r'``, ``'rU'``, ``'U'`` and ``'rb'``.
The {openhook}, when given, must be a function that takes two arguments,
{filename} and {mode}, and returns an accordingly opened file-like object. You
cannot use {inplace} and {openhook} together.
.. versionchanged:: 2.5
Added the {mode} and {openhook} parameters.
{Optional in-place filtering:} if the keyword argument ``inplace=1`` is passed
to fileinput.input or to the FileInput constructor, the file is
moved to a backup file and standard output is directed to the input file (if a
file of the same name as the backup file already exists, it will be replaced
silently). This makes it possible to write a filter that rewrites its input
file in place. If the {backup} parameter is given (typically as
``backup='.<some extension>'``), it specifies the extension for the backup file,
and the backup file remains around; by default, the extension is ``'.bak'`` and
it is deleted when the output file is closed. In-place filtering is disabled
when standard input is read.
.. note::
The current implementation does not work for MS-DOS 8+3 filesystems.
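A minimal sketch of in-place filtering, assuming a throwaway file named
notes.txt and a backup extension of '.orig' (both chosen only for this
example). Everything written to standard output inside the loop ends up in
the input file.

```python
import fileinput
import os
import shutil
import sys
import tempfile

root = tempfile.mkdtemp()
try:
    path = os.path.join(root, 'notes.txt')
    with open(path, 'w') as f:
        f.write('alpha\nbeta\n')

    try:
        # While iterating, standard output is redirected into the file.
        for line in fileinput.input(path, inplace=1, backup='.orig'):
            sys.stdout.write(line.upper())
    finally:
        fileinput.close()

    with open(path) as f:
        rewritten = f.read()
    # Because a backup extension was given, the backup file is kept.
    with open(path + '.orig') as f:
        original = f.read()
finally:
    shutil.rmtree(root)
```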
The two following opening hooks are provided by this module:
hook_compressed(filename, mode)~
Transparently opens files compressed with gzip and bzip2 (recognized by the
extensions ``'.gz'`` and ``'.bz2'``) using the gzip (|py2stdlib-gzip|) and bz2 (|py2stdlib-bz2|)
modules. If the filename extension is not ``'.gz'`` or ``'.bz2'``, the file is
opened normally (i.e., using open without any decompression).
Usage example: ``fi = fileinput.FileInput(openhook=fileinput.hook_compressed)``
.. versionadded:: 2.5
hook_encoded(encoding)~
Returns a hook which opens each file with codecs.open, using the given
{encoding} to read the file.
Usage example: ``fi =
fileinput.FileInput(openhook=fileinput.hook_encoded("iso-8859-1"))``
.. note::
With this hook, FileInput might return Unicode strings depending on the
specified {encoding}.
.. versionadded:: 2.5
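Besides the two provided hooks, any function taking {filename} and {mode} can
serve as an opening hook. The sketch below uses a hypothetical hook named
utf8_hook that opens files with io.open and a fixed UTF-8 encoding; the
temporary file and its contents are likewise only illustrative.

```python
import fileinput
import io
import os
import tempfile


def utf8_hook(filename, mode):
    """Hypothetical opening hook: always decode the file as UTF-8."""
    return io.open(filename, mode, encoding='utf-8')


fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with io.open(path, 'w', encoding='utf-8') as f:
        f.write(u'caf\u00e9\n')

    try:
        # Each file is opened by calling utf8_hook(filename, mode).
        lines = list(fileinput.input(path, openhook=utf8_hook))
    finally:
        fileinput.close()
finally:
    os.remove(path)
```

As with hook_encoded, the lines returned this way are Unicode strings.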
==============================================================================
*py2stdlib-fl*
fl~
:platform: IRIX
:synopsis: FORMS library for applications with graphical user interfaces.
:deprecated:
2.6~
The fl (|py2stdlib-fl|) module has been deprecated for removal in Python 3.0.
.. index::
single: FORMS Library
single: Overmars, Mark
This module provides an interface to the FORMS Library by Mark Overmars. The
source for the library can be retrieved by anonymous ftp from host
``ftp.cs.ruu.nl``, directory SGI/FORMS. It was last tested with version
2.0b.
Most functions are literal translations of their C equivalents, dropping the
initial ``fl_`` from their name. Constants used by the library are defined in
module FL (|py2stdlib-fl^|) described below.
The creation of objects is a little different in Python than in C: instead of
the 'current form' maintained by the library to which new FORMS objects are
added, all functions that add a FORMS object to a form are methods of the Python
object representing the form. Consequently, there are no Python equivalents for
the C functions fl_addto_form and fl_end_form, and the
equivalent of fl_bgn_form is called fl.make_form.
Watch out for the somewhat confusing terminology: FORMS uses the word
object for the buttons, sliders etc. that you can place in a form. In
Python, 'object' means any value. The Python interface to FORMS introduces two
new Python object types: form objects (representing an entire form) and FORMS
objects (representing one button, slider etc.). Hopefully this isn't too
confusing.
There are no 'free objects' in the Python interface to FORMS, nor is there an
easy way to add object classes written in Python. The FORMS interface to GL
event handling is available, though, so you can mix FORMS with pure GL windows.
{Please note:} importing fl (|py2stdlib-fl|) implies a call to the GL function
foreground and to the FORMS routine fl_init.
Functions Defined in Module fl (|py2stdlib-fl|)
-------------------------------------
Module fl (|py2stdlib-fl|) defines the following functions. For more information about
what they do, see the description of the equivalent C function in the FORMS
documentation:
make_form(type, width, height)~
Create a form with given type, width and height. This returns a form
object, whose methods are described below.
do_forms()~
The standard FORMS main loop. Returns a Python object representing the FORMS
object needing interaction, or the special value FL.EVENT.
check_forms()~
Check for FORMS events. Returns what do_forms above returns, or
``None`` if there is no event that immediately needs interaction.
set_event_call_back(function)~
Set the event callback function.
set_graphics_mode(rgbmode, doublebuffering)~
Set the graphics modes.
get_rgbmode()~
Return the current rgb mode. This is the value of the C global variable
fl_rgbmode.
show_message(str1, str2, str3)~
Show a dialog box with a three-line message and an OK button.
show_question(str1, str2, str3)~
Show a dialog box with a three-line message and YES and NO buttons. It returns
``1`` if the user pressed YES, ``0`` if NO.
show_choice(str1, str2, str3, but1[, but2[, but3]])~
Show a dialog box with a three-line message and up to three buttons. It returns
the number of the button clicked by the user (``1``, ``2`` or ``3``).
show_input(prompt, default)~
Show a dialog box with a one-line prompt message and text field in which the
user can enter a string. The second argument is the default input string. It
returns the string value as edited by the user.
show_file_selector(message, directory, pattern, default)~
Show a dialog box in which the user can select a file. It returns the absolute
filename selected by the user, or ``None`` if the user presses Cancel.
get_directory()~
get_pattern()
get_filename()
These functions return the directory, pattern and filename (the tail part only)
selected by the user in the last show_file_selector call.
qdevice(dev)~
unqdevice(dev)
isqueued(dev)
qtest()
qread()
qreset()
qenter(dev, val)
get_mouse()
tie(button, valuator1, valuator2)
These functions are the FORMS interfaces to the corresponding GL functions. Use
these if you want to handle some GL events yourself when using
fl.do_forms. When a GL event is detected that FORMS cannot handle,
fl.do_forms returns the special value FL.EVENT and you should
call fl.qread to read the event from the queue. Don't use the
equivalent GL functions!
color()~
mapcolor()
getmcolor()
See the description in the FORMS documentation of fl_color,
fl_mapcolor and fl_getmcolor.
Form Objects
------------
Form objects (returned by make_form above) have the following methods.
Each method corresponds to a C function whose name is prefixed with ``fl_`` and
whose first argument is a form pointer; please refer to the official FORMS
documentation for descriptions.
All the add_* methods return a Python object representing the FORMS
object. Methods of FORMS objects are described below. Most kinds of FORMS
object also have some methods specific to that kind; these methods are listed
here.
form.show_form(placement, bordertype, name)~
Show the form.
form.hide_form()~
Hide the form.
form.redraw_form()~
Redraw the form.
form.set_form_position(x, y)~
Set the form's position.
form.freeze_form()~
Freeze the form.
form.unfreeze_form()~
Unfreeze the form.
form.activate_form()~
Activate the form.
form.deactivate_form()~
Deactivate the form.
form.bgn_group()~
Begin a new group of objects; return a group object.
form.end_group()~
End the current group of objects.
form.find_first()~
Find the first object in the form.
form.find_last()~
Find the last object in the form.
form.add_box(type, x, y, w, h, name)~
Add a box object to the form. No extra methods.
form.add_text(type, x, y, w, h, name)~
Add a text object to the form. No extra methods.
form.add_clock(type, x, y, w, h, name)~
Add a clock object to the form. --- Method: get_clock.
form.add_button(type, x, y, w, h, name)~
Add a button object to the form. --- Methods: get_button,
set_button.
form.add_lightbutton(type, x, y, w, h, name)~
Add a lightbutton object to the form. --- Methods: get_button,
set_button.
form.add_roundbutton(type, x, y, w, h, name)~
Add a roundbutton object to the form. --- Methods: get_button,
set_button.
form.add_slider(type, x, y, w, h, name)~
Add a slider object to the form. --- Methods: set_slider_value,
get_slider_value, set_slider_bounds, get_slider_bounds,
set_slider_return, set_slider_size,
set_slider_precision, set_slider_step.
form.add_valslider(type, x, y, w, h, name)~
Add a valslider object to the form. --- Methods: set_slider_value,
get_slider_value, set_slider_bounds, get_slider_bounds,
set_slider_return, set_slider_size,
set_slider_precision, set_slider_step.
form.add_dial(type, x, y, w, h, name)~
Add a dial object to the form. --- Methods: set_dial_value,
get_dial_value, set_dial_bounds, get_dial_bounds.
form.add_positioner(type, x, y, w, h, name)~
Add a positioner object to the form. --- Methods:
set_positioner_xvalue, set_positioner_yvalue,
set_positioner_xbounds, set_positioner_ybounds,
get_positioner_xvalue, get_positioner_yvalue,
get_positioner_xbounds, get_positioner_ybounds.
form.add_counter(type, x, y, w, h, name)~
Add a counter object to the form. --- Methods: set_counter_value,
get_counter_value, set_counter_bounds, set_counter_step,
set_counter_precision, set_counter_return.
form.add_input(type, x, y, w, h, name)~
Add an input object to the form. --- Methods: set_input,
get_input, set_input_color, set_input_return.
form.add_menu(type, x, y, w, h, name)~
Add a menu object to the form. --- Methods: set_menu,
get_menu, addto_menu.
form.add_choice(type, x, y, w, h, name)~
Add a choice object to the form. --- Methods: set_choice,
get_choice, clear_choice, addto_choice,
replace_choice, delete_choice, get_choice_text,
set_choice_fontsize, set_choice_fontstyle.
form.add_browser(type, x, y, w, h, name)~
Add a browser object to the form. --- Methods: set_browser_topline,
clear_browser, add_browser_line, addto_browser,
insert_browser_line, delete_browser_line,
replace_browser_line, get_browser_line, load_browser,
get_browser_maxline, select_browser_line,
deselect_browser_line, deselect_browser,
isselected_browser_line, get_browser,
set_browser_fontsize, set_browser_fontstyle,
set_browser_specialkey.
form.add_timer(type, x, y, w, h, name)~
Add a timer object to the form. --- Methods: set_timer,
get_timer.
Form objects have the following data attributes; see the FORMS documentation:
+---------------------+-----------------+--------------------------------+
| Name | C Type | Meaning |
+=====================+=================+================================+
| window | int (read-only) | GL window id |
+---------------------+-----------------+--------------------------------+
| w | float | form width |
+---------------------+-----------------+--------------------------------+
| h | float | form height |
+---------------------+-----------------+--------------------------------+
| x | float | form x origin |
+---------------------+-----------------+--------------------------------+
| y | float | form y origin |
+---------------------+-----------------+--------------------------------+
| deactivated | int | nonzero if form is deactivated |
+---------------------+-----------------+--------------------------------+
| visible | int | nonzero if form is visible |
+---------------------+-----------------+--------------------------------+
| frozen | int | nonzero if form is frozen |
+---------------------+-----------------+--------------------------------+
| doublebuf | int | nonzero if double buffering on |
+---------------------+-----------------+--------------------------------+
FORMS Objects
-------------
Besides methods specific to particular kinds of FORMS objects, all FORMS objects
also have the following methods:
FORMS object.set_call_back(function, argument)~
Set the object's callback function and argument. When the object needs
interaction, the callback function will be called with two arguments: the
object, and the callback argument. (FORMS objects without a callback function
are returned by fl.do_forms or fl.check_forms when they need
interaction.) Call this method without arguments to remove the callback
function.
FORMS object.delete_object()~
Delete the object.
FORMS object.show_object()~
Show the object.
FORMS object.hide_object()~
Hide the object.
FORMS object.redraw_object()~
Redraw the object.
FORMS object.freeze_object()~
Freeze the object.
FORMS object.unfreeze_object()~
Unfreeze the object.
FORMS objects have these data attributes; see the FORMS documentation:
+--------------------+-----------------+------------------+
| Name | C Type | Meaning |
+====================+=================+==================+
| objclass | int (read-only) | object class |
+--------------------+-----------------+------------------+
| type | int (read-only) | object type |
+--------------------+-----------------+------------------+
| boxtype | int | box type |
+--------------------+-----------------+------------------+
| x | float | x origin |
+--------------------+-----------------+------------------+
| y | float | y origin |
+--------------------+-----------------+------------------+
| w | float | width |
+--------------------+-----------------+------------------+
| h | float | height |
+--------------------+-----------------+------------------+
| col1 | int | primary color |
+--------------------+-----------------+------------------+
| col2 | int | secondary color |
+--------------------+-----------------+------------------+
| align | int | alignment |
+--------------------+-----------------+------------------+
| lcol | int | label color |
+--------------------+-----------------+------------------+
| lsize | float | label font size |
+--------------------+-----------------+------------------+
| label | string | label string |
+--------------------+-----------------+------------------+
| lstyle | int | label style |
+--------------------+-----------------+------------------+
| pushed | int (read-only) | (see FORMS docs) |
+--------------------+-----------------+------------------+
| focus | int (read-only) | (see FORMS docs) |
+--------------------+-----------------+------------------+
| belowmouse | int (read-only) | (see FORMS docs) |
+--------------------+-----------------+------------------+
| frozen | int (read-only) | (see FORMS docs) |
+--------------------+-----------------+------------------+
| active | int (read-only) | (see FORMS docs) |
+--------------------+-----------------+------------------+
| input | int (read-only) | (see FORMS docs) |
+--------------------+-----------------+------------------+
| visible | int (read-only) | (see FORMS docs) |
+--------------------+-----------------+------------------+
| radio | int (read-only) | (see FORMS docs) |
+--------------------+-----------------+------------------+
| automatic | int (read-only) | (see FORMS docs) |
+--------------------+-----------------+------------------+
FL (|py2stdlib-fl^|) --- Constants used with the fl (|py2stdlib-fl|) module
======================================================
==============================================================================
*py2stdlib-fl^*
FL~
:platform: IRIX
:synopsis: Constants used with the fl module.
:deprecated:
2.6~
The FL (|py2stdlib-fl^|) module has been deprecated for removal in Python 3.0.
This module defines symbolic constants needed to use the built-in module
fl (|py2stdlib-fl|) (see above); they are equivalent to those defined in the C header file
``<forms.h>`` except that the name prefix ``FL_`` is omitted. Read the module
source for a complete list of the defined names. Suggested use:: >
import fl
from FL import *
<
flp (|py2stdlib-flp|) --- Functions for loading stored FORMS designs
==============================================================================
*py2stdlib-flp*
flp~
:platform: IRIX
:synopsis: Functions for loading stored FORMS designs.
:deprecated:
2.6~
The flp (|py2stdlib-flp|) module has been deprecated for removal in Python 3.0.
This module defines functions that can read form definitions created by the
'form designer' (fdesign) program that comes with the FORMS library
(see module fl (|py2stdlib-fl|) above).
For now, see the file flp.doc in the Python library source directory for
a description.
XXX A complete description should be inserted here!
==============================================================================
*py2stdlib-fm*
fm~
:platform: IRIX
:synopsis: Font Manager interface for SGI workstations.
:deprecated:
2.6~
The fm (|py2stdlib-fm|) module has been deprecated for removal in Python 3.0.
.. index::
single: Font Manager, IRIS
single: IRIS Font Manager
This module provides access to the IRIS {Font Manager} library. It is
available only on Silicon Graphics machines. See also: {4Sight User's Guide},
section 1, chapter 5: "Using the IRIS Font Manager."
This is not yet a full interface to the IRIS Font Manager. Among the unsupported
features are: matrix operations; cache operations; character operations (use
string operations instead); some details of font info; individual glyph metrics;
and printer matching.
It supports the following operations:
init()~
Initialization function. Calls fminit. It is normally not necessary to
call this function, since it is called automatically the first time the
fm (|py2stdlib-fm|) module is imported.
findfont(fontname)~
Return a font handle object. Calls ``fmfindfont(fontname)``.
enumerate()~
Returns a list of available font names. This is an interface to
fmenumerate.
prstr(string)~
Render a string using the current font (see the setfont font handle
method below). Calls ``fmprstr(string)``.
setpath(string)~
Sets the font search path. Calls ``fmsetpath(string)``. (XXX Does not work!?!)
fontpath()~
Returns the current font search path.
Font handle objects support the following operations:
font handle.scalefont(factor)~
Returns a handle for a scaled version of this font. Calls ``fmscalefont(fh,
factor)``.
font handle.setfont()~
Makes this font the current font. Note: the effect is undone silently when the
font handle object is deleted. Calls ``fmsetfont(fh)``.
font handle.getfontname()~
Returns this font's name. Calls ``fmgetfontname(fh)``.
font handle.getcomment()~
Returns the comment string associated with this font. Raises an exception if
there is none. Calls ``fmgetcomment(fh)``.
font handle.getfontinfo()~
Returns a tuple giving some pertinent data about this font. This is an interface
to ``fmgetfontinfo()``. The returned tuple contains the following numbers:
``(printermatched, fixed_width, xorig, yorig, xsize, ysize, height, nglyphs)``.
font handle.getstrwidth(string)~
Returns the width, in pixels, of {string} when drawn in this font. Calls
``fmgetstrwidth(fh, string)``.
==============================================================================
*py2stdlib-fnmatch*
fnmatch~
:synopsis: Unix shell style filename pattern matching.
.. index:: single: filenames; wildcard expansion
.. index:: module: re
This module provides support for Unix shell-style wildcards, which are {not} the
same as regular expressions (which are documented in the re (|py2stdlib-re|) module). The
special characters used in shell-style wildcards are:
+------------+------------------------------------+
| Pattern | Meaning |
+============+====================================+
| ``*`` | matches everything |
+------------+------------------------------------+
| ``?`` | matches any single character |
+------------+------------------------------------+
| ``[seq]`` | matches any character in {seq} |
+------------+------------------------------------+
| ``[!seq]`` | matches any character not in {seq} |
+------------+------------------------------------+
.. index:: module: glob
Note that the filename separator (``'/'`` on Unix) is {not} special to this
module. See module glob (|py2stdlib-glob|) for pathname expansion (glob (|py2stdlib-glob|) uses
fnmatch (|py2stdlib-fnmatch|) to match pathname segments). Similarly, filenames starting with
a period are not special for this module, and are matched by the ``*`` and ``?``
patterns.
fnmatch(filename, pattern)~
Test whether the {filename} string matches the {pattern} string, returning
True or False. If the operating system is case-insensitive,
then both parameters will be normalized to all lower- or upper-case before
the comparison is performed. fnmatchcase can be used to perform a
case-sensitive comparison, regardless of whether that's standard for the
operating system.
This example will print all file names in the current directory with the
extension ``.txt``:: >
import fnmatch
import os
for file in os.listdir('.'):
    if fnmatch.fnmatch(file, '*.txt'):
        print file
<
fnmatchcase(filename, pattern)~
Test whether {filename} matches {pattern}, returning True or
False; the comparison is case-sensitive.
filter(names, pattern)~
Return the subset of the list of {names} that match {pattern}. It is the same as
``[n for n in names if fnmatch(n, pattern)]``, but implemented more efficiently.
.. versionadded:: 2.2
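The pattern rules and functions described above can be tried out with a few
made-up names. This sketch uses fnmatchcase so that the results do not depend
on the case sensitivity of the operating system; the name lists are purely
illustrative.

```python
import fnmatch

names = ['notes.txt', 'README.TXT', '.hidden.txt', 'src/main.txt',
         'a1.py', 'ab.py']

# '*' also matches a leading period and the '/' separator, since neither
# is special to this module; the comparison is case-sensitive.
txt_names = [n for n in names if fnmatch.fnmatchcase(n, '*.txt')]

# '[seq]' matches one character from seq, '[!seq]' one character not in it.
in_seq = [n for n in names if fnmatch.fnmatchcase(n, 'a[12].py')]
not_in_seq = [n for n in names if fnmatch.fnmatchcase(n, 'a[!12].py')]

# '?' matches exactly one character.
single = fnmatch.fnmatchcase('ab.py', 'a?.py')
too_long = fnmatch.fnmatchcase('abc.py', 'a?.py')

# filter() selects the matching names from a list in one call.
filtered = fnmatch.filter(['x.txt', 'y.py', 'z.txt'], '*.txt')
```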
translate(pattern)~
Return the shell-style {pattern} converted to a regular expression.
Example:: >
>>> import fnmatch, re
>>>
>>> regex = fnmatch.translate('*.txt')
>>> regex
'.*\\.txt$'
>>> reobj = re.compile(regex)
>>> reobj.match('foobar.txt')
<_sre.SRE_Match object at 0x...>
<
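One possible use of translate is to compile a pattern once and reuse it for
many names. The sketch below assumes an invented list of names; the exact
regular expression text produced by translate varies between Python versions,
so the code relies only on its matching behaviour.

```python
import fnmatch
import re

# Compile the translated pattern once, then reuse it.
txt_pattern = re.compile(fnmatch.translate('*.txt'))

candidates = ['foobar.txt', 'foobar.txt.bak', 'notes.TXT']

# The translated expression is anchored at the end of the name and is
# case-sensitive, so only the first candidate matches.
matches = [name for name in candidates if txt_pattern.match(name)]
```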
.. seealso::
Module glob (|py2stdlib-glob|)
Unix shell-style path expansion.
==============================================================================
*py2stdlib-formatter*
formatter~
:synopsis: Generic output formatter and device interface.
.. index:: single: HTMLParser (class in htmllib)
This module supports two interface definitions, each with multiple
implementations. The {formatter} interface is used by the HTMLParser (|py2stdlib-htmlparser|)
class of the htmllib (|py2stdlib-htmllib|) module, and the {writer} interface is required by
the formatter interface.
Formatter objects transform an abstract flow of formatting events into specific
output events on writer objects. Formatters manage several stack structures to
allow various properties of a writer object to be changed and restored; writers
need not be able to handle relative changes nor any sort of "change back"
operation. Specific writer properties which may be controlled via formatter
objects are horizontal alignment, font, and left margin indentations. A
mechanism is provided which supports providing arbitrary, non-exclusive style
settings to a writer as well. Additional interfaces facilitate formatting
events which are not reversible, such as paragraph separation.
Writer objects encapsulate device interfaces. Abstract devices, such as file
formats, are supported as well as physical devices. The provided
implementations all work with abstract devices. The interface makes available
mechanisms for setting the properties which formatter objects manage and
inserting data into the output.
The Formatter Interface
-----------------------
Interfaces to create formatters are dependent on the specific formatter class
being instantiated. The interfaces described below are the required interfaces
which all formatters must support once initialized.
One data element is defined at the module level:
AS_IS~
Value which can be used in the font specification passed to the ``push_font()``
method described below, or as the new value to any other ``push_property()``
method. Pushing the ``AS_IS`` value allows the corresponding ``pop_property()``
method to be called without having to track whether the property was changed.
The following attributes are defined for formatter instance objects:
formatter.writer~
The writer instance with which the formatter interacts.
formatter.end_paragraph(blanklines)~
Close any open paragraphs and insert at least {blanklines} before the next
paragraph.
formatter.add_line_break()~
Add a hard line break if one does not already exist. This does not break the
logical paragraph.
formatter.add_hor_rule(*args, **kw)~
Insert a horizontal rule in the output. A hard break is inserted if there is
data in the current paragraph, but the logical paragraph is not broken. The
arguments and keywords are passed on to the writer's send_hor_rule
method.
formatter.add_flowing_data(data)~
Provide data which should be formatted with collapsed whitespace. Whitespace
from preceding and successive calls to add_flowing_data is considered as
well when the whitespace collapse is performed. The data which is passed to
this method is expected to be word-wrapped by the output device. Note that any
word-wrapping still must be performed by the writer object due to the need to
rely on device and font information.
formatter.add_literal_data(data)~
Provide data which should be passed to the writer unchanged. Whitespace,
including newline and tab characters, is considered legal in the value of
{data}.
formatter.add_label_data(format, counter)~
Insert a label which should be placed to the left of the current left margin.
This should be used for constructing bulleted or numbered lists. If the
{format} value is a string, it is interpreted as a format specification for
{counter}, which should be an integer. The result of this formatting becomes the
value of the label; if {format} is not a string it is used as the label value
directly. The label value is passed as the only argument to the writer's
send_label_data method. Interpretation of non-string label values is
dependent on the associated writer.
Format specifications are strings which, in combination with a counter value,
are used to compute label values. Each character in the format string is copied
to the label value, with some characters recognized to indicate a transform on
the counter value. Specifically, the character ``'1'`` represents the counter
value formatted as an Arabic number, the characters ``'A'`` and ``'a'``
represent alphabetic representations of the counter value in upper and lower
case, respectively, and ``'I'`` and ``'i'`` represent the counter value in Roman
numerals, in upper and lower case. Note that the alphabetic and Roman
transforms require that the counter value be greater than zero.
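The format specification rules can be sketched in plain Python. The function names below (``format_label``, ``_letters``, ``_roman``) are hypothetical and written only for illustration; this is one possible reading of the description above, not the module's own code, and the choice to emit nothing for non-positive counters in the alphabetic and Roman transforms is an assumption:

```python
def _letters(first, counter):
    # Alphabetic numbering: 1 -> a, 26 -> z, 27 -> aa, ...
    label = ''
    while counter > 0:
        counter, offset = divmod(counter - 1, 26)
        label = chr(ord(first) + offset) + label
    return label

def _roman(counter):
    # Lowercase Roman numerals built from the usual subtractive pairs.
    table = [(1000, 'm'), (900, 'cm'), (500, 'd'), (400, 'cd'),
             (100, 'c'), (90, 'xc'), (50, 'l'), (40, 'xl'),
             (10, 'x'), (9, 'ix'), (5, 'v'), (4, 'iv'), (1, 'i')]
    label = ''
    for value, numeral in table:
        while counter >= value:
            label += numeral
            counter -= value
    return label

def format_label(fmt, counter):
    # Non-string formats are used as the label value directly.
    if not isinstance(fmt, str):
        return fmt
    label = ''
    for char in fmt:
        if char == '1':
            label += '%d' % counter
        elif char in 'aA':
            if counter > 0:          # assumption: skip for counter <= 0
                label += _letters(char, counter)
        elif char in 'iI':
            if counter > 0:          # assumption: skip for counter <= 0
                roman = _roman(counter)
                label += roman.upper() if char == 'I' else roman
        else:
            label += char            # every other character is copied
    return label
```

With this sketch, ``format_label('1.', 3)`` gives ``'3.'``, ``format_label('(a)', 2)`` gives ``'(b)'`` and ``format_label('I.', 4)`` gives ``'IV.'``.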
formatter.flush_softspace()~
Send any pending whitespace buffered from a previous call to
add_flowing_data to the associated writer object. This should be called
before any direct manipulation of the writer object.
formatter.push_alignment(align)~
Push a new alignment setting onto the alignment stack. This may be
AS_IS if no change is desired. If the alignment value is changed from
the previous setting, the writer's new_alignment method is called with
the {align} value.
formatter.pop_alignment()~
Restore the previous alignment.
formatter.push_font((size, italic, bold, teletype))~
Change some or all font properties of the writer object. Properties which are
not set to AS_IS are set to the values passed in while others are
maintained at their current settings. The writer's new_font method is
called with the fully resolved font specification.
formatter.pop_font()~
Restore the previous font.
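The way AS_IS entries are resolved by push_font and undone by pop_font can be modelled with a small stack. This is a simplified sketch under stated assumptions: ``FontStack`` and ``resolve_font`` are hypothetical names, the sentinel is a local stand-in for the module's AS_IS, and no writer is involved:

```python
# Stand-in sentinel for this sketch only; real code would use formatter.AS_IS.
AS_IS = object()

def resolve_font(current, requested):
    # Keep the current value wherever the request says AS_IS.
    return tuple(cur if req is AS_IS else req
                 for cur, req in zip(current, requested))

class FontStack(object):
    """Hypothetical model of the font stack kept by a formatter."""

    def __init__(self, default=(None, False, False, False)):
        # Tuples have the form (size, italic, bold, teletype).
        self._stack = [default]

    def push(self, font):
        resolved = resolve_font(self._stack[-1], font)
        self._stack.append(resolved)
        return resolved      # the fully resolved specification

    def pop(self):
        if len(self._stack) > 1:
            self._stack.pop()
        return self._stack[-1]

fonts = FontStack()
heading = fonts.push(('h1', AS_IS, True, AS_IS))
emphasis = fonts.push((AS_IS, True, AS_IS, AS_IS))
restored = fonts.pop()
```

After these calls ``emphasis`` is ``('h1', True, True, False)`` and ``restored`` equals ``heading``, which is the value a writer would be asked to go back to.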
formatter.push_margin(margin)~
Increase the number of left margin indentations by one, associating the logical
tag {margin} with the new indentation. The initial margin level is ``0``.
Changed values of the logical tag must be true values; false values other than
AS_IS are not sufficient to change the margin.
formatter.pop_margin()~
Restore the previous margin.
formatter.push_style(*styles)~
Push any number of arbitrary style specifications. All styles are pushed onto
the styles stack in order. A tuple representing the entire stack, including
AS_IS values, is passed to the writer's new_styles method.
formatter.pop_style([n=1])~
Pop the last {n} style specifications passed to push_style. A tuple
representing the revised stack, including AS_IS values, is passed to
the writer's new_styles method.
formatter.set_spacing(spacing)~
Set the spacing style for the writer.
formatter.assert_line_data([flag=1])~
Inform the formatter that data has been added to the current paragraph
out-of-band. This should be used when the writer has been manipulated
directly. The optional {flag} argument can be set to false if the writer
manipulations produced a hard line break at the end of the output.
Formatter Implementations
-------------------------
Two implementations of formatter objects are provided by this module. Most
applications may use one of these classes without modification or subclassing.
NullFormatter([writer])~
A formatter which does nothing. If {writer} is omitted, a NullWriter
instance is created. No methods of the writer are called by
NullFormatter instances. Implementations should inherit from this
class if they implement a writer interface but do not need to inherit any
implementation.
AbstractFormatter(writer)~
The standard formatter. This implementation has demonstrated wide applicability
to many writers, and may be used directly in most circumstances. It has been
used to implement a full-featured World Wide Web browser.
The Writer Interface
--------------------
Interfaces to create writers are dependent on the specific writer class being
instantiated. The interfaces described below are the required interfaces which
all writers must support once initialized. Note that while most applications can
use the AbstractFormatter class as a formatter, the writer must
typically be provided by the application.
writer.flush()~
Flush any buffered output or device control events.
writer.new_alignment(align)~
Set the alignment style. The {align} value can be any object, but by convention
is a string or ``None``, where ``None`` indicates that the writer's "preferred"
alignment should be used. Conventional {align} values are ``'left'``,
``'center'``, ``'right'``, and ``'justify'``.
writer.new_font(font)~
Set the font style. The value of {font} will be ``None``, indicating that the
device's default font should be used, or a tuple of the form ``(size,
italic, bold, teletype)``. Size will be a string indicating the size of
font that should be used; specific strings and their interpretation must be
defined by the application. The {italic}, {bold}, and {teletype} values are
Boolean values specifying which of those font attributes should be used.
writer.new_margin(margin, level)~
Set the margin level to the integer {level} and the logical tag to {margin}.
Interpretation of the logical tag is at the writer's discretion; the only
restriction on the value of the logical tag is that it not be a false value for
non-zero values of {level}.
writer.new_spacing(spacing)~
Set the spacing style to {spacing}.
writer.new_styles(styles)~
Set additional styles. The {styles} value is a tuple of arbitrary values; the
value AS_IS should be ignored. The {styles} tuple may be interpreted
either as a set or as a stack depending on the requirements of the application
and writer implementation.
writer.send_line_break()~
Break the current line.
writer.send_paragraph(blankline)~
Produce a paragraph separation of at least {blankline} blank lines, or the
equivalent. The {blankline} value will be an integer. Note that the
implementation will receive a call to send_line_break before this call
if a line break is needed; this method should not include ending the last line
of the paragraph. It is only responsible for vertical spacing between
paragraphs.
writer.send_hor_rule(*args, **kw)~
Display a horizontal rule on the output device. The arguments to this method
are entirely application- and writer-specific, and should be interpreted with
care. The method implementation may assume that a line break has already been
issued via send_line_break.
writer.send_flowing_data(data)~
Output character data which may be word-wrapped and re-flowed as needed. Within
any sequence of calls to this method, the writer may assume that spans of
multiple whitespace characters have been collapsed to single space characters.
writer.send_literal_data(data)~
Output character data which has already been formatted for display. Generally,
this should be interpreted to mean that line breaks indicated by newline
characters should be preserved and no new line breaks should be introduced. The
data may contain embedded newline and tab characters, unlike data provided to
the send_flowing_data interface.
writer.send_label_data(data)~
Set {data} to the left of the current left margin, if possible. The value of
{data} is not restricted; treatment of non-string values is entirely
application- and writer-dependent. This method will only be called at the
beginning of a line.
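To make the division of labour concrete, here is a minimal sketch of a writer that collects text in memory. It implements only part of the interface described above, the class name ``RecordingWriter`` is hypothetical, and it deliberately does not import the formatter module (a real writer would normally derive from NullWriter); the AS_IS sentinel is a local stand-in:

```python
# Stand-in sentinel for this sketch only; real code would use formatter.AS_IS.
AS_IS = object()

class RecordingWriter(object):
    """Hypothetical writer that stores its output as a list of lines."""

    def __init__(self):
        self.lines = ['']        # the last entry is the line being built
        self.styles = ()

    def flush(self):
        pass                     # nothing is buffered outside self.lines

    def new_styles(self, styles):
        # AS_IS entries carry no information and are ignored.
        self.styles = tuple(s for s in styles if s is not AS_IS)

    def send_line_break(self):
        self.lines.append('')

    def send_paragraph(self, blankline):
        # Only vertical spacing; the previous line has already been ended.
        self.lines.extend([''] * blankline)

    def send_flowing_data(self, data):
        self.lines[-1] += data   # whitespace is already collapsed

    def send_literal_data(self, data):
        # Preserve embedded newlines exactly as given.
        parts = data.split('\n')
        self.lines[-1] += parts[0]
        self.lines.extend(parts[1:])

writer = RecordingWriter()
writer.send_flowing_data('Hello world')
writer.send_line_break()
writer.send_paragraph(1)
writer.send_literal_data('a\n  b')
writer.new_styles(('bold', AS_IS, 'em'))
text = '\n'.join(writer.lines)
```

The resulting ``text`` is ``'Hello world\n\na\n  b'``: one blank line separates the paragraphs and the literal data keeps its own line break and indentation.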
Writer Implementations
----------------------
Three implementations of the writer object interface are provided as examples by
this module. Most applications will need to derive new writer classes from the
NullWriter class.
NullWriter()~
A writer which only provides the interface definition; no actions are taken on
any methods. This should be the base class for all writers which do not need to
inherit any implementation methods.
AbstractWriter()~
A writer which can be used in debugging formatters, but not much else. Each
method simply announces itself by printing its name and arguments on standard
output.
DumbWriter([file[, maxcol=72]])~
Simple writer class which writes output on the file object passed in as {file}
or, if {file} is omitted, on standard output. The output is simply word-wrapped
to the number of columns specified by {maxcol}. This class is suitable for
reflowing a sequence of paragraphs.
==============================================================================
*py2stdlib-fpectl*
fpectl~
:platform: Unix
:synopsis: Provide control for floating point exception handling.
.. note::
The fpectl (|py2stdlib-fpectl|) module is not built by default, and its usage is discouraged
and may be dangerous except in the hands of experts. See also the section
"Limitations and other considerations" below for more details.
Most computers carry out floating point operations in conformance with the
so-called IEEE-754 standard. On any real computer, some floating point
operations produce results that cannot be expressed as a normal floating point
value. For example, try :: >
>>> import math
>>> math.exp(1000)
inf
>>> math.exp(1000) / math.exp(1000)
nan
<
(The example above will work on many platforms. DEC Alpha may be one exception.)
"Inf" is a special, non-numeric value in IEEE-754 that stands for "infinity",
and "nan" means "not a number." Note that, other than the non-numeric results,
nothing special happened when you asked Python to carry out those calculations.
That is in fact the default behaviour prescribed in the IEEE-754 standard, and
if it works for you, stop reading now.
In some circumstances, it would be better to raise an exception and stop
processing at the point where the faulty operation was attempted. The
fpectl (|py2stdlib-fpectl|) module is for use in that situation. It provides control over
floating point units from several hardware manufacturers, allowing the user to
turn on the generation of SIGFPE whenever any of the IEEE-754
exceptions Division by Zero, Overflow, or Invalid Operation occurs. In tandem
with a pair of wrapper macros that are inserted into the C code comprising your
Python system, SIGFPE is trapped and converted into the Python
FloatingPointError exception.
The fpectl (|py2stdlib-fpectl|) module defines the following functions and may raise the given
exception:
turnon_sigfpe()~
Turn on the generation of SIGFPE, and set up an appropriate signal
handler.
turnoff_sigfpe()~
Reset default handling of floating point exceptions.
FloatingPointError~
After turnon_sigfpe has been executed, a floating point operation that
raises one of the IEEE-754 exceptions Division by Zero, Overflow, or Invalid
operation will in turn raise this standard Python exception.
Example
-------
The following example demonstrates how to start up and test operation of the
fpectl (|py2stdlib-fpectl|) module. :: >
>>> import fpectl
>>> import fpetest
>>> fpectl.turnon_sigfpe()
>>> fpetest.test()
overflow PASS
FloatingPointError: Overflow
div by 0 PASS
FloatingPointError: Division by zero
[ more output from test elided ]
>>> import math
>>> math.exp(1000)
Traceback (most recent call last):
File "<stdin>", line 1, in ?
FloatingPointError: in math_1
<
Limitations and other considerations
------------------------------------
Setting up a given processor to trap IEEE-754 floating point errors currently
requires custom code on a per-architecture basis. You may have to modify
fpectl (|py2stdlib-fpectl|) to control your particular hardware.
Conversion of an IEEE-754 exception to a Python exception requires that the
wrapper macros ``PyFPE_START_PROTECT`` and ``PyFPE_END_PROTECT`` be inserted
into your code in an appropriate fashion. Python itself has been modified to
support the fpectl (|py2stdlib-fpectl|) module, but many other codes of interest to numerical
analysts have not.
The fpectl (|py2stdlib-fpectl|) module is not thread-safe.
.. seealso::
Some files in the source distribution may be interesting in learning more about
how this module operates. The include file Include/pyfpe.h discusses the
implementation of this module at some length. Modules/fpetestmodule.c
gives several examples of use. Many additional examples can be found in
Objects/floatobject.c.
==============================================================================
*py2stdlib-fpformat*
fpformat~
:synopsis: General floating point formatting functions.
:deprecated:
.. deprecated:: 2.6
The fpformat (|py2stdlib-fpformat|) module has been removed in Python 3.0.
The fpformat (|py2stdlib-fpformat|) module defines functions for dealing with floating point
number representations in 100% pure Python.
.. note::
This module is unnecessary: everything here can be done using the ``%`` string
interpolation operator described in the string-formatting section.
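As a rough illustration of the note above, the ``%`` operator covers both styles of formatting. This is a sketch only: the sample values are made up, and the exponent layout produced by ``%e`` may differ from the form shown for sci below:

```python
# Fixed-point notation with one digit after the point,
# comparable to what fix(1.23, 1) is documented to return.
fixed = '%.1f' % 1.23

# Scientific notation with three digits after the point.
scientific = '%.3e' % 12345.678
```

Here ``fixed`` is ``'1.2'`` and ``scientific`` is ``'1.235e+04'``.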
The fpformat (|py2stdlib-fpformat|) module defines the following functions and an exception:
fix(x, digs)~
Format {x} as ``[-]ddd.ddd`` with {digs} digits after the point and at least one
digit before. If ``digs <= 0``, the decimal point is suppressed.
{x} can be either a number or a string that looks like one. {digs} is an
integer.
Return value is a string.
sci(x, digs)~
Format {x} as ``[-]d.dddE[+-]ddd`` with {digs} digits after the point and
exactly one digit before. If ``digs <= 0``, one digit is kept and the point is
suppressed.
{x} can be either a real number, or a string that looks like one. {digs} is an
integer.
Return value is a string.
NotANumber~
Exception raised when a string passed to fix or sci as the {x}
parameter does not look like a number. This is a subclass of ValueError
when the standard exceptions are classes. The exception value is the improperly
formatted string that caused the exception to be raised.
Example:: >
>>> import fpformat
>>> fpformat.fix(1.23, 1)
'1.2'
<
==============================================================================
*py2stdlib-fractions*
fractions~
:synopsis: Rational numbers.
.. versionadded:: 2.6
The fractions (|py2stdlib-fractions|) module provides support for rational number arithmetic.
A Fraction instance can be constructed from a pair of integers, from
another rational number, or from a string.
Fraction(numerator=0, denominator=1)~
Fraction(other_fraction)
Fraction(float)
Fraction(decimal)
Fraction(string)
The first version requires that {numerator} and {denominator} are instances
of numbers.Rational and returns a new Fraction instance
with value ``numerator/denominator``. If {denominator} is 0, it
raises a ZeroDivisionError. The second version requires that
{other_fraction} is an instance of numbers.Rational and returns a
Fraction instance with the same value. The next two versions accept
either a float or a decimal.Decimal instance, and return a
Fraction instance with exactly the same value. Note that due to the
usual issues with binary floating-point (see tut-fp-issues), the
argument to ``Fraction(1.1)`` is not exactly equal to 11/10, and so
``Fraction(1.1)`` does {not} return ``Fraction(11, 10)`` as one might expect.
(But see the documentation for the limit_denominator method below.)
The last version of the constructor expects a string or unicode instance.
The usual form for this instance is:: >
[sign] numerator ['/' denominator]
<
where the optional ``sign`` may be either '+' or '-' and
``numerator`` and ``denominator`` (if present) are strings of
decimal digits. In addition, any string that represents a finite
value and is accepted by the float constructor is also
accepted by the Fraction constructor. In either form the
input string may also have leading and/or trailing whitespace.
Here are some examples:: >
>>> from fractions import Fraction
>>> Fraction(16, -10)
Fraction(-8, 5)
>>> Fraction(123)
Fraction(123, 1)
>>> Fraction()
Fraction(0, 1)
>>> Fraction('3/7')
Fraction(3, 7)
>>> Fraction(' -3/7 ')
Fraction(-3, 7)
>>> Fraction('1.414213 \t\n')
Fraction(1414213, 1000000)
>>> Fraction('-.125')
Fraction(-1, 8)
>>> Fraction('7e-6')
Fraction(7, 1000000)
>>> Fraction(2.25)
Fraction(9, 4)
>>> Fraction(1.1)
Fraction(2476979795053773, 2251799813685248)
>>> from decimal import Decimal
>>> Fraction(Decimal('1.1'))
Fraction(11, 10)
<
The Fraction class inherits from the abstract base class
numbers.Rational, and implements all of the methods and
operations from that class. Fraction instances are hashable,
and should be treated as immutable. In addition,
Fraction has the following methods:
.. versionchanged:: 2.7
The Fraction constructor now accepts float and
decimal.Decimal instances.
from_float(flt)~
This class method constructs a Fraction representing the exact
value of {flt}, which must be a float. Beware that
``Fraction.from_float(0.3)`` is not the same value as ``Fraction(3, 10)``.
.. note:: From Python 2.7 onwards, you can also construct a
Fraction instance directly from a float.
from_decimal(dec)~
This class method constructs a Fraction representing the exact
value of {dec}, which must be a decimal.Decimal.
.. note:: From Python 2.7 onwards, you can also construct a
Fraction instance directly from a decimal.Decimal
instance.
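The difference between the two class methods shows up with a value such as 0.3, which has no exact binary representation. The following is a short sketch; the variable names are chosen only for the example:

```python
from decimal import Decimal
from fractions import Fraction

# 0.5 is exactly representable as a binary float, 0.3 is not.
exact = Fraction.from_float(0.5)
inexact = Fraction.from_float(0.3)

# limit_denominator recovers the intended rational from the float.
recovered = inexact.limit_denominator(10)

# A Decimal stores the decimal digits exactly, so no error appears.
from_dec = Fraction.from_decimal(Decimal('0.3'))
```

``exact`` equals ``Fraction(1, 2)``, while ``inexact`` differs from ``Fraction(3, 10)``; both ``recovered`` and ``from_dec`` equal ``Fraction(3, 10)``.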
limit_denominator(max_denominator=1000000)~
Finds and returns the closest Fraction to ``self`` that has
denominator at most max_denominator. This method is useful for finding
rational approximations to a given floating-point number:
>>> from fractions import Fraction
>>> Fraction('3.1415926535897932').limit_denominator(1000)
Fraction(355, 113)
or for recovering a rational number that's represented as a float:
>>> from math import pi, cos
>>> Fraction(cos(pi/3))
Fraction(4503599627370497, 9007199254740992)
>>> Fraction(cos(pi/3)).limit_denominator()
Fraction(1, 2)
>>> Fraction(1.1).limit_denominator()
Fraction(11, 10)
gcd(a, b)~
Return the greatest common divisor of the integers {a} and {b}. If either
{a} or {b} is nonzero, then the absolute value of ``gcd(a, b)`` is the
largest integer that divides both {a} and {b}. ``gcd(a,b)`` has the same
sign as {b} if {b} is nonzero; otherwise it takes the sign of {a}. ``gcd(0,
0)`` returns ``0``.
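The sign rule above is what Euclid's algorithm yields when combined with Python's ``%`` operator, whose result takes the sign of the divisor. The function below is a sketch written for illustration under the hypothetical name ``euclid_gcd``; it is not claimed to be the module's implementation:

```python
def euclid_gcd(a, b):
    # Repeatedly replace (a, b) by (b, a mod b) until b is zero.
    while b:
        a, b = b, a % b
    return a

results = [
    euclid_gcd(12, 8),    # both positive
    euclid_gcd(12, -8),   # result takes the sign of b
    euclid_gcd(-12, 8),   # result takes the sign of b
    euclid_gcd(-5, 0),    # b is zero, so the sign of a is kept
    euclid_gcd(0, 0),
]
```

``results`` is ``[4, -4, 4, -5, 0]``, matching the behaviour described for gcd.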
.. seealso::
Module numbers (|py2stdlib-numbers|)
The abstract base classes making up the numeric tower.
==============================================================================
*py2stdlib-framework*
FrameWork~
:platform: Mac
:synopsis: Interactive application framework.
:deprecated:
The FrameWork (|py2stdlib-framework|) module contains classes that together provide a framework
for an interactive Macintosh application. The programmer builds an application
by creating subclasses that override various methods of the base classes,
thereby implementing the functionality wanted. Overriding functionality can
often be done at various levels; for example, to handle clicks in a single
dialog window in a non-standard way, it is not necessary to override the complete
event handling.
.. note::
This module has been removed in Python 3.x.
Work on the FrameWork (|py2stdlib-framework|) has pretty much stopped, now that PyObjC is
available for full Cocoa access from Python, and the documentation describes
only the most important functionality, and not in the most logical manner at
that. Examine the source or the examples for more details. The following are
some comments posted on the MacPython newsgroup about the strengths and
limitations of FrameWork (|py2stdlib-framework|):
.. epigraph::
The strong point of FrameWork (|py2stdlib-framework|) is that it allows you to break into the
control-flow at many different places. W (|py2stdlib-w|), for instance, uses a different
way to enable/disable menus and that plugs right in leaving the rest intact.
The weak points of FrameWork (|py2stdlib-framework|) are that it has no abstract command
interface (but that shouldn't be difficult), that its dialog support is minimal
and that its control/toolbar support is non-existent.
The FrameWork (|py2stdlib-framework|) module defines the following functions:
Application()~
An object representing the complete application. See below for a description of
the methods. The default __init__ routine creates an empty window
dictionary and a menu bar with an apple menu.
MenuBar()~
An object representing the menubar. This object is usually not created by the
user.
Menu(bar, title[, after])~
An object representing a menu. Upon creation you pass the ``MenuBar`` the menu
appears in, the {title} string and a position (1-based) {after} where the menu
should appear (default: at the end).
MenuItem(menu, title[, shortcut, callback])~
Create a menu item object. The arguments are the menu to create the item in, the item title
string and optionally the keyboard shortcut and a callback routine. The callback
is called with the arguments menu-id, item number within menu (1-based), current
front window and the event record.
Instead of a callable object the callback can also be a string. In this case
menu selection causes the lookup of a method in the topmost window and the
application. The method name is the callback string with ``'domenu_'``
prepended.
Calling the ``MenuBar`` fixmenudimstate method sets the correct dimming
for all menu items based on the current front window.
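The string form of the callback can be pictured as a name lookup. The sketch below is plain Python that runs without the FrameWork module; ``dispatch_menu_callback``, ``DemoWindow`` and ``DemoApp`` are hypothetical names, and trying the window before the application is an assumption about the lookup order:

```python
def dispatch_menu_callback(callback, window, application,
                           menu_id, item, event):
    # A callable is invoked directly with the documented arguments.
    if callable(callback):
        return callback(menu_id, item, window, event)
    # A string names a method: 'save' is looked up as 'domenu_save'.
    name = 'domenu_' + callback
    for target in (window, application):   # assumed order: window first
        handler = getattr(target, name, None)
        if handler is not None:
            return handler(menu_id, item, window, event)
    return None

class DemoWindow(object):
    def domenu_save(self, menu_id, item, window, event):
        return 'window saved'

class DemoApp(object):
    def domenu_save(self, menu_id, item, window, event):
        return 'app saved'

    def domenu_quit(self, menu_id, item, window, event):
        return 'app quit'

front, app = DemoWindow(), DemoApp()
saved = dispatch_menu_callback('save', front, app, 1, 1, None)
quit_result = dispatch_menu_callback('quit', front, app, 1, 2, None)
```

In this sketch ``saved`` is ``'window saved'`` because the front window defines the method, while ``quit_result`` is ``'app quit'`` because only the application does.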
Separator(menu)~
Add a separator to the end of a menu.
SubMenu(menu, label)~
Create a submenu named {label} under menu {menu}. The menu object is returned.
Window(parent)~
Creates a (modeless) window. {Parent} is the application object to which the
window belongs. The window is not displayed until later.
DialogWindow(parent)~
Creates a modeless dialog window.
windowbounds(width, height)~
Return a ``(left, top, right, bottom)`` tuple suitable for creation of a window
of given width and height. The window will be staggered with respect to previous
windows, and an attempt is made to keep the whole window on-screen. However, the
window will always be the exact size given, so parts may be offscreen.
setwatchcursor()~
Set the mouse cursor to a watch.
setarrowcursor()~
Set the mouse cursor to an arrow.
Application Objects
-------------------
Application objects have the following methods, among others:
Application.makeusermenus()~
Override this method if you need menus in your application. Append the menus to
the attribute menubar.
Application.getabouttext()~
Override this method to return a text string describing your application.
Alternatively, override the do_about method for more elaborate "about"
messages.
Application.mainloop([mask[, wait]])~
This routine is the main event loop; call it to set your application rolling.
{Mask} is the mask of events you want to handle, and {wait} is the number of ticks
you want to leave to other concurrent applications (default 0, which is probably
not a good idea). While raising {self} to exit the mainloop is still supported,
it is not recommended: call ``self._quit()`` instead.
The event loop is split into many small parts, each of which can be overridden.
The default methods take care of dispatching events to windows and dialogs,
handling drags and resizes, Apple Events, events for non-FrameWork windows, etc.
In general, all event handlers should return ``1`` if the event is fully handled
and ``0`` otherwise (because the front window was not a FrameWork window, for
instance). This is needed so that update events and such can be passed on to
other windows like the Sioux console window. Calling MacOS.HandleEvent
is not allowed within {our_dispatch} or its callees, since this may result in an
infinite loop if the code is called through the Python inner-loop event handler.
Application.asyncevents(onoff)~
Call this method with a nonzero parameter to enable asynchronous event handling.
This will tell the inner interpreter loop to call the application event handler
{async_dispatch} whenever events are available. This will cause FrameWork window
updates and the user interface to remain working during long computations, but
will slow the interpreter down and may cause surprising results in non-reentrant
code (such as FrameWork itself). By default {async_dispatch} will immediately
call {our_dispatch} but you may override this to handle only certain events
asynchronously. Events you do not handle will be passed to Sioux and such.
The old on/off value is returned.
Application._quit()~
Terminate the running mainloop call at the next convenient moment.
Application.do_char(c, event)~
The user typed character {c}. The complete details of the event can be found in
the {event} structure. This method can also be provided in a ``Window`` object,
which overrides the application-wide handler if the window is frontmost.
Application.do_dialogevent(event)~
Called early in the event loop to handle modeless dialog events. The default
method simply dispatches the event to the relevant dialog (not through the
``DialogWindow`` object involved). Override if you need special handling of
dialog events (keyboard shortcuts, etc).
Application.idle(event)~
Called by the main event loop when no events are available. The null-event is
passed (so you can look at mouse position, etc).
Window Objects
--------------
Window objects have the following methods, among others:
Window.open()~
Override this method to open a window. Store the Mac OS window-id in
self.wid and call the do_postopen method to register the window
with the parent application.
Window.close()~
Override this method to do any special processing on window close. Call the
do_postclose method to clean up the parent state.
Window.do_postresize(width, height, macoswindowid)~
Called after the window is resized. Override if more needs to be done than
calling ``InvalRect``.
Window.do_contentclick(local, modifiers, event)~
The user clicked in the content part of a window. The arguments are the
coordinates (window-relative), the key modifiers and the raw event.
Window.do_update(macoswindowid, event)~
An update event for the window was received. Redraw the window.
Window.do_activate(activate, event)~
The window was activated (``activate == 1``) or deactivated (``activate == 0``).
Handle things like focus highlighting, etc.
ControlsWindow Object
---------------------
ControlsWindow objects have the following methods besides those of ``Window``
objects:
ControlsWindow.do_controlhit(window, control, pcode, event)~
Part {pcode} of control {control} was hit by the user. Tracking and such has
already been taken care of.
ScrolledWindow Object
---------------------
ScrolledWindow objects are ControlsWindow objects with the following extra
methods:
ScrolledWindow.scrollbars([wantx[, wanty]])~
Create (or destroy) horizontal and vertical scrollbars. The arguments specify
which you want (default: both). The scrollbars always have minimum ``0`` and
maximum ``32767``.
ScrolledWindow.getscrollbarvalues()~
You must supply this method. It should return a tuple ``(x, y)`` giving the
current position of the scrollbars (between ``0`` and ``32767``). You can return
``None`` for either to indicate the whole document is visible in that direction.
ScrolledWindow.updatescrollbars()~
Call this method when the document has changed. It will call
getscrollbarvalues and update the scrollbars.
ScrolledWindow.scrollbar_callback(which, what, value)~
Supplied by you and called after user interaction. {which} will be ``'x'`` or
``'y'``, {what} will be ``'-'``, ``'--'``, ``'set'``, ``'++'`` or ``'+'``. For
``'set'``, {value} will contain the new scrollbar position.
ScrolledWindow.scalebarvalues(absmin, absmax, curmin, curmax)~
Auxiliary method to help you calculate values to return from
getscrollbarvalues. You pass document minimum and maximum value and
topmost (leftmost) and bottommost (rightmost) visible values and it returns the
correct number or ``None``.
ScrolledWindow.do_activate(onoff, event)~
Takes care of dimming/highlighting scrollbars when a window becomes frontmost.
If you override this method, call this one at the end of your method.
ScrolledWindow.do_postresize(width, height, window)~
Moves scrollbars to the correct position. Call this method initially if you
override it.
ScrolledWindow.do_controlhit(window, control, pcode, event)~
Handles scrollbar interaction. If you override it, call this method first; a
nonzero return value indicates the hit was in the scrollbars and has been
handled.
DialogWindow Objects
--------------------
DialogWindow objects have the following methods besides those of ``Window``
objects:
DialogWindow.open(resid)~
Create the dialog window, from the DLOG resource with id {resid}. The dialog
object is stored in self.wid.
DialogWindow.do_itemhit(item, event)~
Item number {item} was hit. You are responsible for redrawing toggle buttons,
etc.
==============================================================================
*py2stdlib-ftplib*
ftplib~
:synopsis: FTP protocol client (requires sockets).
This module defines the class FTP and a few related items. The
FTP class implements the client side of the FTP protocol. You can use
this to write Python programs that perform a variety of automated FTP jobs, such
as mirroring other ftp servers. It is also used by the module urllib (|py2stdlib-urllib|) to
handle URLs that use FTP. For more information on FTP (File Transfer Protocol),
see Internet RFC 959.
Here's a sample session using the ftplib (|py2stdlib-ftplib|) module:: >
>>> from ftplib import FTP
>>> ftp = FTP('ftp.cwi.nl') # connect to host, default port
>>> ftp.login() # user anonymous, passwd anonymous@
>>> ftp.retrlines('LIST') # list directory contents
total 24418
drwxrwsr-x 5 ftp-usr pdmaint 1536 Mar 20 09:48 .
dr-xr-srwt 105 ftp-usr pdmaint 1536 Mar 21 14:32 ..
-rw-r--r-- 1 ftp-usr pdmaint 5305 Mar 20 09:48 INDEX
.
.
.
>>> ftp.retrbinary('RETR README', open('README', 'wb').write)
'226 Transfer complete.'
>>> ftp.quit()
<
The module defines the following items:
FTP([host[, user[, passwd[, acct[, timeout]]]]])~
Return a new instance of the FTP class. When {host} is given, the
method call ``connect(host)`` is made. When {user} is given, additionally
the method call ``login(user, passwd, acct)`` is made (where {passwd} and
{acct} default to the empty string when not given). The optional {timeout}
parameter specifies a timeout in seconds for blocking operations like the
connection attempt (if it is not specified, the global default timeout setting
will be used).
.. versionchanged:: 2.6
{timeout} was added.
FTP_TLS([host[, user[, passwd[, acct[, keyfile[, certfile[, timeout]]]]]]])~
An FTP subclass which adds TLS support to FTP as described in
RFC 4217.
Connect as usual to port 21, implicitly securing the FTP control connection
before authenticating. Securing the data connection requires the user to
explicitly ask for it by calling the prot_p method.
{keyfile} and {certfile} are optional -- they can contain a PEM formatted
private key and certificate chain file name for the SSL connection.
.. versionadded:: 2.7
Here's a sample session using the FTP_TLS class:: >
>>> from ftplib import FTP_TLS
>>> ftps = FTP_TLS('ftp.python.org')
>>> ftps.login() # login anonymously before securing control channel
>>> ftps.prot_p() # switch to secure data connection
>>> ftps.retrlines('LIST') # list directory content securely
total 9
drwxr-xr-x 8 root wheel 1024 Jan 3 1994 .
drwxr-xr-x 8 root wheel 1024 Jan 3 1994 ..
drwxr-xr-x 2 root wheel 1024 Jan 3 1994 bin
drwxr-xr-x 2 root wheel 1024 Jan 3 1994 etc
d-wxrwxr-x 2 ftp wheel 1024 Sep 5 13:43 incoming
drwxr-xr-x 2 root wheel 1024 Nov 17 1993 lib
drwxr-xr-x 6 1094 wheel 1024 Sep 13 19:07 pub
drwxr-xr-x 3 root wheel 1024 Jan 3 1994 usr
-rw-r--r-- 1 root root 312 Aug 1 1994 welcome.msg
'226 Transfer complete.'
>>> ftps.quit()
<
error_reply~
Exception raised when an unexpected reply is received from the server.
error_temp~
Exception raised when an error code in the range 400--499 is received.
error_perm~
Exception raised when an error code in the range 500--599 is received.
error_proto~
Exception raised when a reply is received from the server that does not
begin with a digit in the range 1--5.
all_errors~
The set of all exceptions (as a tuple) that methods of FTP
instances may raise as a result of problems with the FTP connection (as
opposed to programming errors made by the caller). This set includes the
four exceptions listed above as well as socket.error and
IOError.
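As a minimal sketch of how these exceptions might be told apart, the helper
below runs a callable and classifies any failure. The names
describe_ftp_failure and simulated_delete are hypothetical and exist only for
this illustration; the failure is simulated, so no server is contacted.

```python
import ftplib

def describe_ftp_failure(action):
    """Run action() and report which class of FTP failure occurred, if any."""
    try:
        action()
    except ftplib.error_temp as exc:
        # 4xx reply: the same request may succeed if retried later.
        return 'temporary: %s' % exc
    except ftplib.error_perm as exc:
        # 5xx reply: retrying the same request is unlikely to help.
        return 'permanent: %s' % exc
    except ftplib.all_errors as exc:
        # Any other connection-related problem covered by the tuple.
        return 'other: %s' % exc
    return 'ok'

def simulated_delete():
    # Stand-in for a call such as ftp.delete('missing.txt') on a real server.
    raise ftplib.error_perm('550 missing.txt: No such file')

outcome = describe_ftp_failure(simulated_delete)
```

The more specific exceptions are listed before all_errors because the tuple
also contains their base class and would otherwise catch them first.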
.. seealso::
Module netrc (|py2stdlib-netrc|)
Parser for the .netrc file format. The file .netrc is
typically used by FTP clients to load user authentication information
before prompting the user.
The file Tools/scripts/ftpmirror.py in the Python source distribution is
a script that can mirror FTP sites, or portions thereof, using the ftplib (|py2stdlib-ftplib|)
module. It can be used as an extended example that applies this module.
FTP Objects
-----------
Several methods are available in two flavors: one for handling text files and
another for binary files. These are named for the command which is used
followed by ``lines`` for the text version or ``binary`` for the binary version.
FTP instances have the following methods:
FTP.set_debuglevel(level)~
Set the instance's debugging level. This controls the amount of debugging
output printed. The default, ``0``, produces no debugging output. A value of
``1`` produces a moderate amount of debugging output, generally a single line
per request. A value of ``2`` or higher produces the maximum amount of
debugging output, logging each line sent and received on the control connection.
FTP.connect(host[, port[, timeout]])~
Connect to the given host and port. The default port number is ``21``, as
specified by the FTP protocol specification. It is rarely needed to specify a
different port number. This function should be called only once for each
instance; it should not be called at all if a host was given when the instance
was created. All other methods can only be used after a connection has been
made.
The optional {timeout} parameter specifies a timeout in seconds for the
connection attempt. If no {timeout} is passed, the global default timeout
setting will be used.
.. versionchanged:: 2.6
{timeout} was added.
FTP.getwelcome()~
Return the welcome message sent by the server in reply to the initial
connection. (This message sometimes contains disclaimers or help information
that may be relevant to the user.)
FTP.login([user[, passwd[, acct]]])~
Log in as the given {user}. The {passwd} and {acct} parameters are optional and
default to the empty string. If no {user} is specified, it defaults to
``'anonymous'``. If {user} is ``'anonymous'``, the default {passwd} is
``'anonymous@'``. This function should be called only once for each instance,
after a connection has been established; it should not be called at all if a
host and user were given when the instance was created. Most FTP commands are
only allowed after the client has logged in. The {acct} parameter supplies
"accounting information"; few systems implement this.
FTP.abort()~
Abort a file transfer that is in progress. Using this does not always work, but
it's worth a try.
FTP.sendcmd(command)~
Send a simple command string to the server and return the response string.
FTP.voidcmd(command)~
Send a simple command string to the server and handle the response. Return
nothing if a response code in the range 200--299 is received. Raise an exception
otherwise.
FTP.retrbinary(command, callback[, maxblocksize[, rest]])~
Retrieve a file in binary transfer mode. {command} should be an appropriate
``RETR`` command: ``'RETR filename'``. The {callback} function is called for
each block of data received, with a single string argument giving the data
block. The optional {maxblocksize} argument specifies the maximum chunk size to
read on the low-level socket object created to do the actual transfer (which
will also be the largest size of the data blocks passed to {callback}). A
reasonable default is chosen. {rest} means the same thing as in the
transfercmd method.
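The callback can be any callable that accepts one data block. The sketch below
uses a hypothetical BlockCollector class (not part of ftplib) that keeps the
blocks and counts bytes; the blocks are fed in by hand here so the example runs
without a server.

```python
class BlockCollector(object):
    """Callable that accumulates the data blocks handed to it."""

    def __init__(self):
        self.blocks = []
        self.nbytes = 0

    def __call__(self, block):
        self.blocks.append(block)
        self.nbytes += len(block)

collector = BlockCollector()

# With a connected FTP instance the call would be:
#     ftp.retrbinary('RETR README', collector)
# Here two blocks are passed in manually to show what the callback receives.
for block in (b'first block, ', b'second block'):
    collector(block)

data = b''.join(collector.blocks)
```

Collecting blocks in a list and joining once at the end avoids repeated string
concatenation for large files.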
FTP.retrlines(command[, callback])~
Retrieve a file or directory listing in ASCII transfer mode. {command}
should be an appropriate ``RETR`` command (see retrbinary) or a
command such as ``LIST``, ``NLST`` or ``MLSD`` (usually just the string
``'LIST'``). The {callback} function is called for each line, with the
trailing CRLF stripped. The default {callback} prints the line to
``sys.stdout``.
FTP.set_pasv(boolean)~
Enable "passive" mode if {boolean} is true, otherwise disable passive mode. (In
Python 2.0 and before, passive mode was off by default; in Python 2.1 and later,
it is on by default.)
FTP.storbinary(command, file[, blocksize, callback, rest])~
Store a file in binary transfer mode. {command} should be an appropriate
``STOR`` command: ``"STOR filename"``. {file} is an open file object which is
read until EOF using its read method in blocks of size {blocksize} to
provide the data to be stored. The {blocksize} argument defaults to 8192.
{callback} is an optional single parameter callable that is called
on each block of data after it is sent. {rest} means the same thing as in
the transfercmd method.
.. versionchanged:: 2.1
default for {blocksize} added.
.. versionchanged:: 2.6
{callback} parameter added.
.. versionchanged:: 2.7
{rest} parameter added.
FTP.storlines(command, file[, callback])~
Store a file in ASCII transfer mode. {command} should be an appropriate
``STOR`` command (see storbinary). Lines are read until EOF from the
open file object {file} using its readline (|py2stdlib-readline|) method to provide the data to
be stored. {callback} is an optional single parameter callable
that is called on each line after it is sent.
.. versionchanged:: 2.6
{callback} parameter added.
FTP.transfercmd(cmd[, rest])~
Initiate a transfer over the data connection. If the transfer is active, send an
``EPRT`` or ``PORT`` command and the transfer command specified by {cmd}, and
accept the connection. If the server is passive, send an ``EPSV`` or ``PASV``
command, connect to it, and start the transfer command. Either way, return the
socket for the connection.
If optional {rest} is given, a ``REST`` command is sent to the server, passing
{rest} as an argument. {rest} is usually a byte offset into the requested file,
telling the server to restart sending the file's bytes at the requested offset,
skipping over the initial bytes. Note however that RFC 959 requires only that
{rest} be a string containing characters in the printable range from ASCII code
33 to ASCII code 126. The transfercmd method, therefore, converts
{rest} to a string, but no check is performed on the string's contents. If the
server does not recognize the ``REST`` command, an error_reply exception
will be raised. If this happens, simply call transfercmd without a
{rest} argument.
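One possible use of {rest} is resuming an interrupted download. The sketch
below is an illustration only: resume_download and RecordingFTP are
hypothetical names, and RecordingFTP is a stand-in that records its arguments
so the example runs without a server. With a real connection, an FTP instance
would be passed instead.

```python
import os
import tempfile

def resume_download(ftp, remote_name, local_path):
    """Fetch remote_name, continuing from any partial copy at local_path."""
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    with open(local_path, 'ab') as out:       # append to what is already there
        # rest=None sends no REST command; otherwise restart at the offset.
        return ftp.retrbinary('RETR ' + remote_name, out.write,
                              rest=offset or None)

class RecordingFTP(object):
    """Stand-in for a connected FTP instance; records what it was asked."""

    def retrbinary(self, command, callback, blocksize=8192, rest=None):
        self.command, self.rest = command, rest
        callback(b'tail')                     # pretend the server sent the rest
        return '226 Transfer complete.'

workdir = tempfile.mkdtemp()
local_copy = os.path.join(workdir, 'README')
with open(local_copy, 'wb') as f:
    f.write(b'head-')                         # pretend 5 bytes arrived earlier

fake = RecordingFTP()
reply = resume_download(fake, 'README', local_copy)
with open(local_copy, 'rb') as f:
    contents = f.read()
```

The offset is taken from the size of the local file, which assumes the partial
copy was written in binary mode and is a true prefix of the remote file.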
FTP.ntransfercmd(cmd[, rest])~
Like transfercmd, but returns a tuple of the data connection and the
expected size of the data. If the expected size could not be computed, ``None``
will be returned as the expected size. {cmd} and {rest} mean the same thing as
in transfercmd.
FTP.nlst(argument[, ...])~
Return a list of files as returned by the ``NLST`` command. The optional
{argument} is a directory to list (default is the current server directory).
Multiple arguments can be used to pass non-standard options to the ``NLST``
command.
FTP.dir(argument[, ...])~
Produce a directory listing as returned by the ``LIST`` command, printing it to
standard output. The optional {argument} is a directory to list (default is the
current server directory). Multiple arguments can be used to pass non-standard
options to the ``LIST`` command. If the last argument is a function, it is used
as a {callback} function as for retrlines; the default prints to
``sys.stdout``. This method returns ``None``.
FTP.rename(fromname, toname)~
Rename file {fromname} on the server to {toname}.
FTP.delete(filename)~
Remove the file named {filename} from the server. If successful, returns the
text of the response, otherwise raises error_perm on permission errors or
error_reply on other errors.
FTP.cwd(pathname)~
Set the current directory on the server.
FTP.mkd(pathname)~
Create a new directory on the server.
FTP.pwd()~
Return the pathname of the current directory on the server.
FTP.rmd(dirname)~
Remove the directory named {dirname} on the server.
FTP.size(filename)~
Request the size of the file named {filename} on the server. On success, the
size of the file is returned as an integer, otherwise ``None`` is returned.
Note that the ``SIZE`` command is not standardized, but is supported by many
common server implementations.
FTP.quit()~
Send a ``QUIT`` command to the server and close the connection. This is the
"polite" way to close a connection, but it may raise an exception if the server
responds with an error to the ``QUIT`` command. This implies a call to the
close method which renders the FTP instance useless for
subsequent calls (see below).
FTP.close()~
Close the connection unilaterally. This should not be applied to an already
closed connection such as after a successful call to quit. After this
call the FTP instance should not be used any more (after a call to
close or quit you cannot reopen the connection by issuing
another login method).
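A common pattern, sketched here under assumptions, is to try the polite quit
first and fall back to close only when quit fails. The names disconnect and
GrumpyServer are hypothetical; GrumpyServer is a stand-in whose quit always
fails, so the fallback path can be shown without a network connection.

```python
import ftplib

def disconnect(ftp):
    """Close politely with QUIT, falling back to a unilateral close()."""
    try:
        ftp.quit()
    except ftplib.all_errors:
        # The server answered QUIT with an error, or the link is already broken.
        ftp.close()

class GrumpyServer(object):
    """Stand-in whose quit() fails, to exercise the fallback path."""
    closed = False

    def quit(self):
        raise ftplib.error_temp('421 Service not available')

    def close(self):
        self.closed = True

conn = GrumpyServer()
disconnect(conn)
```

Because close is called only when quit raised, the sketch does not apply close
to a connection that quit has already closed.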
FTP_TLS Objects
---------------
The FTP_TLS class inherits from FTP and defines these additional attributes
and methods:
FTP_TLS.ssl_version~
The SSL version to use (defaults to {TLSv1}).
FTP_TLS.auth()~
Set up a secure control connection using TLS or SSL, depending on what is
specified in the ssl_version attribute.
FTP_TLS.prot_p()~
Set up secure data connection.
FTP_TLS.prot_c()~
Set up clear text data connection.
==============================================================================
*py2stdlib-functools*
functools~
:synopsis: Higher order functions and operations on callable objects.
.. versionadded:: 2.5
The functools (|py2stdlib-functools|) module is for higher-order functions: functions that act on
or return other functions. In general, any callable object can be treated as a
function for the purposes of this module.
The functools (|py2stdlib-functools|) module defines the following functions:
cmp_to_key(func)~
Transform an old-style comparison function to a key-function. Used with
tools that accept key functions (such as sorted, min,
max, heapq.nlargest, heapq.nsmallest,
itertools.groupby).
This function is primarily used as a transition tool for programs
being converted to Py3.x where comparison functions are no longer
supported.
A compare function is any callable that accepts two arguments, compares
them, and returns a negative number for less-than, zero for equality,
or a positive number for greater-than. A key function is a callable
that accepts one argument and returns another value that indicates
the position in the desired collation sequence.
Example:: >
sorted(iterable, key=cmp_to_key(locale.strcoll)) # locale-aware sort order
<
.. versionadded:: 2.7
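As a further illustration, the sketch below wraps a hand-written comparison
function. The function by_length_then_alpha and the sample words are
assumptions made for this example.

```python
from functools import cmp_to_key

def by_length_then_alpha(a, b):
    """Old-style comparison: shorter strings first, ties broken alphabetically."""
    if len(a) != len(b):
        return len(a) - len(b)
    return (a > b) - (a < b)      # -1, 0 or 1, like the classic cmp()

words = ['pear', 'fig', 'apple', 'kiwi']
ordered = sorted(words, key=cmp_to_key(by_length_then_alpha))
```

Here ordered is ``['fig', 'kiwi', 'pear', 'apple']``.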
total_ordering(cls)~
Given a class defining one or more rich comparison ordering methods, this
class decorator supplies the rest. This simplifies the effort involved
in specifying all of the possible rich comparison operations:
The class must define one of __lt__, __le__,
__gt__, or __ge__.
In addition, the class should supply an __eq__ method.
For example:: >
@total_ordering
class Student:
    def __eq__(self, other):
        return ((self.lastname.lower(), self.firstname.lower()) ==
                (other.lastname.lower(), other.firstname.lower()))
    def __lt__(self, other):
        return ((self.lastname.lower(), self.firstname.lower()) <
                (other.lastname.lower(), other.firstname.lower()))
<
.. versionadded:: 2.7
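To show the supplied methods in use, here is a small self-contained sketch.
The Version class is hypothetical; it defines only __eq__ and __lt__, and the
decorator fills in the remaining ordering operators.

```python
from functools import total_ordering

@total_ordering
class Version(object):
    """Hypothetical two-part version number ordered by (major, minor)."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        return self._key() == other._key()

    def __lt__(self, other):
        return self._key() < other._key()

older = Version(2, 6)
newer = Version(2, 7)

# __le__, __gt__ and __ge__ were not written by hand; the decorator added them.
checks = (older <= newer, newer > older, newer >= older)
```

All three entries of checks are true.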
reduce(function, iterable[, initializer])~
This is the same function as the built-in reduce. It is made available in this module
to allow writing code more forward-compatible with Python 3.
.. versionadded:: 2.6
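A brief sketch of importing reduce from this module rather than relying on the
built-in name; the sample data is an assumption for illustration.

```python
from functools import reduce
import operator

# Fold the list from the left, starting with the initializer 1:
# ((((1 * 1) * 2) * 3) * 4)
product = reduce(operator.mul, [1, 2, 3, 4], 1)

# With an empty iterable the initializer is returned unchanged.
empty_product = reduce(operator.mul, [], 1)
```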
partial(func[, *args][, **keywords])~
Return a new partial object which when called will behave like {func}
called with the positional arguments {args} and keyword arguments {keywords}. If
more arguments are supplied to the call, they are appended to {args}. If
additional keyword arguments are supplied, they extend and override {keywords}.
Roughly equivalent to:: >
def partial(func, *args, **keywords):
    def newfunc(*fargs, **fkeywords):
        newkeywords = keywords.copy()
        newkeywords.update(fkeywords)
        return func(*(args + fargs), **newkeywords)
    newfunc.func = func
    newfunc.args = args
    newfunc.keywords = keywords
    return newfunc
<
The partial is used for partial function application which "freezes"
some portion of a function's arguments and/or keywords resulting in a new object
with a simplified signature. For example, partial can be used to create
a callable that behaves like the int function where the {base} argument
defaults to two:: >
>>> from functools import partial
>>> basetwo = partial(int, base=2)
>>> basetwo.__doc__ = 'Convert base 2 string to an int.'
>>> basetwo('10010')
18
<
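The rules for extra arguments can be seen in a small sketch. The function
power is hypothetical and exists only for this illustration.

```python
from functools import partial

def power(base, exponent):
    return base ** exponent

square = partial(power, exponent=2)    # freeze a keyword argument
two_to_the = partial(power, 2)         # freeze the leftmost positional argument

results = (
    square(5),                 # power(5, exponent=2)
    square(5, exponent=3),     # the keyword given at call time overrides the frozen one
    two_to_the(10),            # the call's argument is appended: power(2, 10)
)
```

Here results is ``(25, 125, 1024)``.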
update_wrapper(wrapper, wrapped[, assigned][, updated])~
Update a {wrapper} function to look like the {wrapped} function. The optional
arguments are tuples to specify which attributes of the original function are
assigned directly to the matching attributes on the wrapper function and which
attributes of the wrapper function are updated with the corresponding attributes
from the original function. The default values for these arguments are the
module level constants {WRAPPER_ASSIGNMENTS} (which assigns to the wrapper
function's {__name__}, {__module__} and {__doc__}, the documentation string) and
{WRAPPER_UPDATES} (which updates the wrapper function's {__dict__}, i.e. the
instance dictionary).
The main intended use for this function is in decorator functions which
wrap the decorated function and return the wrapper. If the wrapper function is
not updated, the metadata of the returned function will reflect the wrapper
definition rather than the original function definition, which is typically less
than helpful.
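update_wrapper can also be called directly, for instance when the wrapper is
an instance of a class rather than a nested function. The sketch below assumes
a hypothetical CountCalls decorator class; it is one way to apply the function,
not the only one.

```python
from functools import update_wrapper

class CountCalls(object):
    """Hypothetical decorator class that counts calls to the wrapped function."""

    def __init__(self, func):
        self.func = func
        self.calls = 0
        # Copy __name__, __module__ and __doc__ from func onto this instance.
        update_wrapper(self, func)

    def __call__(self, *args, **kwargs):
        self.calls += 1
        return self.func(*args, **kwargs)

@CountCalls
def greet(name):
    """Return a greeting."""
    return 'Hello, %s' % name

message = greet('world')
```

After this, ``greet.__name__`` is ``'greet'`` and ``greet.__doc__`` is the
docstring of the original function, even though greet is now a CountCalls
instance.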
wraps(wrapped[, assigned][, updated])~
This is a convenience function for invoking ``partial(update_wrapper,
wrapped=wrapped, assigned=assigned, updated=updated)`` as a function decorator
when defining a wrapper function. For example:: >
>>> from functools import wraps
>>> def my_decorator(f):
...     @wraps(f)
...     def wrapper(*args, **kwds):
...         print 'Calling decorated function'
...         return f(*args, **kwds)
...     return wrapper
...
>>> @my_decorator
... def example():
...     """Docstring"""
...     print 'Called example function'
...
>>> example()
Calling decorated function
Called example function
>>> example.__name__
'example'
>>> example.__doc__
'Docstring'
<
Without the use of this decorator factory, the name of the example function
would have been ``'wrapper'``, and the docstring of the original example
would have been lost.
partial Objects
---------------
partial objects are callable objects created by partial. They
have three read-only attributes:
partial.func~
A callable object or function. Calls to the partial object will be
forwarded to func with new arguments and keywords.
partial.args~
The leftmost positional arguments that will be prepended to the positional
arguments provided to a partial object call.
partial.keywords~
The keyword arguments that will be supplied when the partial object is
called.
partial objects are like function objects in that they are
callable, weakly referenceable, and can have attributes. There are some important
differences. For instance, the __name__ and __doc__ attributes
are not created automatically. Also, partial objects defined in
classes behave like static methods and do not transform into bound methods
during instance attribute look-up.
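The three attributes can be inspected directly, as in this short sketch; the
name parse_hex is an assumption for illustration.

```python
from functools import partial

parse_hex = partial(int, base=16)

# The three attributes describe how a call will be forwarded.
attrs = (parse_hex.func is int, parse_hex.args, parse_hex.keywords)

# A docstring describing the partial is not created automatically,
# so it is assigned by hand here.
parse_hex.__doc__ = 'Convert a hexadecimal string to an int.'

value = parse_hex('ff')
```

Here attrs is ``(True, (), {'base': 16})`` and value is ``255``.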
==============================================================================
*py2stdlib-future_builtins*
future_builtins~
.. versionadded:: 2.6
This module provides functions that exist in 2.x, but have different behavior in
Python 3, so they cannot be put into the 2.x builtins namespace.
Instead, if you want to write code compatible with Python 3 builtins, import
them from this module, like this:: >
from future_builtins import map, filter
... code using Python 3-style map and filter ...
<
The 2to3 tool that ports Python 2 code to Python 3 will recognize
this usage and leave the new builtins alone.
.. note::
The Python 3 print function is already in the builtins, but cannot be
accessed from Python 2 code unless you use the appropriate future statement:: >
from __future__ import print_function
<
Available builtins are:
ascii(object)~
Returns the same as repr (|py2stdlib-repr|). In Python 3, repr (|py2stdlib-repr|) will return
printable Unicode characters unescaped, while ascii will always
backslash-escape them. Using future_builtins.ascii instead of
repr (|py2stdlib-repr|) in 2.6 code makes it clear that you need a pure ASCII return
value.
filter(function, iterable)~
Works like itertools.ifilter.
hex(object)~
Works like the built-in hex, but instead of __hex__ it will
use the __index__ method on its argument to get an integer that is
then converted to hexadecimal.
map(function, iterable, ...)~
Works like itertools.imap.
oct(object)~
Works like the built-in oct, but instead of __oct__ it will
use the __index__ method on its argument to get an integer that is
then converted to octal.
zip(*iterables)~
Works like itertools.izip.
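A minimal sketch of code meant to run unchanged on both lines of Python. The
ImportError fallback is an assumption of this example: it relies on the module
being absent in Python 3, where the built-ins already behave this way.

```python
try:
    # Python 2.6/2.7: shadow the list-building builtins with lazy versions.
    from future_builtins import map, filter, zip
except ImportError:
    # Assumption: running on Python 3, where the builtins are already lazy.
    pass

squares = map(lambda n: n * n, [1, 2, 3])
is_lazy = not isinstance(squares, list)    # an iterator, not a list

# zip stops at the shortest input, so only two pairs are produced.
pairs = list(zip('ab', squares))
```

Here pairs is ``[('a', 1), ('b', 4)]``.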
==============================================================================
*py2stdlib-findertools*
findertools~
:platform: Mac
:synopsis: Wrappers around the finder's Apple Events interface.
This module contains routines that give Python programs access to some
functionality provided by the finder. They are implemented as wrappers around
the AppleEvent interface to the finder.
All file and folder parameters can be specified either as full pathnames, or as
FSRef or FSSpec objects.
The findertools (|py2stdlib-findertools|) module defines the following functions:
launch(file)~
Tell the finder to launch {file}. What launching means depends on the file:
applications are started, folders are opened and documents are opened in the
correct application.
Print(file)~
Tell the finder to print a file. The behaviour is identical to selecting the
file and using the print command in the finder's file menu.
copy(file, destdir)~
Tell the finder to copy a file or folder {file} to folder {destdir}. The
function returns an Alias object pointing to the new file.
move(file, destdir)~
Tell the finder to move a file or folder {file} to folder {destdir}. The
function returns an Alias object pointing to the new file.
sleep()~
Tell the finder to put the Macintosh to sleep, if your machine supports it.
restart()~
Tell the finder to perform an orderly restart of the machine.
shutdown()~
Tell the finder to perform an orderly shutdown of the machine.
==============================================================================
*py2stdlib-gc*
gc~
:synopsis: Interface to the cycle-detecting garbage collector.
This module provides an interface to the optional garbage collector. It
provides the ability to disable the collector, tune the collection frequency,
and set debugging options. It also provides access to unreachable objects that
the collector found but cannot free. Since the collector supplements the
reference counting already used in Python, you can disable the collector if you
are sure your program does not create reference cycles. Automatic collection
can be disabled by calling ``gc.disable()``. To debug a leaking program call
``gc.set_debug(gc.DEBUG_LEAK)``. Notice that this includes
``gc.DEBUG_SAVEALL``, causing garbage-collected objects to be saved in
gc.garbage for inspection.
The gc (|py2stdlib-gc|) module provides the following functions:
enable()~
Enable automatic garbage collection.
disable()~
Disable automatic garbage collection.
isenabled()~
Returns true if automatic collection is enabled.
collect([generation])~
With no arguments, run a full collection. The optional argument {generation}
may be an integer specifying which generation to collect (from 0 to 2). A
ValueError is raised if the generation number is invalid. The number of
unreachable objects found is returned.
.. versionchanged:: 2.5
The optional {generation} argument was added.
.. versionchanged:: 2.6
The free lists maintained for a number of built-in types are cleared
whenever a full collection or collection of the highest generation (2)
is run. Not all items in some free lists may be freed due to the
particular implementation, in particular int and float.
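The return value can be observed with a deliberately created reference cycle.
This is a minimal sketch; automatic collection is switched off first only so
that the cycle is still present when collect is called.

```python
import gc

gc.disable()             # keep automatic collection from interfering
gc.collect()             # start from a clean slate

cycle = []
cycle.append(cycle)      # the list now refers to itself
del cycle                # unreachable, yet its reference count is not zero

found = gc.collect()     # full collection; returns the number of
                         # unreachable objects found
gc.enable()
```

The exact count depends on what else became unreachable in the meantime, but
it is at least one here.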
set_debug(flags)~
Set the garbage collection debugging flags. Debugging information will be
written to ``sys.stderr``. See below for a list of debugging flags which can be
combined using bit operations to control debugging.
get_debug()~
Return the debugging flags currently set.
get_objects()~
Returns a list of all objects tracked by the collector, excluding the list
returned.
.. versionadded:: 2.2
set_threshold(threshold0[, threshold1[, threshold2]])~
Set the garbage collection thresholds (the collection frequency). Setting
{threshold0} to zero disables collection.
The GC classifies objects into three generations depending on how many
collection sweeps they have survived. New objects are placed in the youngest
generation (generation ``0``). If an object survives a collection it is moved
into the next older generation. Since generation ``2`` is the oldest
generation, objects in that generation remain there after a collection. In
order to decide when to run, the collector keeps track of the number of object
allocations and deallocations since the last collection. When the number of
allocations minus the number of deallocations exceeds {threshold0}, collection
starts. Initially only generation ``0`` is examined. If generation ``0`` has
been examined more than {threshold1} times since generation ``1`` has been
examined, then generation ``1`` is examined as well. Similarly, {threshold2}
controls the number of collections of generation ``1`` before collecting
generation ``2``.
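A short sketch of adjusting the first threshold and restoring it afterwards.
The value 1000 is arbitrary and chosen only for illustration; it is not a
tuning recommendation.

```python
import gc

original = gc.get_threshold()     # tuple (threshold0, threshold1, threshold2)

# Raise threshold0 so that generation 0 is examined less often.
gc.set_threshold(1000)
tuned = gc.get_threshold()

# Put the previous settings back.
gc.set_threshold(*original)
restored = gc.get_threshold() == original
```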
get_count()~
Return the current collection counts as a tuple of ``(count0, count1,
count2)``.
.. versionadded:: 2.5
get_threshold()~
Return the current collection thresholds as a tuple of ``(threshold0,
threshold1, threshold2)``.
get_referrers(*objs)~
Return the list of objects that directly refer to any of objs. This function
will only locate those containers which support garbage collection; extension
types which do refer to other objects but do not support garbage collection will
not be found.
Note that objects which have already been dereferenced, but which live in cycles
and have not yet been collected by the garbage collector can be listed among the
resulting referrers. To get only currently live objects, call collect
before calling get_referrers.
Care must be taken when using objects returned by get_referrers because
some of them could still be under construction and hence in a temporarily
invalid state. Avoid using get_referrers for any purpose other than
debugging.
.. versionadded:: 2.2
get_referents(*objs)~
Return a list of objects directly referred to by any of the arguments. The
referents returned are those objects visited by the arguments' C-level
tp_traverse methods (if any), and may not be all objects actually
directly reachable. tp_traverse methods are supported only by objects
that support garbage collection, and are only required to visit objects that may
be involved in a cycle. So, for example, if an integer is directly reachable
from an argument, that integer object may or may not appear in the result list.
.. versionadded:: 2.3
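The two functions look in opposite directions, as this debugging sketch shows.
The names target and holder are assumptions for illustration.

```python
import gc

target = ['payload']
holder = [target]                        # holder refers to target

# Who refers to target? The result includes holder, and also other
# containers such as the namespace that holds the name "target".
referrers = gc.get_referrers(target)
holder_found = any(obj is holder for obj in referrers)

# What does holder refer to? For a list, these are its items.
referents = gc.get_referents(holder)
target_found = any(obj is target for obj in referents)
```

Identity tests (``is``) are used rather than ``in`` so that equal but distinct
objects are not mistaken for the one being traced.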
is_tracked(obj)~
Returns True if the object is currently tracked by the garbage collector,
False otherwise. As a general rule, instances of atomic types aren't
tracked and instances of non-atomic types (containers, user-defined
objects...) are. However, some type-specific optimizations can be present
in order to suppress the garbage collector footprint of simple instances
(e.g. dicts containing only atomic keys and values):: >
>>> gc.is_tracked(0)
False
>>> gc.is_tracked("a")
False
>>> gc.is_tracked([])
True
>>> gc.is_tracked({})
False
>>> gc.is_tracked({"a": 1})
False
>>> gc.is_tracked({"a": []})
True
<
.. versionadded:: 2.7
The following variable is provided for read-only access (you can mutate its
value but should not rebind it):
garbage~
A list of objects which the collector found to be unreachable but could not be
freed (uncollectable objects). By default, this list contains only objects with
__del__ methods. [#]_ Objects that have __del__ methods and are
part of a reference cycle cause the entire reference cycle to be uncollectable,
including objects not necessarily in the cycle but reachable only from it.
Python doesn't collect such cycles automatically because, in general, it isn't
possible for Python to guess a safe order in which to run the __del__
methods. If you know a safe order, you can force the issue by examining the
{garbage} list, and explicitly breaking cycles due to your objects within the
list. Note that these objects are kept alive even so by virtue of being in the
{garbage} list, so they should be removed from {garbage} too. For example,
after breaking cycles, do ``del gc.garbage[:]`` to empty the list. It's
generally better to avoid the issue by not creating cycles containing objects
with __del__ methods, and {garbage} can be examined in that case to
verify that no such cycles are being created.
If DEBUG_SAVEALL is set, then all unreachable objects will be added to
this list rather than freed.
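A minimal sketch of using DEBUG_SAVEALL to capture an unreachable object in
{garbage} for inspection, then emptying the list as suggested above. The
self-referencing list is created only for this illustration.

```python
import gc

gc.collect()                         # clear out anything already unreachable
old_flags = gc.get_debug()
gc.set_debug(gc.DEBUG_SAVEALL)       # save unreachable objects instead of freeing

cycle = []
cycle.append(cycle)                  # self-referencing list
marker = id(cycle)
del cycle

gc.collect()
# The list was kept alive in gc.garbage, so its id is still meaningful.
captured = any(id(obj) == marker for obj in gc.garbage)

gc.set_debug(old_flags)              # restore the previous flags
del gc.garbage[:]                    # empty the list so the objects can go away
```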
The following constants are provided for use with set_debug:
DEBUG_STATS~
Print statistics during collection. This information can be useful when tuning
the collection frequency.
DEBUG_COLLECTABLE~
Print information on collectable objects found.
DEBUG_UNCOLLECTABLE~
Print information on uncollectable objects found (objects which are not
reachable but cannot be freed by the collector). These objects will be added to
the ``garbage`` list.
DEBUG_INSTANCES~
When DEBUG_COLLECTABLE or DEBUG_UNCOLLECTABLE is set, print
information about instance objects found.
DEBUG_OBJECTS~
When DEBUG_COLLECTABLE or DEBUG_UNCOLLECTABLE is set, print
information about objects other than instance objects found.
DEBUG_SAVEALL~
When set, all unreachable objects found will be appended to {garbage} rather
than being freed. This can be useful for debugging a leaking program.
DEBUG_LEAK~
The debugging flags necessary for the collector to print information about a
leaking program (equal to ``DEBUG_COLLECTABLE | DEBUG_UNCOLLECTABLE |
DEBUG_INSTANCES | DEBUG_OBJECTS | DEBUG_SAVEALL``).
.. rubric:: Footnotes
.. [#] Prior to Python 2.2, the list contained all instance objects in unreachable
cycles, not only those with __del__ methods.
==============================================================================
*py2stdlib-gdbm*
gdbm~
:platform: Unix
:synopsis: GNU's reinterpretation of dbm.
.. note::
The gdbm (|py2stdlib-gdbm|) module has been renamed to dbm.gnu in Python 3.0. The
2to3 tool will automatically adapt imports when converting your
sources to 3.0.
This module is quite similar to the dbm (|py2stdlib-dbm|) module, but uses ``gdbm`` instead
to provide some additional functionality. Please note that the file formats
created by ``gdbm`` and ``dbm`` are incompatible.
The gdbm (|py2stdlib-gdbm|) module provides an interface to the GNU DBM library. ``gdbm``
objects behave like mappings (dictionaries), except that keys and values are
always strings. Printing a ``gdbm`` object doesn't print the keys and values,
and the items and values methods are not supported.
The module defines the following constant and functions:
error~
Raised on ``gdbm``-specific errors, such as I/O errors. KeyError is
raised for general mapping errors like specifying an incorrect key.
open(filename, [flag, [mode]])~
Open a ``gdbm`` database and return a ``gdbm`` object. The {filename} argument
is the name of the database file.
The optional {flag} argument can be:
+---------+-------------------------------------------+
| Value | Meaning |
+=========+===========================================+
| ``'r'`` | Open existing database for reading only |
| | (default) |
+---------+-------------------------------------------+
| ``'w'`` | Open existing database for reading and |
| | writing |
+---------+-------------------------------------------+
| ``'c'`` | Open database for reading and writing, |
| | creating it if it doesn't exist |
+---------+-------------------------------------------+
| ``'n'`` | Always create a new, empty database, open |
| | for reading and writing |
+---------+-------------------------------------------+
The following additional characters may be appended to the flag to control
how the database is opened:
+---------+--------------------------------------------+
| Value | Meaning |
+=========+============================================+
| ``'f'`` | Open the database in fast mode. Writes |
| | to the database will not be synchronized. |
+---------+--------------------------------------------+
| ``'s'`` | Synchronized mode. This will cause changes |
| | to the database to be immediately written |
| | to the file. |
+---------+--------------------------------------------+
| ``'u'`` | Do not lock database. |
+---------+--------------------------------------------+
Not all flags are valid for all versions of ``gdbm``. The module constant
open_flags is a string of supported flag characters. The exception
error is raised if an invalid flag is specified.
The optional {mode} argument is the Unix mode of the file, used only when the
database has to be created. It defaults to octal ``0666``.
In addition to the dictionary-like methods, ``gdbm`` objects have the following
methods:
firstkey()~
It's possible to loop over every key in the database using this method and the
nextkey method. The traversal is ordered by ``gdbm``'s internal hash
values, and won't be sorted by the key values. This method returns the starting
key.
nextkey(key)~
Returns the key that follows {key} in the traversal. The following code prints
every key in the database ``db``, without having to create a list in memory that
contains them all:: >
    k = db.firstkey()
    while k is not None:
        print k
        k = db.nextkey(k)
<
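The same traversal can be wrapped in a generator so that callers can use an
ordinary ``for`` loop. This is only a minimal sketch: ``iter_keys`` and the
in-memory ``FakeDb`` stand-in are hypothetical names used for illustration.
With a real database, pass the object returned by ``gdbm.open`` instead.

```python
def iter_keys(db):
    """Yield every key of a gdbm-like object without building a list."""
    key = db.firstkey()
    while key is not None:
        yield key
        key = db.nextkey(key)


class FakeDb(object):
    """Hypothetical stand-in that mimics firstkey()/nextkey() for the demo."""

    def __init__(self, keys):
        self._keys = list(keys)

    def firstkey(self):
        return self._keys[0] if self._keys else None

    def nextkey(self, key):
        position = self._keys.index(key) + 1
        return self._keys[position] if position < len(self._keys) else None


keys_seen = list(iter_keys(FakeDb(['b', 'a', 'c'])))
empty_seen = list(iter_keys(FakeDb([])))
```

An empty database makes ``firstkey`` return ``None``, so the generator simply
yields nothing.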
reorganize()~
If you have carried out a lot of deletions and would like to shrink the space
used by the ``gdbm`` file, this routine will reorganize the database. ``gdbm``
will not shorten the length of a database file except by using this
reorganization; otherwise, deleted file space will be kept and reused as new
(key, value) pairs are added.
sync()~
When the database has been opened in fast mode, this method forces any
unwritten data to be written to the disk.
.. seealso::
Module anydbm (|py2stdlib-anydbm|)
Generic interface to ``dbm``-style databases.
Module whichdb (|py2stdlib-whichdb|)
Utility module used to determine the type of an existing database.
==============================================================================
*py2stdlib-gensuitemodule*
gensuitemodule~
:platform: Mac
:synopsis: Create a stub package from an OSA dictionary
The gensuitemodule (|py2stdlib-gensuitemodule|) module creates a Python package implementing stub code
for the AppleScript suites that are implemented by a specific application,
according to its AppleScript dictionary.
It is usually invoked by the user through the PythonIDE, but it can
also be run as a script from the command line (pass --help for help on
the options) or imported from Python code. For an example of its use see
Mac/scripts/genallsuites.py in a source distribution, which generates
the stub packages that are included in the standard library.
It defines the following public functions:
is_scriptable(application)~
Returns true if ``application``, which should be passed as a pathname, appears
to be scriptable. Take the return value with a grain of salt: Internet
Explorer appears not to be scriptable but definitely is.
processfile(application[, output, basepkgname, edit_modnames, creatorsignature, dump, verbose])~
Create a stub package for ``application``, which should be passed as a full
pathname. For a .app bundle this is the pathname to the bundle, not to
the executable inside the bundle; for an unbundled CFM application you pass the
filename of the application binary.
This function asks the application for its OSA terminology resources, decodes
these resources and uses the resultant data to create the Python code for the
package implementing the client stubs.
``output`` is the pathname where the resulting package is stored; if it is not
specified, a standard "save file as" dialog is presented to the user.
``basepkgname`` is the base package on which this package will build, and
defaults to StdSuites. Only when generating StdSuites itself do
you need to specify this. ``edit_modnames`` is a dictionary that can be used to
change module names that are too ugly after name mangling. ``creatorsignature``
can be used to override the 4-char creator code, which is normally obtained from
the PkgInfo file in the package or from the CFM file creator signature.
When ``dump`` is given it should refer to a file object, and ``processfile``
will stop after decoding the resources and dump the Python representation of the
terminology resources to this file. ``verbose`` should also be a file object,
and specifying it will cause ``processfile`` to tell you what it is doing.
processfile_fromresource(application[, output, basepkgname, edit_modnames, creatorsignature, dump, verbose])~
This function does the same as ``processfile``, except that it uses a different
method to get the terminology resources. It opens ``application`` as a resource
file and reads all ``"aete"`` and ``"aeut"`` resources from this file.
==============================================================================
*py2stdlib-getopt*
getopt~
:synopsis: Portable parser for command line options; supports both short and long option
names.
.. note::
The getopt (|py2stdlib-getopt|) module is a parser for command line options whose API is
designed to be familiar to users of the C getopt (|py2stdlib-getopt|) function. Users who
are unfamiliar with the C getopt (|py2stdlib-getopt|) function or who would like to write
less code and get better help and error messages should consider using the
argparse (|py2stdlib-argparse|) module instead.
This module helps scripts to parse the command line arguments in ``sys.argv``.
It supports the same conventions as the Unix getopt (|py2stdlib-getopt|) function (including
the special meanings of arguments of the form '``-``' and '``--``'). Long
options similar to those supported by GNU software may be used as well via an
optional third argument.
A more convenient, flexible, and powerful alternative is the
optparse (|py2stdlib-optparse|) module.
This module provides two functions and an
exception:
getopt(args, options[, long_options])~
Parses command line options and parameter list. {args} is the argument list to
be parsed, without the leading reference to the running program. Typically, this
means ``sys.argv[1:]``. {options} is the string of option letters that the
script wants to recognize, with options that require an argument followed by a
colon (``':'``; i.e., the same format that Unix getopt (|py2stdlib-getopt|) uses).
.. note::
Unlike GNU getopt (|py2stdlib-getopt|), after a non-option argument, all further
arguments are also considered non-options. This is similar to the way
non-GNU Unix systems work.
{long_options}, if specified, must be a list of strings with the names of the
long options which should be supported. The leading ``'--'``
characters should not be included in the option name. Long options which
require an argument should be followed by an equal sign (``'='``). Optional
arguments are not supported. To accept only long options, {options} should
be an empty string. Long options on the command line can be recognized so
long as they provide a prefix of the option name that matches exactly one of
the accepted options. For example, if {long_options} is ``['foo', 'frob']``,
the option --fo will match as --foo, but --f
will not match uniquely, so GetoptError will be raised.
The return value consists of two elements: the first is a list of ``(option,
value)`` pairs; the second is the list of program arguments left after the
option list was stripped (this is a trailing slice of {args}). Each
option-and-value pair returned has the option as its first element, prefixed
with a hyphen for short options (e.g., ``'-x'``) or two hyphens for long
options (e.g., ``'--long-option'``), and the option argument as its
second element, or an empty string if the option has no argument. The
options occur in the list in the same order in which they were found, thus
allowing multiple occurrences. Long and short options may be mixed.
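As a small illustration of the option string, the long option list, and
prefix matching described above, the sketch below parses a made-up argument
list. The option names and file names are hypothetical.

```python
import getopt

# Short options: -v takes no argument, -o requires one (note the colon).
# Long options: "output=" requires an argument, "verbose" does not.
opts, rest = getopt.getopt(
    ['-v', '--out=log.txt', '-o', 'a.txt', 'input1', 'input2'],
    'vo:',
    ['output=', 'verbose'])

# "--out" is a unique prefix of "--output", so it is reported under the
# full option name:
#   opts is [('-v', ''), ('--output', 'log.txt'), ('-o', 'a.txt')]
#   rest is ['input1', 'input2']

# "--f" is a prefix of both "foo" and "frob", so it cannot be resolved.
try:
    getopt.getopt(['--f'], '', ['foo', 'frob'])
    ambiguous = False
except getopt.GetoptError:
    ambiguous = True
```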
gnu_getopt(args, options[, long_options])~
This function works like getopt (|py2stdlib-getopt|), except that GNU style scanning mode is
used by default. This means that option and non-option arguments may be
intermixed. The getopt (|py2stdlib-getopt|) function stops processing options as soon as a
non-option argument is encountered.
If the first character of the option string is '+', or if the environment
variable POSIXLY_CORRECT is set, then option processing stops as
soon as a non-option argument is encountered.
.. versionadded:: 2.3
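The difference between the two scanning modes is easiest to see side by side.
The following sketch uses a made-up argument list and clears
POSIXLY_CORRECT first so that the result does not depend on the caller's
environment.

```python
import getopt
import os

# Make the demonstration reproducible: POSIXLY_CORRECT would force
# gnu_getopt() into the stop-at-first-non-option mode.
os.environ.pop('POSIXLY_CORRECT', None)

argv = ['-a', 'file1', '-b', 'file2']

# getopt() stops at the first non-option argument.
posix_opts, posix_rest = getopt.getopt(argv, 'ab')

# gnu_getopt() keeps scanning, so -b is still recognized.
gnu_opts, gnu_rest = getopt.gnu_getopt(argv, 'ab')

# A leading '+' requests the stop-at-first-non-option behavior explicitly.
plus_opts, plus_rest = getopt.gnu_getopt(argv, '+ab')
```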
GetoptError~
This is raised when an unrecognized option is found in the argument list or when
an option requiring an argument is given none. The argument to the exception is
a string indicating the cause of the error. For long options, an argument given
to an option which does not require one will also cause this exception to be
raised. The attributes msg and opt give the error message and
related option; if there is no specific option to which the exception relates,
opt is an empty string.
.. versionchanged:: 1.6
Introduced GetoptError as a synonym for error.
error~
Alias for GetoptError; for backward compatibility.
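The three failure cases described for GetoptError can be observed through its
opt and msg attributes. In this sketch, ``explain`` is a hypothetical helper
and the option names are made up.

```python
import getopt


def explain(argv):
    """Return (opt, msg) for a failed parse, or None if parsing succeeds."""
    try:
        getopt.getopt(argv, 'o:', ['output=', 'help'])
    except getopt.GetoptError as err:
        return err.opt, err.msg
    return None


unknown = explain(['-x'])              # unrecognized short option
missing = explain(['-o'])              # -o requires an argument, none given
unwanted = explain(['--help=now'])     # --help does not take an argument
accepted = explain(['-o', 'out.txt'])  # parses cleanly, so None
```

Code written against the older name keeps working because ``getopt.error``
refers to the same class.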
An example using only Unix style options:
>>> import getopt
>>> args = '-a -b -cfoo -d bar a1 a2'.split()
>>> args
['-a', '-b', '-cfoo', '-d', 'bar', 'a1', 'a2']
>>> optlist, args = getopt.getopt(args, 'abc:d:')
>>> optlist
[('-a', ''), ('-b', ''), ('-c', 'foo'), ('-d', 'bar')]
>>> args
['a1', 'a2']
Using long option names is equally easy:
>>> s = '--condition=foo --testing --output-file abc.def -x a1 a2'
>>> args = s.split()
>>> args
['--condition=foo', '--testing', '--output-file', 'abc.def', '-x', 'a1', 'a2']
>>> optlist, args = getopt.getopt(args, 'x', [
... 'condition=', 'output-file=', 'testing'])
>>> optlist
[('--condition', 'foo'), ('--testing', ''), ('--output-file', 'abc.def'), ('-x', '')]
>>> args
['a1', 'a2']
In a script, typical usage is something like this:: >
    import getopt, sys

    def main():
        try:
            opts, args = getopt.getopt(sys.argv[1:], "ho:v", ["help", "output="])
        except getopt.GetoptError as err:
            # print help information and exit:
            print str(err) # will print something like "option -a not recognized"
            usage()  # usage() is assumed to be defined elsewhere in the script
            sys.exit(2)
        output = None
        verbose = False
        for o, a in opts:
            if o == "-v":
                verbose = True
            elif o in ("-h", "--help"):
                usage()
                sys.exit()
            elif o in ("-o", "--output"):
                output = a
            else:
                assert False, "unhandled option"
        # ...

    if __name__ == "__main__":
        main()
<
Note that an equivalent command line interface could be produced with less code
and more informative help and error messages by using the argparse (|py2stdlib-argparse|) module:: >
    import argparse

    if __name__ == '__main__':
        parser = argparse.ArgumentParser()
        parser.add_argument('-o', '--output')
        parser.add_argument('-v', dest='verbose', action='store_true')
        args = parser.parse_args()
        # ... do something with args.output ...
        # ... do something with args.verbose ...
<
.. seealso::
Module argparse (|py2stdlib-argparse|)
Alternative command line option and argument parsing library.
==============================================================================
*py2stdlib-getpass*
getpass~
:synopsis: Portable reading of passwords and retrieval of the userid.
.. Windows (& Mac?) support by Guido van Rossum.
The getpass (|py2stdlib-getpass|) module provides two functions:
getpass([prompt[, stream]])~
Prompt the user for a password without echoing. The user is prompted using the
string {prompt}, which defaults to ``'Password: '``. On Unix, the prompt is
written to the file-like object {stream}. {stream} defaults to the
controlling terminal (/dev/tty) or, if that is unavailable, to ``sys.stderr``
(this argument is ignored on Windows).
If echo-free input is unavailable, getpass() falls back to printing
a warning message to {stream}, reading from ``sys.stdin``, and
issuing a GetPassWarning.
Availability: Macintosh, Unix, Windows.
.. versionchanged:: 2.5
The {stream} parameter was added.
.. versionchanged:: 2.6
On Unix it defaults to using /dev/tty before falling back
to ``sys.stdin`` and ``sys.stderr``.
.. note::
If you call getpass from within IDLE, the input may be done in the
terminal you launched IDLE from rather than the IDLE window itself.
GetPassWarning~
A UserWarning subclass issued when password input may be echoed.
getuser()~
Return the "login name" of the user. Availability: Unix, Windows.
This function checks the environment variables LOGNAME,
USER, LNAME and USERNAME, in order, and returns
the value of the first one which is set to a non-empty string. If none are set,
the login name from the password database is returned on systems which support
the pwd (|py2stdlib-pwd|) module; otherwise, an exception is raised.
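The lookup order can be checked by setting the first variable explicitly.
This sketch uses a hypothetical login name and restores the environment
afterwards.

```python
import getpass
import os

saved = os.environ.get('LOGNAME')
os.environ['LOGNAME'] = 'alice'   # hypothetical login name for the demo
try:
    user = getpass.getuser()      # LOGNAME is checked before the others
finally:
    # Restore the environment so the demo has no lasting effect.
    if saved is None:
        del os.environ['LOGNAME']
    else:
        os.environ['LOGNAME'] = saved
```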
==============================================================================
*py2stdlib-gettext*
gettext~
:synopsis: Multilingual internationalization services.
The gettext (|py2stdlib-gettext|) module provides internationalization (I18N) and localization
(L10N) services for your Python modules and applications. It supports both the
GNU ``gettext`` message catalog API and a higher level, class-based API that may
be more appropriate for Python files. The interface described below allows you
to write your module and application messages in one natural language, and
provide a catalog of translated messages for running under different natural
languages.
Some hints on localizing your Python modules and applications are also given.
GNU gettext (|py2stdlib-gettext|) API
--------------------------
The gettext (|py2stdlib-gettext|) module defines the following API, which is very similar to
the GNU gettext (|py2stdlib-gettext|) API. If you use this API you will affect the
translation of your entire application globally. Often this is what you want if
your application is monolingual, with the choice of language dependent on the
locale of your user. If you are localizing a Python module, or if your
application needs to switch languages on the fly, you probably want to use the
class-based API instead.
bindtextdomain(domain[, localedir])~
Bind the {domain} to the locale directory {localedir}. More concretely,
gettext (|py2stdlib-gettext|) will look for binary .mo files for the given domain using
the path (on Unix): localedir/language/LC_MESSAGES/domain.mo, where
{language} is searched for in the environment variables LANGUAGE,
LC_ALL, LC_MESSAGES, and LANG respectively.
If {localedir} is omitted or ``None``, then the current binding for {domain} is
returned. [#]_
bind_textdomain_codeset(domain[, codeset])~
Bind the {domain} to {codeset}, changing the encoding of strings returned by the
gettext (|py2stdlib-gettext|) family of functions. If {codeset} is omitted, then the current
binding is returned.
.. versionadded:: 2.4
textdomain([domain])~
Change or query the current global domain. If {domain} is ``None``, then the
current global domain is returned, otherwise the global domain is set to
{domain}, which is returned.
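Both bindtextdomain and textdomain act as a setter when given a value and as
a query otherwise. A minimal sketch, in which the domain name and directory
are placeholders:

```python
import gettext

# 'myapplication' and the directory below are hypothetical placeholders.
bound = gettext.bindtextdomain('myapplication',
                               '/path/to/my/language/directory')
queried = gettext.bindtextdomain('myapplication')   # query only

previous = gettext.textdomain()                 # query the global domain
current = gettext.textdomain('myapplication')   # set it; new value returned
gettext.textdomain(previous)                    # restore the earlier domain
```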
gettext(message)~
Return the localized translation of {message}, based on the current global
domain, language, and locale directory. This function is usually aliased as
_ in the local namespace (see examples below).
lgettext(message)~
Equivalent to gettext (|py2stdlib-gettext|), but the translation is returned in the preferred
system encoding, if no other encoding was explicitly set with
bind_textdomain_codeset.
.. versionadded:: 2.4
dgettext(domain, message)~
Like gettext (|py2stdlib-gettext|), but look the message up in the specified {domain}.
ldgettext(domain, message)~
Equivalent to dgettext, but the translation is returned in the preferred
system encoding, if no other encoding was explicitly set with
bind_textdomain_codeset.
.. versionadded:: 2.4
ngettext(singular, plural, n)~
Like gettext (|py2stdlib-gettext|), but consider plural forms. If a translation is found,
apply the plural formula to {n}, and return the resulting message (some
languages have more than two plural forms). If no translation is found, return
{singular} if {n} is 1; return {plural} otherwise.
The plural formula is taken from the catalog header. It is a C or Python
expression that has a free variable {n}; the expression evaluates to the index
of the plural in the catalog. See the GNU gettext documentation for the precise
syntax to be used in .po files and the formulas for a variety of
languages.
.. versionadded:: 2.3
lngettext(singular, plural, n)~
Equivalent to ngettext, but the translation is returned in the preferred
system encoding, if no other encoding was explicitly set with
bind_textdomain_codeset.
.. versionadded:: 2.4
dngettext(domain, singular, plural, n)~
Like ngettext, but look the message up in the specified {domain}.
.. versionadded:: 2.3
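When no catalog is found, the domain-specific functions fall back to their
untranslated arguments, which makes the singular/plural rule easy to observe.
The domain name below is hypothetical and is assumed to have no catalog
installed.

```python
import gettext

# Hypothetical domain with no catalog installed, so every lookup falls
# through to the untranslated arguments.
domain = 'no-such-domain-example'

plain = gettext.dgettext(domain, 'Hello')
one = gettext.dngettext(domain, '%d file', '%d files', 1) % 1
many = gettext.dngettext(domain, '%d file', '%d files', 5) % 5
```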
ldngettext(domain, singular, plural, n)~
Equivalent to dngettext, but the translation is returned in the
preferred system encoding, if no other encoding was explicitly set with
bind_textdomain_codeset.
.. versionadded:: 2.4
Note that GNU gettext (|py2stdlib-gettext|) also defines a dcgettext function, but
this was deemed not useful and so it is currently unimplemented.
Here's an example of typical usage for this API:: >
import gettext
gettext.bindtextdomain('myapplication', '/path/to/my/language/directory')
gettext.textdomain('myapplication')
_ = gettext.gettext
# ...
print _('This is a translatable string.')
<
Class-based API
---------------
The class-based API of the gettext (|py2stdlib-gettext|) module gives you more flexibility and
greater convenience than the GNU gettext (|py2stdlib-gettext|) API. It is the recommended
way of localizing your Python applications and modules. gettext (|py2stdlib-gettext|) defines
a "translations" class which implements the parsing of GNU .mo format
files, and has methods for returning either standard 8-bit strings or Unicode
strings. Instances of this "translations" class can also install themselves in
the built-in namespace as the function _.
find(domain[, localedir[, languages[, all]]])~
This function implements the standard .mo file search algorithm. It
takes a {domain}, identical to what textdomain takes. Optional
{localedir} is as in bindtextdomain. Optional {languages} is a list of
strings, where each string is a language code.
If {localedir} is not given, then the default system locale directory is used.
[#]_ If {languages} is not given, then the following environment variables are
searched: LANGUAGE, LC_ALL, LC_MESSAGES, and
LANG. The first one returning a non-empty value is used for the
{languages} variable. The environment variables should contain a colon separated
list of languages, which will be split on the colon to produce the expected list
of language code strings.
find then expands and normalizes the languages, and then iterates
through them, searching for an existing file built of these components:
localedir/language/LC_MESSAGES/domain.mo
The first such file name that exists is returned by find. If no such
file is found, then ``None`` is returned. If {all} is given, it returns a list
of all file names, in the order in which they appear in the languages list or
the environment variables.
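The search can be tried against a throwaway directory tree. In this sketch
the domain ``spam`` and the language codes are examples, and an empty file
stands in for a real catalog because find only checks that the file exists.

```python
import gettext
import os
import shutil
import tempfile

localedir = tempfile.mkdtemp()          # throwaway locale directory
try:
    catalog_dir = os.path.join(localedir, 'fr', 'LC_MESSAGES')
    os.makedirs(catalog_dir)
    expected = os.path.join(catalog_dir, 'spam.mo')
    open(expected, 'wb').close()        # empty placeholder catalog file

    found = gettext.find('spam', localedir, languages=['fr'])
    found_all = gettext.find('spam', localedir,
                             languages=['fr', 'de'], all=True)
    missing = gettext.find('spam', localedir, languages=['de'])
finally:
    shutil.rmtree(localedir)
```

Here ``found`` is the path of the placeholder file, ``found_all`` is a
one-element list containing it, and ``missing`` is ``None``.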
translation(domain[, localedir[, languages[, class_[, fallback[, codeset]]]]])~
Return a Translations instance based on the {domain}, {localedir}, and
{languages}, which are first passed to find to get a list of the
associated .mo file paths. Instances with identical .mo file
names are cached. The actual class instantiated is {class_} if provided,
otherwise GNUTranslations. The class's constructor must take a single
file object argument. If provided, {codeset} will change the charset used to
encode translated strings.
If multiple files are found, later files are used as fallbacks for earlier ones.
To allow setting the fallback, copy.copy is used to clone each
translation object from the cache; the actual instance data is still shared with
the cache.
If no .mo file is found, this function raises IOError if
{fallback} is false (which is the default), and returns a
NullTranslations instance if {fallback} is true.
.. versionchanged:: 2.4
Added the {codeset} parameter.
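The effect of the {fallback} flag can be seen with a domain for which no
catalog exists. The domain name and directory in this sketch are hypothetical.

```python
import gettext

# Hypothetical domain and directory; no catalog is expected to exist there.
domain, localedir = 'no-such-domain-example', '/nonexistent/locale'

try:
    gettext.translation(domain, localedir, languages=['fr'])
    raised = False
except IOError:
    raised = True

# With fallback enabled, a NullTranslations instance is returned instead,
# and it hands messages back untranslated.
t = gettext.translation(domain, localedir, languages=['fr'], fallback=True)
is_null = isinstance(t, gettext.NullTranslations)
text = t.gettext('Hello')
```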
install(domain[, localedir[, unicode [, codeset[, names]]]])~
This installs the function _ in Python's builtins namespace, based on
{domain}, {localedir}, and {codeset} which are passed to the function
translation. The {unicode} flag is passed to the resulting translation
object's NullTranslations.install method.
For the {names} parameter, please see the description of the translation
object's NullTranslations.install method.
As seen below, you usually mark the strings in your application that are
candidates for translation, by wrapping them in a call to the _
function, like this:: >
print _('This string will be translated.')
<
For convenience, you want the _ function to be installed in Python's
builtins namespace, so it is easily accessible in all modules of your
application.
.. versionchanged:: 2.4
Added the {codeset} parameter.
.. versionchanged:: 2.5
Added the {names} parameter.
The NullTranslations class
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Translation classes are what actually implement the translation of original
source file message strings to translated message strings. The base class used
by all translation classes is NullTranslations; this provides the basic
interface you can use to write your own specialized translation classes. Here
are the methods of NullTranslations:
NullTranslations([fp])~
Takes an optional file object {fp}, which is ignored by the base class.
Initializes "protected" instance variables {_info} and {_charset} which are set
by derived classes, as well as {_fallback}, which is set through
add_fallback. It then calls ``self._parse(fp)`` if {fp} is not
``None``.
_parse(fp)~
No-op'd in the base class, this method takes file object {fp}, and reads
the data from the file, initializing its message catalog. If you have an
unsupported message catalog file format, you should override this method
to parse your format.
add_fallback(fallback)~
Add {fallback} as the fallback object for the current translation
object. A translation object should consult the fallback if it cannot provide a
translation for a given message.
gettext(message)~
If a fallback has been set, forward gettext (|py2stdlib-gettext|) to the
fallback. Otherwise, return the translated message. Overridden in derived
classes.
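A specialized translation class only needs to override the lookup methods it
cares about and consult the fallback when it has no entry. The
dictionary-backed ``DictTranslations`` class below is a hypothetical example,
not part of the module.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical translation class backed by a plain dictionary."""

    def __init__(self, catalog):
        gettext.NullTranslations.__init__(self)
        self._catalog = dict(catalog)

    def gettext(self, message):
        if message in self._catalog:
            return self._catalog[message]
        if self._fallback:
            # Delegate to the fallback chain when this catalog has no entry.
            return self._fallback.gettext(message)
        return message


primary = DictTranslations({'Hello': 'Bonjour'})
primary.add_fallback(DictTranslations({'Goodbye': 'Au revoir'}))

direct = primary.gettext('Hello')
via_fallback = primary.gettext('Goodbye')
untranslated = primary.gettext('Unknown')
```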
lgettext(message)~
If a fallback has been set, forward lgettext to the
fallback. Otherwise, return the translated message. Overridden in derived
classes.
.. versionadded:: 2.4
ugettext(message)~
If a fallback has been set, forward ugettext to the
fallback. Otherwise, return the translated message as a Unicode
string. Overridden in derived classes.
ngettext(singular, plural, n)~
If a fallback has been set, forward ngettext to the
fallback. Otherwise, return the translated message. Overridden in derived
classes.
.. versionadded:: 2.3
lngettext(singular, plural, n)~
If a fallback has been set, forward lngettext to the
fallback. Otherwise, return the translated message. Overridden in derived
classes.
.. versionadded:: 2.4
ungettext(singular, plural, n)~
If a fallback has been set, forward ungettext to the fallback.
Otherwise, return the translated message as a Unicode string. Overridden
in derived classes.
.. versionadded:: 2.3
info()~
Return the "protected" _info variable.
charset()~
Return the "protected" _charset variable.
output_charset()~
Return the "protected" _output_charset variable, which defines the
encoding used to return translated messages.
.. versionadded:: 2.4
set_output_charset(charset)~
Change the "protected" _output_charset variable, which defines the
encoding used to return translated messages.
.. versionadded:: 2.4
install([unicode [, names]])~
If the {unicode} flag is false, this method installs self.gettext
into the built-in namespace, binding it to ``_``. If {unicode} is true,
it binds self.ugettext instead. By default, {unicode} is false.
If the {names} parameter is given, it must be a sequence containing the
names of functions you want to install in the builtins namespace in
addition to _. Supported names are ``'gettext'`` (bound to
self.gettext or self.ugettext according to the {unicode}
flag), ``'ngettext'`` (bound to self.ngettext or
self.ungettext according to the {unicode} flag), ``'lgettext'``
and ``'lngettext'``.
Note that this is only one way, albeit the most convenient way, to make
the _ function available to your application. Because it affects
the entire application globally, and specifically the built-in namespace,
localized modules should never install _. Instead, they should use
this code to make _ available to their module:: >
import gettext
t = gettext.translation('mymodule', ...)
_ = t.gettext
<
This puts _ only in the module's global namespace and so only
affects calls within this module.
.. versionchanged:: 2.5
Added the {names} parameter.
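As a sketch of the {names} parameter, the following installs ``_`` and
``ngettext`` from a NullTranslations instance and then removes them again.
The clean-up step is only there to keep the demonstration from leaving
anything behind; an application would normally leave the names installed.

```python
import gettext

try:
    import __builtin__ as builtins   # Python 2
except ImportError:
    import builtins                  # Python 3

t = gettext.NullTranslations()
t.install(names=['ngettext'])        # installs _ and, on request, ngettext

greeting = builtins._('Hello')
plural = builtins.ngettext('%d file', '%d files', 2) % 2

# Undo the installation so the built-in namespace is left untouched.
del builtins._
del builtins.ngettext
```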
The GNUTranslations class
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The gettext (|py2stdlib-gettext|) module provides one additional class derived from
NullTranslations: GNUTranslations. This class overrides
_parse to enable reading GNU gettext (|py2stdlib-gettext|) format .mo files
in both big-endian and little-endian format. It also coerces both message ids
and message strings to Unicode.
GNUTranslations parses optional meta-data out of the translation
catalog. It is convention with GNU gettext (|py2stdlib-gettext|) to include meta-data as
the translation for the empty string. This meta-data is in RFC 822-style
``key: value`` pairs, and should contain the ``Project-Id-Version`` key. If the
key ``Content-Type`` is found, then the ``charset`` property is used to
initialize the "protected" _charset instance variable, defaulting to
``None`` if not found. If the charset encoding is specified, then all message
ids and message strings read from the catalog are converted to Unicode using
this encoding. The ugettext method always returns a Unicode string, while the
gettext (|py2stdlib-gettext|) method returns an encoded 8-bit string. For the message id arguments
of both methods, either Unicode strings or 8-bit strings containing only
US-ASCII characters are acceptable. Note that the Unicode version of the
methods (i.e. ugettext and ungettext) are the recommended
interface to use for internationalized Python programs.
The entire set of key/value pairs is placed into a dictionary and set as the
"protected" _info instance variable.
If the .mo file's magic number is invalid, or if other problems occur
while reading the file, instantiating a GNUTranslations class can raise
IOError.
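To see the metadata handling in action without any tools, a tiny catalog can
be assembled in memory. This is a sketch under stated assumptions:
``build_mo`` is a hypothetical helper that writes the little-endian layout
(header, msgid index, msgstr index, string data) with no hash table, and the
messages are made up.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Serialize {msgid: msgstr} byte strings as little-endian .mo data."""
    keys = sorted(messages)
    count = len(keys)
    ids_start = 28 + 16 * count      # 7-word header plus two index tables
    id_blob = b''
    str_blob = b''
    id_index = []
    str_index = []
    for key in keys:
        id_index.append((len(key), len(id_blob)))
        str_index.append((len(messages[key]), len(str_blob)))
        id_blob += key + b'\0'
        str_blob += messages[key] + b'\0'
    strs_start = ids_start + len(id_blob)
    # magic, version, count, msgid table offset, msgstr table offset,
    # hash table size, hash table offset (no hash table is written)
    out = struct.pack('<7I', 0x950412de, 0, count, 28, 28 + 8 * count, 0, 0)
    for length, offset in id_index:
        out += struct.pack('<2I', length, ids_start + offset)
    for length, offset in str_index:
        out += struct.pack('<2I', length, strs_start + offset)
    return out + id_blob + str_blob


catalog = build_mo({
    # The empty msgid carries the metadata header.
    b'': (b'Project-Id-Version: demo 1.0\n'
          b'Content-Type: text/plain; charset=UTF-8\n'
          b'Plural-Forms: nplurals=2; plural=(n != 1);\n'),
    b'Hello': b'Bonjour',
    # Plural entries join their forms with a NUL byte.
    b'%d file\0%d files': b'%d fichier\0%d fichiers',
})

cat = gettext.GNUTranslations(io.BytesIO(catalog))
info = cat.info()          # metadata keys are stored in lower case
charset = cat.charset()
hello = cat.gettext('Hello')
one = cat.ngettext('%d file', '%d files', 1) % 1
many = cat.ngettext('%d file', '%d files', 3) % 3
```

Real catalogs should be produced with msgfmt.py or GNU ``msgfmt``; the helper
exists only to keep the example self-contained.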
The following methods are overridden from the base class implementation:
GNUTranslations.gettext(message)~
Look up the {message} id in the catalog and return the corresponding message
string, as an 8-bit string encoded with the catalog's charset encoding, if
known. If there is no entry in the catalog for the {message} id, and a fallback
has been set, the look up is forwarded to the fallback's gettext (|py2stdlib-gettext|) method.
Otherwise, the {message} id is returned.
GNUTranslations.lgettext(message)~
Equivalent to gettext (|py2stdlib-gettext|), but the translation is returned in the preferred
system encoding, if no other encoding was explicitly set with
set_output_charset.
.. versionadded:: 2.4
GNUTranslations.ugettext(message)~
Look up the {message} id in the catalog and return the corresponding message
string, as a Unicode string. If there is no entry in the catalog for the
{message} id, and a fallback has been set, the look up is forwarded to the
fallback's ugettext method. Otherwise, the {message} id is returned.
GNUTranslations.ngettext(singular, plural, n)~
Do a plural-forms lookup of a message id. {singular} is used as the message id
for purposes of lookup in the catalog, while {n} is used to determine which
plural form to use. The returned message string is an 8-bit string encoded with
the catalog's charset encoding, if known.
If the message id is not found in the catalog, and a fallback is specified, the
request is forwarded to the fallback's ngettext method. Otherwise, when
{n} is 1 {singular} is returned, and {plural} is returned in all other cases.
.. versionadded:: 2.3
GNUTranslations.lngettext(singular, plural, n)~
Equivalent to ngettext, but the translation is returned in the preferred
system encoding, if no other encoding was explicitly set with
set_output_charset.
.. versionadded:: 2.4
GNUTranslations.ungettext(singular, plural, n)~
Do a plural-forms lookup of a message id. {singular} is used as the message id
for purposes of lookup in the catalog, while {n} is used to determine which
plural form to use. The returned message string is a Unicode string.
If the message id is not found in the catalog, and a fallback is specified, the
request is forwarded to the fallback's ungettext method. Otherwise,
when {n} is 1 {singular} is returned, and {plural} is returned in all other
cases.
Here is an example:: >
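    import os
    from gettext import GNUTranslations
    n = len(os.listdir('.'))
    # somefile is assumed to be an open file object for a .mo catalog
    cat = GNUTranslations(somefile)
    message = cat.ungettext(
        'There is %(num)d file in this directory',
        'There are %(num)d files in this directory',
        n) % {'num': n}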
<
.. versionadded:: 2.3
Solaris message catalog support
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The Solaris operating system defines its own binary .mo file format, but
since no documentation can be found on this format, it is not supported at this
time.
The Catalog constructor
^^^^^^^^^^^^^^^^^^^^^^^
GNOME uses a version of the gettext (|py2stdlib-gettext|) module by James Henstridge, but this
version has a slightly different API. Its documented usage was:: >
import gettext
cat = gettext.Catalog(domain, localedir)
_ = cat.gettext
print _('hello world')
<
For compatibility with this older module, the function Catalog is an
alias for the translation function described above.
One difference between this module and Henstridge's: his catalog objects
supported access through a mapping API, but this appears to be unused and so is
not currently supported.
Internationalizing your programs and modules
--------------------------------------------
Internationalization (I18N) refers to the operation by which a program is made
aware of multiple languages. Localization (L10N) refers to the adaptation of
your program, once internationalized, to the local language and cultural habits.
In order to provide multilingual messages for your Python programs, you need to
take the following steps:
#. prepare your program or module by specially marking translatable strings
#. run a suite of tools over your marked files to generate raw message catalogs
#. create language specific translations of the message catalogs
#. use the gettext (|py2stdlib-gettext|) module so that message strings are properly translated
In order to prepare your code for I18N, you need to look at all the strings in
your files. Any string that needs to be translated should be marked by wrapping
it in ``_('...')`` --- that is, a call to the function _. For example:: >
filename = 'mylog.txt'
message = _('writing a log message')
fp = open(filename, 'w')
fp.write(message)
fp.close()
<
In this example, the string ``'writing a log message'`` is marked as a candidate
for translation, while the strings ``'mylog.txt'`` and ``'w'`` are not.
The Python distribution comes with two tools which help you generate the message
catalogs once you've prepared your source code. These may or may not be
available from a binary distribution, but they can be found in a source
distribution, in the Tools/i18n directory.
The pygettext [#]_ program scans all your Python source code looking
for the strings you previously marked as translatable. It is similar to the GNU
gettext (|py2stdlib-gettext|) program except that it understands all the intricacies of
Python source code, but knows nothing about C or C++ source code. You don't
need GNU ``gettext`` unless you're also going to be translating C code (such as
C extension modules).
pygettext generates textual Uniforum-style human readable message
catalog .pot files, essentially structured human readable files which
contain every marked string in the source code, along with a placeholder for the
translation strings. pygettext is a command line script that supports
a command line interface similar to that of xgettext; for details on its use,
run:: >
pygettext.py --help
<
Copies of these .pot files are then handed over to the individual human
translators who write language-specific versions for every supported natural
language. They send you back the filled in language-specific versions as a
.po file. Using the msgfmt.py [#]_ program (in the
Tools/i18n directory), you take the .po files from your
translators and generate the machine-readable .mo binary catalog files.
The .mo files are what the gettext (|py2stdlib-gettext|) module uses for the actual
translation processing during run-time.
How you use the gettext (|py2stdlib-gettext|) module in your code depends on whether you are
internationalizing a single module or your entire application. The next two
sections will discuss each case.
Localizing your module
^^^^^^^^^^^^^^^^^^^^^^
If you are localizing your module, you must take care not to make global
changes, e.g. to the built-in namespace. You should not use the GNU ``gettext``
API but instead the class-based API.
Let's say your module is called "spam" and the module's various natural language
translation .mo files reside in /usr/share/locale in GNU
gettext (|py2stdlib-gettext|) format. Here's what you would put at the top of your
module:: >
import gettext
t = gettext.translation('spam', '/usr/share/locale')
_ = t.lgettext
<
If your translators were providing you with Unicode strings in their .po
files, you'd instead do:: >
import gettext
t = gettext.translation('spam', '/usr/share/locale')
_ = t.ugettext
<
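As a minimal sketch of the class-based API, assuming the same "spam" domain
and /usr/share/locale directory as above: passing ``fallback=True`` to
``gettext.translation`` returns a NullTranslations object instead of raising
an error when no .mo catalog can be found, so the module still works without
translations installed. The ``greeting`` name is only illustrative.

```python
import gettext

# 'spam' and the locale directory are the example names used in the text.
# fallback=True avoids an exception when no catalog exists for the
# current language; messages then pass through untranslated.
t = gettext.translation('spam', '/usr/share/locale', fallback=True)

# Bind _ at module level only; nothing is installed into the built-ins.
_ = t.gettext

greeting = _('writing a log message')
```

Because ``_`` is an ordinary module-level name here, other modules are not
affected by this binding.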
Localizing your application
If you are localizing your application, you can install the _ function
globally into the built-in namespace, usually in the main driver file of your
application. This will let all your application-specific files just use
``_('...')`` without having to explicitly install it in each file.
In the simple case then, you need only add the following bit of code to the main
driver file of your application:: >
import gettext
gettext.install('myapplication')
<
If you need to set the locale directory or the {unicode} flag, you can pass
these into the install function:: >
import gettext
gettext.install('myapplication', '/usr/share/locale', unicode=1)
<
Changing languages on the fly
If your program needs to support many languages at the same time, you may want
to create multiple translation instances and then switch between them
explicitly, like so:: >
import gettext
lang1 = gettext.translation('myapplication', languages=['en'])
lang2 = gettext.translation('myapplication', languages=['fr'])
lang3 = gettext.translation('myapplication', languages=['de'])
# start by using language1
lang1.install()
# ... time goes by, user selects language 2
lang2.install()
# ... more time goes by, user selects language 3
lang3.install()
<
Deferred translations
In most coding situations, strings are translated where they are coded.
Occasionally however, you need to mark strings for translation, but defer actual
translation until later. A classic example is:: >
    animals = ['mollusk',
               'albatross',
               'rat',
               'penguin',
               'python', ]
    # ...
    for a in animals:
        print a
<
Here, you want to mark the strings in the ``animals`` list as being
translatable, but you don't actually want to translate them until they are
printed.
Here is one way you can handle this situation:: >
    def _(message): return message
    animals = [_('mollusk'),
               _('albatross'),
               _('rat'),
               _('penguin'),
               _('python'), ]
    del _
    # ...
    for a in animals:
        print _(a)
<
This works because the dummy definition of _ simply returns the string
unchanged. And this dummy definition will temporarily override any definition
of _ in the built-in namespace (until the del command). Take
care, though, if you have a previous definition of _ in the local
namespace.
Note that the second use of _ will not identify "a" as being
translatable to the pygettext program, since it is not a string.
Another way to handle this is with the following example:: >
    def N_(message): return message
    animals = [N_('mollusk'),
               N_('albatross'),
               N_('rat'),
               N_('penguin'),
               N_('python'), ]
    # ...
    for a in animals:
        print _(a)
<
In this case, you are marking translatable strings with the function N_,
[#]_ which won't conflict with any definition of _. However, you will
need to teach your message extraction program to look for translatable strings
marked with N_. pygettext and xpot both support
this through the use of command line switches.
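The deferred pattern can be tried without any real catalog. The sketch below
is only an illustration: ``DictTranslations`` is a hypothetical stand-in that
looks messages up in a dictionary, and the German words are made-up sample
data. It shows that N_ leaves the strings untouched while the later call to
_ performs the actual lookup.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical stand-in for a real catalog: a plain dict lookup."""

    def __init__(self, table):
        gettext.NullTranslations.__init__(self)
        self._table = table

    def gettext(self, message):
        # Fall back to the original message when no entry exists.
        return self._table.get(message, message)


def N_(message):
    # Marks the string for extraction; returns it unchanged.
    return message


animals = [N_('mollusk'), N_('albatross'), N_('rat')]

# Translation happens later, once a catalog has been chosen.
_ = DictTranslations({'rat': 'Ratte', 'mollusk': 'Weichtier'}).gettext
translated = [_(a) for a in animals]
```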
gettext (|py2stdlib-gettext|) vs. lgettext
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In Python 2.4 the lgettext family of functions was introduced. The
intention of these functions is to provide an alternative which is more
compliant with the current implementation of GNU gettext. Unlike
gettext (|py2stdlib-gettext|), which returns strings encoded with the same codeset used in the
translation file, lgettext will return strings encoded with the
preferred system encoding, as returned by locale.getpreferredencoding.
Also notice that Python 2.4 introduces new functions to explicitly choose the
codeset used in translated strings. If a codeset is explicitly set, even
lgettext will return translated strings in the requested codeset, as
would be expected in the GNU gettext implementation.
Acknowledgements
----------------
The following people contributed code, feedback, design suggestions, previous
implementations, and valuable experience to the creation of this module:
* Peter Funk
* James Henstridge
* Juan David Ibáñez Palomar
* Marc-André Lemburg
* Martin von Löwis
* François Pinard
* Barry Warsaw
* Gustavo Niemeyer
.. rubric:: Footnotes
.. [#] The default locale directory is system dependent; for example, on RedHat Linux
it is /usr/share/locale, but on Solaris it is /usr/lib/locale.
The gettext (|py2stdlib-gettext|) module does not try to support these system dependent
defaults; instead its default is sys.prefix/share/locale. For this
reason, it is always best to call bindtextdomain with an explicit
absolute path at the start of your application.
.. [#] See the footnote for bindtextdomain above.
.. [#] François Pinard has written a program called xpot which does a
similar job. It is available as part of his po-utils package at
http://po-utils.progiciels-bpi.ca/.
.. [#] msgfmt.py is binary compatible with GNU msgfmt except that
it provides a simpler, all-Python implementation. With this and
pygettext.py, you generally won't need to install the GNU
gettext (|py2stdlib-gettext|) package to internationalize your Python applications.
.. [#] The choice of N_ here is totally arbitrary; it could have just as easily
been MarkThisStringForTranslation.
==============================================================================
*py2stdlib-gl*
gl~
:platform: IRIX
:synopsis: Functions from the Silicon Graphics Graphics Library.
:deprecated:
2.6~
The gl (|py2stdlib-gl|) module has been deprecated for removal in Python 3.0.
This module provides access to the Silicon Graphics {Graphics Library}. It is
available only on Silicon Graphics machines.
.. warning::
Some illegal calls to the GL library cause the Python interpreter to dump
core. In particular, the use of most GL calls is unsafe before the first
window is opened.
The module is too large to document here in its entirety, but the following
should help you to get started. The parameter conventions for the C functions
are translated to Python as follows:
* All (short, long, unsigned) int values are represented by Python integers.
* All float and double values are represented by Python floating point numbers.
In most cases, Python integers are also allowed.
* All arrays are represented by one-dimensional Python lists. In most cases,
tuples are also allowed.
* All string and character arguments are represented by Python strings, for
instance, ``winopen('Hi There!')`` and ``rotate(900, 'z')``.
* All (short, long, unsigned) integer arguments or return values that are only
used to specify the length of an array argument are omitted. For example, the C
call :: >
    lmdef(deftype, index, np, props)
<
is translated to Python as :: >
    lmdef(deftype, index, props)
<
* Output arguments are omitted from the argument list; they are transmitted as
function return values instead. If more than one value must be returned, the
return value is a tuple. If the C function has both a regular return value (that
is not omitted because of the previous rule) and an output argument, the return
value comes first in the tuple. For example, the C call :: >
    getmcolor(i, &red, &green, &blue)
<
is translated to Python as :: >
    red, green, blue = getmcolor(i)
<
The following functions are non-standard or have special argument conventions:
varray(argument)~
Equivalent to but faster than a number of ``v3d()`` calls. The {argument} is a
list (or tuple) of points. Each point must be a tuple of coordinates ``(x, y,
z)`` or ``(x, y)``. The points may be 2- or 3-dimensional but must all have the
same dimension. Float and int values may be mixed however. The points are always
converted to 3D double precision points by assuming ``z = 0.0`` if necessary (as
indicated in the man page), and for each point ``v3d()`` is called.
nvarray()~
Equivalent to but faster than a number of ``n3f`` and ``v3f`` calls. The
argument is an array (list or tuple) of pairs of normals and points. Each pair
is a tuple of a point and a normal for that point. Each point or normal must be
a tuple of coordinates ``(x, y, z)``. Three coordinates must be given. Float and
int values may be mixed. For each pair, ``n3f()`` is called for the normal, and
then ``v3f()`` is called for the point.
vnarray()~
Similar to ``nvarray()`` but the pairs have the point first and the normal
second.
nurbssurface(s_k, t_k, ctl, s_ord, t_ord, type)~
Defines a nurbs surface. The dimensions of ``ctl[][]`` are computed as follows:
``[len(s_k) - s_ord]``, ``[len(t_k) - t_ord]``.
nurbscurve(knots, ctlpoints, order, type)~
Defines a nurbs curve. The length of ctlpoints is ``len(knots) - order``.
pwlcurve(points, type)~
Defines a piecewise-linear curve. {points} is a list of points. {type} must be
``N_ST``.
pick(n)~
select(n)~
The only argument to these functions specifies the desired size of the pick or
select buffer.
endpick()~
endselect()~
These functions have no arguments. They return a list of integers representing
the used part of the pick/select buffer. No method is provided to detect buffer
overrun.
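Since the gl module only exists on IRIX, the point conversion described for
``varray()`` above can be illustrated in plain Python. This is a sketch of the
documented behaviour, not the module's implementation; the helper name
``to_3d_points`` is hypothetical.

```python
def to_3d_points(points):
    """Convert 2-D or 3-D points to 3-D float tuples, assuming z = 0.0.

    Mirrors the rules given for varray(): points may be 2- or
    3-dimensional but must all have the same dimension, and int and
    float values may be mixed.
    """
    dims = set(len(p) for p in points)
    if len(dims) > 1 or not dims <= set([2, 3]):
        raise ValueError('points must all be 2-D or all be 3-D')
    return [(float(p[0]), float(p[1]),
             float(p[2]) if len(p) == 3 else 0.0)
            for p in points]
```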
Here is a tiny but complete example GL program in Python:: >
    import gl, GL, time

    def main():
        gl.foreground()
        gl.prefposition(500, 900, 500, 900)
        w = gl.winopen('CrissCross')
        gl.ortho2(0.0, 400.0, 0.0, 400.0)
        gl.color(GL.WHITE)
        gl.clear()
        gl.color(GL.RED)
        gl.bgnline()
        gl.v2f(0.0, 0.0)
        gl.v2f(400.0, 400.0)
        gl.endline()
        gl.bgnline()
        gl.v2f(400.0, 0.0)
        gl.v2f(0.0, 400.0)
        gl.endline()
        time.sleep(5)

    main()
<
.. seealso::
`PyOpenGL: The Python OpenGL Binding <http://pyopengl.sourceforge.net/>`_
An interface to OpenGL is also available; see information about the PyOpenGL
project online at http://pyopengl.sourceforge.net/. This may be a better option
if support for SGI hardware from before about 1996 is not required.
DEVICE (|py2stdlib-device|) --- Constants used with the gl (|py2stdlib-gl|) module
==========================================================
==============================================================================
*py2stdlib-gl^*
GL~
:platform: IRIX
:synopsis: Constants used with the gl module.
:deprecated:
2.6~
The GL (|py2stdlib-gl^|) module has been deprecated for removal in Python 3.0.
This module contains constants used by the Silicon Graphics {Graphics Library}
from the C header file ``<gl/gl.h>``. Read the module source file for details.
==============================================================================
*py2stdlib-glob*
glob~
:synopsis: Unix shell style pathname pattern expansion.
The glob (|py2stdlib-glob|) module finds all the pathnames matching a specified pattern
according to the rules used by the Unix shell. No tilde expansion is done, but
``*``, ``?``, and character ranges expressed with ``[]`` will be correctly
matched. This is done by using the os.listdir and
fnmatch.fnmatch functions in concert, and not by actually invoking a
subshell. (For tilde and shell variable expansion, use
os.path.expanduser and os.path.expandvars.)
glob(pathname)~
Return a possibly-empty list of path names that match {pathname}, which must be
a string containing a path specification. {pathname} can be either absolute
(like /usr/src/Python-1.5/Makefile) or relative (like
../../Tools/*/*.gif), and can contain shell-style wildcards. Broken
symlinks are included in the results (as in the shell).
iglob(pathname)~
Return an iterator which yields the same values as glob (|py2stdlib-glob|)
without actually storing them all simultaneously.
.. versionadded:: 2.5
For example, consider a directory containing only the following files:
1.gif, 2.txt, and card.gif. glob (|py2stdlib-glob|) will produce
the following results. Notice how any leading components of the path are
preserved. :: >
>>> import glob
>>> glob.glob('./[0-9].*')
['./1.gif', './2.txt']
>>> glob.glob('*.gif')
['1.gif', 'card.gif']
>>> glob.glob('?.gif')
['1.gif']
<
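The same directory can be recreated in a temporary location to try this out.
The sketch below is illustrative only (the variable names are made up); it
also shows ``iglob`` and the explicit ``os.path.expanduser`` call needed for
tilde expansion.

```python
import glob
import os
import tempfile

# Recreate the three-file directory from the example above.
workdir = tempfile.mkdtemp()
for name in ('1.gif', '2.txt', 'card.gif'):
    open(os.path.join(workdir, name), 'w').close()

# glob() returns matches in arbitrary order, so sort before comparing.
gifs = sorted(glob.glob(os.path.join(workdir, '*.gif')))

# iglob() yields the same names lazily; keep only the base names here.
single_char = sorted(os.path.basename(p)
                     for p in glob.iglob(os.path.join(workdir, '?.*')))

# glob does no tilde expansion, so expand '~' before matching.
home_pattern = os.path.expanduser(os.path.join('~', '*.txt'))
home_matches = glob.glob(home_pattern)
```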
.. seealso::
Module fnmatch (|py2stdlib-fnmatch|)
Shell-style filename (not path) expansion
==============================================================================
*py2stdlib-grp*
grp~
:platform: Unix
:synopsis: The group database (getgrnam() and friends).
This module provides access to the Unix group database. It is available on all
Unix versions.
Group database entries are reported as a tuple-like object, whose attributes
correspond to the members of the ``group`` structure (Attribute field below, see
``<grp.h>``):
+-------+-----------+---------------------------------+
| Index | Attribute | Meaning                         |
+=======+===========+=================================+
| 0     | gr_name   | the name of the group           |
+-------+-----------+---------------------------------+
| 1     | gr_passwd | the (encrypted) group password; |
|       |           | often empty                     |
+-------+-----------+---------------------------------+
| 2     | gr_gid    | the numerical group ID          |
+-------+-----------+---------------------------------+
| 3     | gr_mem    | all the group members' user     |
|       |           | names                           |
+-------+-----------+---------------------------------+
The gid is an integer, name and password are strings, and the member list is a
list of strings. (Note that most users are not explicitly listed as members of
the group they are in according to the password database. Check both databases
to get complete membership information.)
It defines the following items:
getgrgid(gid)~
Return the group database entry for the given numeric group ID. KeyError
is raised if the entry asked for cannot be found.
getgrnam(name)~
Return the group database entry for the given group name. KeyError is
raised if the entry asked for cannot be found.
getgrall()~
Return a list of all available group entries, in arbitrary order.
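As noted above, complete membership information requires checking both the
group and the password databases. The following sketch combines them (Unix
only); the helper name ``groups_for_user`` is hypothetical.

```python
import grp
import pwd


def groups_for_user(username):
    """Return the sorted names of all groups the user belongs to."""
    # Groups that list the user explicitly in gr_mem.
    names = set(g.gr_name for g in grp.getgrall()
                if username in g.gr_mem)
    # The primary group comes from the password database and is often
    # not listed in gr_mem.
    primary_gid = pwd.getpwnam(username).pw_gid
    try:
        names.add(grp.getgrgid(primary_gid).gr_name)
    except KeyError:
        pass  # primary gid has no entry in the group database
    return sorted(names)
```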
.. seealso::
Module pwd (|py2stdlib-pwd|)
An interface to the user database, similar to this.
Module spwd (|py2stdlib-spwd|)
An interface to the shadow password database, similar to this.
==============================================================================
*py2stdlib-gzip*
gzip~
:synopsis: Interfaces for gzip compression and decompression using file objects.
This module provides a simple interface to compress and decompress files just
like the GNU programs gzip (|py2stdlib-gzip|) and gunzip would.
The data compression is provided by the zlib (|py2stdlib-zlib|) module.
The gzip (|py2stdlib-gzip|) module provides the GzipFile class which is modeled
after Python's File Object. The GzipFile class reads and writes
gzip (|py2stdlib-gzip|)-format files, automatically compressing or decompressing the
data so that it looks like an ordinary file object.
Note that additional file formats which can be decompressed by the
gzip (|py2stdlib-gzip|) and gunzip programs, such as those produced by
compress and pack, are not supported by this module.
For other archive formats, see the bz2 (|py2stdlib-bz2|), zipfile (|py2stdlib-zipfile|), and
tarfile (|py2stdlib-tarfile|) modules.
The module defines the following items:
GzipFile([filename[, mode[, compresslevel[, fileobj[, mtime]]]]])~
Constructor for the GzipFile class, which simulates most of the methods
of a file object, with the exception of the readinto and
truncate methods. At least one of {fileobj} and {filename} must be
given a non-trivial value.
The new class instance is based on {fileobj}, which can be a regular file, a
StringIO (|py2stdlib-stringio|) object, or any other object which simulates a file. It
defaults to ``None``, in which case {filename} is opened to provide a file
object.
When {fileobj} is not ``None``, the {filename} argument is only used to be
included in the gzip (|py2stdlib-gzip|) file header, which may include the original
filename of the uncompressed file. It defaults to the filename of {fileobj}, if
discernible; otherwise, it defaults to the empty string, and in this case the
original filename is not included in the header.
The {mode} argument can be any of ``'r'``, ``'rb'``, ``'a'``, ``'ab'``, ``'w'``,
or ``'wb'``, depending on whether the file will be read or written. The default
is the mode of {fileobj} if discernible; otherwise, the default is ``'rb'``. If
not given, the 'b' flag will be added to the mode to ensure the file is opened
in binary mode for cross-platform portability.
The {compresslevel} argument is an integer from ``1`` to ``9`` controlling the
level of compression; ``1`` is fastest and produces the least compression, and
``9`` is slowest and produces the most compression. The default is ``9``.
The {mtime} argument is an optional numeric timestamp to be written to
the stream when compressing. All gzip (|py2stdlib-gzip|) compressed streams are
required to contain a timestamp. If omitted or ``None``, the current
time is used. This module ignores the timestamp when decompressing;
however, some programs, such as gunzip, make use of it.
The format of the timestamp is the same as that of the return value of
``time.time()`` and of the ``st_mtime`` member of the object returned
by ``os.stat()``.
Calling a GzipFile object's close method does not close
{fileobj}, since you might wish to append more material after the compressed
data. This also allows you to pass a StringIO (|py2stdlib-stringio|) object opened for
writing as {fileobj}, and retrieve the resulting memory buffer using the
StringIO (|py2stdlib-stringio|) object's getvalue method.
GzipFile supports iteration and the with statement.
.. versionchanged:: 2.7
Support for the with statement was added.
.. versionchanged:: 2.7
Support for zero-padded files was added.
open(filename[, mode[, compresslevel]])~
This is a shorthand for ``GzipFile(filename, mode, compresslevel)``.
The {filename} argument is required; {mode} defaults to ``'rb'`` and
{compresslevel} defaults to ``9``.
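The {fileobj} and {mtime} arguments of GzipFile can be combined to compress
data entirely in memory. The sketch below uses ``io.BytesIO`` as the in-memory
file object (an assumption made so that the code also runs unchanged on newer
Python versions; a StringIO object plays the same role in the text above).
The helper name ``compress_bytes`` is hypothetical.

```python
import gzip
import io


def compress_bytes(data, mtime=0):
    """Return data as a gzip stream held in memory."""
    buf = io.BytesIO()
    # An empty filename keeps the original name out of the header;
    # a fixed mtime keeps the timestamp field constant.
    gz = gzip.GzipFile(filename='', mode='wb', fileobj=buf, mtime=mtime)
    gz.write(data)
    gz.close()          # finishes the gzip stream; buf itself stays open
    return buf.getvalue()


payload = b'Lots of content here'
compressed = compress_bytes(payload)

# Reading back works the same way, with the buffer opened for reading.
restored = gzip.GzipFile(fileobj=io.BytesIO(compressed), mode='rb').read()
```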
Examples of usage
-----------------
Example of how to read a compressed file:: >
import gzip
f = gzip.open('/home/joe/file.txt.gz', 'rb')
file_content = f.read()
f.close()
<
Example of how to create a compressed GZIP file:: >
    import gzip
    content = "Lots of content here"
    f = gzip.open('/home/joe/file.txt.gz', 'wb')
    f.write(content)
    f.close()
<
Example of how to GZIP compress an existing file:: >
import gzip
f_in = open('/home/joe/file.txt', 'rb')
f_out = gzip.open('/home/joe/file.txt.gz', 'wb')
f_out.writelines(f_in)
f_out.close()
f_in.close()
<
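With the with statement support added in 2.7, the last example can be written
so that both files are closed even if an error occurs. This is a sketch that
works in a temporary directory rather than /home/joe; ``shutil.copyfileobj``
is used here as one possible way to copy the data in chunks.

```python
import gzip
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'file.txt')
dst = src + '.gz'

# Create a sample input file.
with open(src, 'wb') as f:
    f.write(b'Lots of content here\n' * 100)

# Compress it; both files are closed when the blocks exit.
with open(src, 'rb') as f_in:
    with gzip.open(dst, 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)

# Read the compressed file back to confirm the round trip.
with gzip.open(dst, 'rb') as f:
    round_trip = f.read()
```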
.. seealso::
Module zlib (|py2stdlib-zlib|)
The basic data compression module needed to support the gzip (|py2stdlib-gzip|) file
format.
==============================================================================
*py2stdlib-hashlib*
hashlib~
:synopsis: Secure hash and message digest algorithms.
.. versionadded:: 2.5
This module implements a common interface to many different secure hash and
message digest algorithms. Included are the FIPS secure hash algorithms SHA1,
SHA224, SHA256, SHA384, and SHA512 (defined in FIPS 180-2) as well as RSA's MD5
algorithm (defined in Internet RFC 1321). The terms secure hash and message
digest are interchangeable. Older algorithms were called message digests. The
modern term is secure hash.
.. note::
If you want the adler32 or crc32 hash functions they are available in
the zlib (|py2stdlib-zlib|) module.
.. warning::
Some algorithms have known hash collision weaknesses; see the seealso entries
at the end of this section.
There is one constructor method named for each type of hash. All return
a hash object with the same simple interface. For example: use sha1 to
create a SHA1 hash object. You can now feed this object with arbitrary strings
using the update method. At any point you can ask it for the
digest of the concatenation of the strings fed to it so far using the
digest or hexdigest methods.
Constructors for hash algorithms that are always present in this module are
md5 (|py2stdlib-md5|), sha1, sha224, sha256, sha384, and
sha512. Additional algorithms may also be available depending upon the
OpenSSL library that Python uses on your platform.
For example, to obtain the digest of the string ``'Nobody inspects the spammish
repetition'``:
>>> import hashlib
>>> m = hashlib.md5()
>>> m.update("Nobody inspects")
>>> m.update(" the spammish repetition")
>>> m.digest()
'\xbbd\x9c\x83\xdd\x1e\xa5\xc9\xd9\xde\xc9\xa1\x8d\xf0\xff\xe9'
>>> m.digest_size
16
>>> m.block_size
64
More condensed:
>>> hashlib.sha224("Nobody inspects the spammish repetition").hexdigest()
'a4337bc45a8fc544c03f52dc550cd6e1e87021bc896588bd79e901e2'
A generic new (|py2stdlib-new|) constructor that takes the string name of the desired
algorithm as its first parameter also exists to allow access to the above listed
hashes as well as any other algorithms that your OpenSSL library may offer. The
named constructors are much faster than new (|py2stdlib-new|) and should be preferred.
Using new (|py2stdlib-new|) with an algorithm provided by OpenSSL:
>>> h = hashlib.new('ripemd160')
>>> h.update("Nobody inspects the spammish repetition")
>>> h.hexdigest()
'cc4a5ce1b3df48aec5d22d1f16b894a0b894eccc'
This module provides the following constant attribute:
hashlib.algorithms~
A tuple providing the names of the hash algorithms guaranteed to be
supported by this module.
.. versionadded:: 2.7
The following values are provided as constant attributes of the hash objects
returned by the constructors:
hash.digest_size~
The size of the resulting hash in bytes.
hash.block_size~
The internal block size of the hash algorithm in bytes.
A hash object has the following methods:
hash.update(arg)~
Update the hash object with the string {arg}. Repeated calls are equivalent to
a single call with the concatenation of all the arguments: ``m.update(a);
m.update(b)`` is equivalent to ``m.update(a+b)``.
.. versionchanged:: 2.7
The Python GIL is released to allow other threads to run while
hash updates on data larger than 2048 bytes are taking place when
using hash algorithms supplied by OpenSSL.
hash.digest()~
Return the digest of the strings passed to the update method so far.
This is a string of digest_size bytes which may contain non-ASCII
characters, including null bytes.
hash.hexdigest()~
Like digest except the digest is returned as a string of double length,
containing only hexadecimal digits. This may be used to exchange the value
safely in email or other non-binary environments.
hash.copy()~
Return a copy ("clone") of the hash object. This can be used to efficiently
compute the digests of strings that share a common initial substring.
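The update and copy methods described above can be sketched as follows. The
helper name ``sha256_of_stream`` and the sample byte strings are made up for
illustration; the first part hashes a stream chunk by chunk, the second
reuses the state of a common prefix.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=8192):
    """Hash a file-like object without reading it into memory at once."""
    h = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        h.update(chunk)   # repeated updates equal one update of the whole
    return h.hexdigest()


data = b'Nobody inspects the spammish repetition' * 1000
streamed = sha256_of_stream(io.BytesIO(data))

# copy(): hash the shared prefix once, then branch for each suffix.
prefix = hashlib.sha256(b'common header: ')
first = prefix.copy()
first.update(b'first body')
second = prefix.copy()
second.update(b'second body')
```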
.. seealso::
Module hmac (|py2stdlib-hmac|)
A module to generate message authentication codes using hashes.
Module base64 (|py2stdlib-base64|)
Another way to encode binary hashes for non-binary environments.
http://csrc.nist.gov/publications/fips/fips180-2/fips180-2.pdf
The FIPS 180-2 publication on Secure Hash Algorithms.
http://en.wikipedia.org/wiki/Cryptographic_hash_function#Cryptographic_hash_algorithms
Wikipedia article with information on which algorithms have known issues and
what that means regarding their use.
==============================================================================
*py2stdlib-heapq*
heapq~
:synopsis: Heap queue algorithm (a.k.a. priority queue).
.. versionadded:: 2.3
This module provides an implementation of the heap queue algorithm, also known
as the priority queue algorithm.
Heaps are arrays for which ``heap[k] <= heap[2*k+1]`` and ``heap[k] <=
heap[2*k+2]`` for all {k}, counting elements from zero. For the sake of
comparison, non-existing elements are considered to be infinite. The
interesting property of a heap is that ``heap[0]`` is always its smallest
element.
The API below differs from textbook heap algorithms in two aspects: (a) We use
zero-based indexing. This makes the relationship between the index for a node
and the indexes for its children slightly less obvious, but is more suitable
since Python uses zero-based indexing. (b) Our pop method returns the smallest
item, not the largest (called a "min heap" in textbooks; a "max heap" is more
common in texts because of its suitability for in-place sorting).
These two make it possible to view the heap as a regular Python list without
surprises: ``heap[0]`` is the smallest item, and ``heap.sort()`` maintains the
heap invariant!
To create a heap, use a list initialized to ``[]``, or you can transform a
populated list into a heap via function heapify.
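The invariant can be checked directly. The sketch below is illustrative; the
helper name ``is_heap`` is hypothetical and simply tests the two inequalities
for every index that has children.

```python
import heapq


def is_heap(a):
    """Return True if a[k] <= a[2*k+1] and a[k] <= a[2*k+2] for all k."""
    n = len(a)
    return all(a[k] <= a[c]
               for k in range(n)
               for c in (2 * k + 1, 2 * k + 2)
               if c < n)


data = [9, 4, 7, 1, 8, 2]
heap = list(data)
heapq.heapify(heap)       # rearranges the list in place

smallest = heap[0]
```

A sorted list always satisfies the invariant, which is why ``heap.sort()``
leaves a valid heap behind.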
The following functions are provided:
heappush(heap, item)~
Push the value {item} onto the {heap}, maintaining the heap invariant.
heappop(heap)~
Pop and return the smallest item from the {heap}, maintaining the heap
invariant. If the heap is empty, IndexError is raised.
heappushpop(heap, item)~
Push {item} on the heap, then pop and return the smallest item from the
{heap}. The combined action runs more efficiently than heappush
followed by a separate call to heappop.
.. versionadded:: 2.6
heapify(x)~
Transform list {x} into a heap, in-place, in linear time.
heapreplace(heap, item)~
Pop and return the smallest item from the {heap}, and also push the new {item}.
The heap size doesn't change. If the heap is empty, IndexError is raised.
This is more efficient than heappop followed by heappush, and
can be more appropriate when using a fixed-size heap. Note that the value
returned may be larger than {item}! That constrains reasonable uses of this
routine unless written as part of a conditional replacement:: >
if item > heap[0]:
item = heapreplace(heap, item)
<
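The conditional replacement above is the core of a fixed-size heap that keeps
only the {n} largest items seen so far. A minimal sketch, with the hypothetical
helper name ``n_largest_streaming``:

```python
import heapq


def n_largest_streaming(iterable, n):
    """Keep the n largest items using a heap of at most n entries."""
    if n <= 0:
        return []
    heap = []
    for item in iterable:
        if len(heap) < n:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # heap[0] is the smallest of the current top n; replace it.
            heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True)


top3 = n_largest_streaming([5, 1, 9, 3, 7, 8], 3)
```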
Example of use:
>>> from heapq import heappush, heappop
>>> heap = []
>>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
>>> for item in data:
...     heappush(heap, item)
...
>>> ordered = []
>>> while heap:
...     ordered.append(heappop(heap))
...
>>> print ordered
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> data.sort()
>>> print data == ordered
True
Using a heap to insert items at the correct place in a priority queue:
>>> heap = []
>>> data = [(1, 'J'), (4, 'N'), (3, 'H'), (2, 'O')]
>>> for item in data:
...     heappush(heap, item)
...
>>> while heap:
...     print heappop(heap)[1]
J
O
H
N
The module also offers three general purpose functions based on heaps.
merge(*iterables)~
Merge multiple sorted inputs into a single sorted output (for example, merge
timestamped entries from multiple log files). Returns an iterator
over the sorted values.
Similar to ``sorted(itertools.chain(*iterables))`` but returns an iterable, does
not pull the data into memory all at once, and assumes that each of the input
streams is already sorted (smallest to largest).
.. versionadded:: 2.6
nlargest(n, iterable[, key])~
Return a list with the {n} largest elements from the dataset defined by
{iterable}. {key}, if provided, specifies a function of one argument that is
used to extract a comparison key from each element in the iterable (for
example, ``key=str.lower``). Equivalent to: ``sorted(iterable, key=key,
reverse=True)[:n]``.
.. versionadded:: 2.4
.. versionchanged:: 2.5
Added the optional {key} argument.
nsmallest(n, iterable[, key])~
Return a list with the {n} smallest elements from the dataset defined by
{iterable}. {key}, if provided, specifies a function of one argument that is
used to extract a comparison key from each element in the iterable (for
example, ``key=str.lower``). Equivalent to: ``sorted(iterable, key=key)[:n]``.
.. versionadded:: 2.4
.. versionchanged:: 2.5
Added the optional {key} argument.
The latter two functions perform best for smaller values of {n}. For larger
values, it is more efficient to use the sorted function. Also, when
``n==1``, it is more efficient to use the built-in min and max
functions.
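A short sketch of the three general purpose functions, using made-up sample
data (two already-sorted "log" lists of ``(timestamp, text)`` pairs and a list
of words):

```python
import heapq

# Each input must already be sorted, smallest to largest.
log_a = [(1, 'a: start'), (4, 'a: work'), (9, 'a: stop')]
log_b = [(2, 'b: start'), (3, 'b: work'), (10, 'b: stop')]

# merge() returns an iterator; list() consumes it here for display.
merged = list(heapq.merge(log_a, log_b))

words = ['pear', 'Fig', 'banana', 'Kiwi', 'apple']

# The key function is applied to each element for comparison only.
longest_two = heapq.nlargest(2, words, key=len)
first_two = heapq.nsmallest(2, words, key=str.lower)
```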
Theory
------
(This explanation is due to François Pinard. The Python code for this module
was contributed by Kevin O'Connor.)
Heaps are arrays for which ``a[k] <= a[2*k+1]`` and ``a[k] <= a[2*k+2]`` for all
{k}, counting elements from 0. For the sake of comparison, non-existing
elements are considered to be infinite. The interesting property of a heap is
that ``a[0]`` is always its smallest element.
The strange invariant above is meant to be an efficient memory representation
for a tournament. The numbers below are {k}, not ``a[k]``:: >
                                  0
                 1                                 2
         3               4                5               6
     7       8       9       10      11      12      13      14
   15 16   17 18   19 20   21 22   23 24   25 26   27 28   29 30
<
In the tree above, each cell {k} is topping ``2*k+1`` and ``2*k+2``. In a usual
binary tournament we see in sports, each cell is the winner over the two cells
it tops, and we can trace the winner down the tree to see all opponents s/he
had. However, in many computer applications of such tournaments, we do not need
to trace the history of a winner. To be more memory efficient, when a winner is
promoted, we try to replace it by something else at a lower level, and the rule
becomes that a cell and the two cells it tops contain three different items, but
the top cell "wins" over the two topped cells.
If this heap invariant is protected at all times, index 0 is clearly the overall
winner. The simplest algorithmic way to remove it and find the "next" winner is
to move some loser (let's say cell 30 in the diagram above) into the 0 position,
and then percolate this new 0 down the tree, exchanging values, until the
invariant is re-established. This is clearly logarithmic on the total number of
items in the tree. By iterating over all items, you get an O(n log n) sort.
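The removal step just described can be sketched in a few lines. This is an
illustration of the idea, not the code used by the module; the names
``sift_down``, ``pop_winner`` and ``tournament_sort`` are hypothetical.

```python
def sift_down(a, k, n):
    """Percolate a[k] down until the invariant holds within a[:n]."""
    while True:
        child = 2 * k + 1
        if child >= n:
            return
        # Pick the smaller of the two cells topped by k.
        if child + 1 < n and a[child + 1] < a[child]:
            child += 1
        if a[k] <= a[child]:
            return
        a[k], a[child] = a[child], a[k]
        k = child


def pop_winner(a):
    """Remove and return a[0], moving the last cell into position 0."""
    winner = a[0]
    last = a.pop()
    if a:
        a[0] = last
        sift_down(a, 0, len(a))
    return winner


def tournament_sort(items):
    a = list(items)
    # Establish the invariant bottom-up, then pop every winner in turn.
    for k in reversed(range(len(a) // 2)):
        sift_down(a, k, len(a))
    return [pop_winner(a) for _ in range(len(a))]
```

Each call to ``pop_winner`` walks at most one path from the root to a leaf,
which is where the logarithmic cost per item comes from.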
A nice feature of this sort is that you can efficiently insert new items while
the sort is going on, provided that the inserted items are not "better" than the
last 0'th element you extracted. This is especially useful in simulation
contexts, where the tree holds all incoming events, and the "win" condition
means the smallest scheduled time. When an event schedules other events for
execution, they are scheduled into the future, so they can easily go into the
heap. So, a heap is a good structure for implementing schedulers (this is what
I used for my MIDI sequencer :-).
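A tiny scheduler along these lines might look as follows. Everything in the
sketch is made up for illustration: the event names, the ``follow_ups`` table
of ``(delay, name)`` pairs, and the ``simulate`` helper.

```python
import heapq

# Hypothetical rules: when an event fires, it schedules these
# (delay, name) pairs relative to its own time.
follow_ups = {
    'boot': [(5, 'load'), (1, 'log')],
    'load': [(2, 'log')],
}


def simulate(initial):
    """Fire events in time order; return the list of (time, name)."""
    queue = list(initial)
    heapq.heapify(queue)
    fired = []
    while queue:
        now, name = heapq.heappop(queue)   # smallest scheduled time wins
        fired.append((now, name))
        for delay, nxt in follow_ups.get(name, []):
            # New events always lie in the future, so they fit the heap.
            heapq.heappush(queue, (now + delay, nxt))
    return fired


timeline = simulate([(0, 'boot')])
```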
Various structures for implementing schedulers have been extensively studied,
and heaps are good for this, as they are reasonably speedy, the speed is almost
constant, and the worst case is not much different than the average case.
However, there are other representations which are more efficient overall, yet
the worst cases might be terrible.
Heaps are also very useful in big disk sorts. You most probably all know that a
big sort implies producing "runs" (which are pre-sorted sequences, whose size is
usually related to the amount of CPU memory), followed by merging passes for
these runs, which merging is often very cleverly organised [#]_. It is very
important that the initial sort produces the longest runs possible. Tournaments
are a good way to achieve that. If, using all the memory available to hold a
tournament, you replace and percolate items that happen to fit the current run,
you'll produce runs which are twice the size of the memory for random input, and
much better for input fuzzily ordered.
Moreover, if you output the 0'th item on disk and get an input which may not fit
in the current tournament (because the value "wins" over the last output value),
it cannot fit in the heap, so the size of the heap decreases. The freed memory
could be cleverly reused immediately for progressively building a second heap,
which grows at exactly the same rate the first heap is melting. When the first
heap completely vanishes, you switch heaps and start a new run. Clever and
quite effective!
In a word, heaps are useful memory structures to know. I use them in a few
applications, and I think it is good to keep a 'heap' module around. :-)
.. rubric:: Footnotes
.. [#] The disk balancing algorithms which are current, nowadays, are more annoying
than clever, and this is a consequence of the seeking capabilities of the disks.
On devices which cannot seek, like big tape drives, the story was quite
different, and one had to be very clever to ensure (far in advance) that each
tape movement will be the most effective possible (that is, will best
participate in "progressing" the merge). Some tapes were even able to read
backwards, and this was also used to avoid the rewinding time. Believe me, real
good tape sorts were quite spectacular to watch! From all times, sorting has
always been a Great Art! :-)
==============================================================================
*py2stdlib-hmac*
hmac~
:synopsis: Keyed-Hashing for Message Authentication (HMAC) implementation for Python.
.. versionadded:: 2.2
This module implements the HMAC algorithm as described by RFC 2104.
new(key[, msg[, digestmod]])~
Return a new hmac object. If {msg} is present, the method call ``update(msg)``
is made. {digestmod} is the digest constructor or module for the HMAC object to
use. It defaults to the hashlib.md5 constructor.
.. note:: >
The md5 hash has known weaknesses but remains the default for backwards
compatibility. Choose a better one for your application.
<
An HMAC object has the following methods:
hmac.update(msg)~
Update the hmac object with the string {msg}. Repeated calls are equivalent to
a single call with the concatenation of all the arguments: ``m.update(a);
m.update(b)`` is equivalent to ``m.update(a + b)``.
hmac.digest()~
Return the digest of the strings passed to the update method so far.
This string will be the same length as the {digest_size} of the digest given to
the constructor. It may contain non-ASCII characters, including NUL bytes.
hmac.hexdigest()~
Like digest except the digest is returned as a string twice the length
containing only hexadecimal digits. This may be used to exchange the value
safely in email or other non-binary environments.
hmac.copy()~
Return a copy ("clone") of the hmac object. This can be used to efficiently
compute the digests of strings that share a common initial substring.
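A short sketch of the calls described above. The key and messages are
placeholders, and hashlib.sha256 is chosen here only as one example of a
digest other than the md5 default; byte-string literals are used so that the
same code should also run on Python 3.

```python
import hashlib
import hmac

key = b"secret-key"   # placeholder key, for illustration only

# Pass digestmod explicitly instead of relying on the md5 default.
whole = hmac.new(key, b"hello world", hashlib.sha256)

# Repeated update() calls are equivalent to one call on the concatenation.
pieces = hmac.new(key, digestmod=hashlib.sha256)
pieces.update(b"hello ")
pieces.update(b"world")

# copy() reuses the work already done for a common initial substring.
prefix = hmac.new(key, b"hello ", hashlib.sha256)
branch = prefix.copy()
branch.update(b"world")

same = whole.digest() == pieces.digest() == branch.digest()
```

The three objects end up with the same digest because they have been fed the
same key and the same overall message.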
.. seealso::
Module hashlib (|py2stdlib-hashlib|)
The Python module providing secure hash functions.
==============================================================================
*py2stdlib-hotshot*
hotshot~
:synopsis: High performance logging profiler, mostly written in C.
.. versionadded:: 2.2
This module provides a nicer interface to the _hotshot C module. Hotshot
is a replacement for the existing profile (|py2stdlib-profile|) module. As it's written mostly
in C, it should result in a much smaller performance impact than the existing
profile (|py2stdlib-profile|) module.
.. note::
The hotshot (|py2stdlib-hotshot|) module focuses on minimizing the overhead while profiling, at
the expense of long data post-processing times. For common usage it is
recommended to use cProfile (|py2stdlib-cprofile|) instead. hotshot (|py2stdlib-hotshot|) is not maintained and
might be removed from the standard library in the future.
.. versionchanged:: 2.5
The results should be more meaningful than in the past: the timing core
contained a critical bug.
.. note::
The hotshot (|py2stdlib-hotshot|) profiler does not yet work well with threads. It is useful to
use an unthreaded script to run the profiler over the code you're interested in
measuring if at all possible.
Profile(logfile[, lineevents[, linetimings]])~
The profiler object. The argument {logfile} is the name of a log file to use for
logged profile data. The argument {lineevents} specifies whether to generate
events for every source line, or just on function call/return. It defaults to
``0`` (only log function call/return). The argument {linetimings} specifies
whether to record timing information. It defaults to ``1`` (store timing
information).
Profile Objects
---------------
Profile objects have the following methods:
Profile.addinfo(key, value)~
Add an arbitrary labelled value to the profile output.
Profile.close()~
Close the logfile and terminate the profiler.
Profile.fileno()~
Return the file descriptor of the profiler's log file.
Profile.run(cmd)~
Profile an exec-compatible string in the script environment. The
globals from the __main__ (|py2stdlib-__main__|) module are used as both the globals and locals
for the script.
Profile.runcall(func, *args, **keywords)~
Profile a single call of a callable. Additional positional and keyword arguments
may be passed along; the result of the call is returned, and exceptions are
allowed to propagate cleanly, while ensuring that profiling is disabled on the
way out.
Profile.runctx(cmd, globals, locals)~
Evaluate an exec-compatible string in a specific environment. The
string is compiled before profiling begins.
Profile.start()~
Start the profiler.
Profile.stop()~
Stop the profiler.
Using hotshot data
------------------
==============================================================================
*py2stdlib-hotshot.stats*
hotshot.stats~
:synopsis: Statistical analysis for Hotshot
.. versionadded:: 2.2
This module loads hotshot profiling data into the standard pstats (|py2stdlib-pstats|) Stats
objects.
load(filename)~
Load hotshot data from {filename}. Returns an instance of the
pstats.Stats class.
.. seealso::
Module profile (|py2stdlib-profile|)
The profile (|py2stdlib-profile|) module's Stats class
Example Usage
-------------
Note that this example runs the Python "benchmark" pystones. It can take some
time to run, and will produce large output files. :: >
>>> import hotshot, hotshot.stats, test.pystone
>>> prof = hotshot.Profile("stones.prof")
>>> benchtime, stones = prof.runcall(test.pystone.pystones)
>>> prof.close()
>>> stats = hotshot.stats.load("stones.prof")
>>> stats.strip_dirs()
>>> stats.sort_stats('time', 'calls')
>>> stats.print_stats(20)
         850004 function calls in 10.090 CPU seconds
   Ordered by: internal time, call count
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    3.295    3.295   10.090   10.090 pystone.py:79(Proc0)
   150000    1.315    0.000    1.315    0.000 pystone.py:203(Proc7)
    50000    1.313    0.000    1.463    0.000 pystone.py:229(Func2)
.
.
.
==============================================================================
*py2stdlib-htmllib*
htmllib~
:synopsis: A parser for HTML documents.
.. deprecated:: 2.6
The htmllib (|py2stdlib-htmllib|) module has been removed in Python 3.0.
.. index::
single: HTML
single: hypertext
.. index::
module: sgmllib
module: formatter
single: SGMLParser (in module sgmllib)
This module defines a class which can serve as a base for parsing text files
formatted in the HyperText Mark-up Language (HTML). The class is not directly
concerned with I/O --- it must be provided with input in string form via a
method, and makes calls to methods of a "formatter" object in order to produce
output. The HTMLParser (|py2stdlib-htmlparser|) class is designed to be used as a base class
for other classes in order to add functionality, and allows most of its methods
to be extended or overridden. In turn, this class is derived from and extends
the SGMLParser class defined in module sgmllib (|py2stdlib-sgmllib|). The
HTMLParser (|py2stdlib-htmlparser|) implementation supports the HTML 2.0 language as described
in RFC 1866. Two implementations of formatter objects are provided in the
formatter (|py2stdlib-formatter|) module; refer to the documentation for that module for
information on the formatter interface.
The following is a summary of the interface defined by
sgmllib.SGMLParser:
* The interface to feed data to an instance is through the feed method,
which takes a string argument. This can be called with as little or as much
text at a time as desired; ``p.feed(a); p.feed(b)`` has the same effect as
``p.feed(a+b)``. When the data contains complete HTML markup constructs, these
are processed immediately; incomplete constructs are saved in a buffer. To
force processing of all unprocessed data, call the close method.
For example, to parse the entire contents of a file, use:: >
parser.feed(open('myfile.html').read())
parser.close()
<
* The interface to define semantics for HTML tags is very simple: derive a class
and define methods called start_tag, end_tag, or do_tag.
The parser will call these at appropriate moments: start_tag or
do_tag is called when an opening tag of the form ``<tag ...>`` is
encountered; end_tag is called when a closing tag of the form ``</tag>``
is encountered. If an opening tag requires a corresponding closing tag, like
``<H1>`` ... ``</H1>``, the class should define the start_tag method; if
a tag requires no closing tag, like ``<P>``, the class should define the
do_tag method.
The module defines a parser class and an exception:
HTMLParser(formatter)~
This is the basic HTML parser class. It supports all entity names required by
the XHTML 1.0 Recommendation (http://www.w3.org/TR/xhtml1). It also defines
handlers for all HTML 2.0 and many HTML 3.0 and 3.2 elements.
HTMLParseError~
Exception raised by the HTMLParser (|py2stdlib-htmlparser|) class when it encounters an error
while parsing.
.. versionadded:: 2.4
.. seealso::
Module formatter (|py2stdlib-formatter|)
Interface definition for transforming an abstract flow of formatting events into
specific output events on writer objects.
Module HTMLParser (|py2stdlib-htmlparser|)
Alternate HTML parser that offers a slightly lower-level view of the input, but
is designed to work with XHTML, and does not implement some of the SGML syntax
not used in "HTML as deployed" and which isn't legal for XHTML.
Module htmlentitydefs (|py2stdlib-htmlentitydefs|)
Definition of replacement text for XHTML 1.0 entities.
Module sgmllib (|py2stdlib-sgmllib|)
Base class for HTMLParser (|py2stdlib-htmlparser|).
HTMLParser Objects
------------------
In addition to tag methods, the HTMLParser (|py2stdlib-htmlparser|) class provides some
additional methods and instance variables for use within tag methods.
HTMLParser.formatter~
This is the formatter instance associated with the parser.
HTMLParser.nofill~
Boolean flag which should be true when whitespace should not be collapsed, or
false when it should be. In general, this should only be true when character
data is to be treated as "preformatted" text, as within a ``<PRE>`` element.
The default value is false. This affects the operation of handle_data
and save_end.
HTMLParser.anchor_bgn(href, name, type)~
This method is called at the start of an anchor region. The arguments
correspond to the attributes of the ``<A>`` tag with the same names. The
default implementation maintains a list of hyperlinks (defined by the ``HREF``
attribute for ``<A>`` tags) within the document. The list of hyperlinks is
available as the data attribute anchorlist.
HTMLParser.anchor_end()~
This method is called at the end of an anchor region. The default
implementation adds a textual footnote marker using an index into the list of
hyperlinks created by anchor_bgn.
HTMLParser.handle_image(source, alt[, ismap[, align[, width[, height]]]])~
This method is called to handle images. The default implementation simply
passes the {alt} value to the handle_data method.
HTMLParser.save_bgn()~
Begins saving character data in a buffer instead of sending it to the formatter
object. Retrieve the stored data via save_end. Use of the
save_bgn / save_end pair may not be nested.
HTMLParser.save_end()~
Ends buffering character data and returns all data saved since the preceding
call to save_bgn. If the nofill flag is false, whitespace is
collapsed to single spaces. A call to this method without a preceding call to
save_bgn will raise a TypeError exception.
==============================================================================
*py2stdlib-htmlentitydefs*
htmlentitydefs~
:synopsis: Definitions of HTML general entities.
.. note::
The htmlentitydefs (|py2stdlib-htmlentitydefs|) module has been renamed to html.entities in
Python 3.0. The 2to3 tool will automatically adapt imports when
converting your sources to 3.0.
This module defines three dictionaries, ``name2codepoint``, ``codepoint2name``,
and ``entitydefs``. ``entitydefs`` is used by the htmllib (|py2stdlib-htmllib|) module to
provide the entitydefs member of the HTMLParser (|py2stdlib-htmlparser|) class. The
definition provided here contains all the entities defined by XHTML 1.0 that
can be handled using simple textual substitution in the Latin-1 character set
(ISO-8859-1).
entitydefs~
A dictionary mapping XHTML 1.0 entity definitions to their replacement text in
ISO Latin-1.
name2codepoint~
A dictionary that maps HTML entity names to the Unicode codepoints.
.. versionadded:: 2.3
codepoint2name~
A dictionary that maps Unicode codepoints to HTML entity names.
.. versionadded:: 2.3
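A brief sketch of how the two codepoint dictionaries relate. The helper
``encode_named`` is hypothetical, and the fallback import uses the
html.entities name mentioned in the note above so the snippet should also run
on Python 3.

```python
try:
    from htmlentitydefs import codepoint2name, name2codepoint   # Python 2
except ImportError:
    from html.entities import codepoint2name, name2codepoint    # Python 3


def encode_named(text):
    """Replace each character that has a named entity with ``&name;``."""
    parts = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        parts.append("&%s;" % name if name is not None else ch)
    return "".join(parts)


amp_codepoint = name2codepoint["amp"]      # entity name -> Unicode codepoint
amp_name = codepoint2name[amp_codepoint]   # and back again
escaped = encode_named("a < b & c")
```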
==============================================================================
*py2stdlib-htmlparser*
HTMLParser~
:synopsis: A simple parser that can handle HTML and XHTML.
.. note::
The HTMLParser (|py2stdlib-htmlparser|) module has been renamed to html.parser in Python
3.0. The 2to3 tool will automatically adapt imports when converting
your sources to 3.0.
.. versionadded:: 2.2
.. index::
single: HTML
single: XHTML
This module defines a class HTMLParser (|py2stdlib-htmlparser|) which serves as the basis for
parsing text files formatted in HTML (HyperText Mark-up Language) and XHTML.
Unlike the parser in htmllib (|py2stdlib-htmllib|), this parser is not based on the SGML parser
in sgmllib (|py2stdlib-sgmllib|).
HTMLParser()~
The HTMLParser (|py2stdlib-htmlparser|) class is instantiated without arguments.
An HTMLParser (|py2stdlib-htmlparser|) instance is fed HTML data and calls handler functions when tags
begin and end. The HTMLParser (|py2stdlib-htmlparser|) class is meant to be overridden by the
user to provide a desired behavior.
Unlike the parser in htmllib (|py2stdlib-htmllib|), this parser does not check that end tags
match start tags or call the end-tag handler for elements which are closed
implicitly by closing an outer element.
An exception is defined as well:
HTMLParseError~
Exception raised by the HTMLParser (|py2stdlib-htmlparser|) class when it encounters an error
while parsing. This exception provides three attributes: msg is a brief
message explaining the error, lineno is the number of the line on which
the broken construct was detected, and offset is the number of
characters into the line at which the construct starts.
HTMLParser (|py2stdlib-htmlparser|) instances have the following methods:
HTMLParser.reset()~
Reset the instance. Loses all unprocessed data. This is called implicitly at
instantiation time.
HTMLParser.feed(data)~
Feed some text to the parser. It is processed insofar as it consists of
complete elements; incomplete data is buffered until more data is fed or
close is called.
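The buffering of incomplete data can be observed with a small subclass. This
is a sketch only: ``TagLogger`` and its ``seen`` attribute are hypothetical
names, and the fallback import uses the html.parser name from the note at the
top of this section.

```python
try:
    from HTMLParser import HTMLParser      # Python 2
except ImportError:
    from html.parser import HTMLParser     # Python 3


class TagLogger(HTMLParser):
    """Hypothetical subclass that records the start tags it is told about."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append(tag)


parser = TagLogger()
parser.feed('<a hr')                        # incomplete start tag: buffered
seen_after_first_chunk = list(parser.seen)
parser.feed('ef="x">link</a>')              # the tag is complete now
parser.close()
```

Nothing is reported after the first call because the start tag is not yet
complete; the handler only runs once the rest of the tag has been fed.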
HTMLParser.close()~
Force processing of all buffered data as if it were followed by an end-of-file
mark. This method may be redefined by a derived class to define additional
processing at the end of the input, but the redefined version should always call
the HTMLParser (|py2stdlib-htmlparser|) base class method close.
HTMLParser.getpos()~
Return current line number and offset.
HTMLParser.get_starttag_text()~
Return the text of the most recently opened start tag. This should not normally
be needed for structured processing, but may be useful in dealing with HTML "as
deployed" or for re-generating input with minimal changes (whitespace between
attributes can be preserved, etc.).
HTMLParser.handle_starttag(tag, attrs)~
This method is called to handle the start of a tag. It is intended to be
overridden by a derived class; the base class implementation does nothing.
The {tag} argument is the name of the tag converted to lower case. The {attrs}
argument is a list of ``(name, value)`` pairs containing the attributes found
inside the tag's ``<>`` brackets. The {name} will be translated to lower case,
and quotes in the {value} have been removed, and character and entity references
have been replaced. For instance, for the tag ``<A
HREF="http://www.cwi.nl/">``, this method would be called as
``handle_starttag('a', [('href', 'http://www.cwi.nl/')])``.
.. versionchanged:: 2.6
All entity references from htmlentitydefs (|py2stdlib-htmlentitydefs|) are now replaced in the attribute
values.
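As an illustration of the {attrs} argument, the sketch below collects the
``HREF`` values of ``<A>`` tags. The class name ``LinkCollector`` and its
``links`` attribute are assumptions made for the example.

```python
try:
    from HTMLParser import HTMLParser      # Python 2
except ImportError:
    from html.parser import HTMLParser     # Python 3


class LinkCollector(HTMLParser):
    """Hypothetical subclass gathering the href of every <a> start tag."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # tag and attribute names arrive in lower case, values unquoted.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


collector = LinkCollector()
collector.feed('<A HREF="http://www.cwi.nl/">CWI</A>')
collector.close()
```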
HTMLParser.handle_startendtag(tag, attrs)~
Similar to handle_starttag, but called when the parser encounters an
XHTML-style empty tag (``<a .../>``). This method may be overridden by
subclasses which require this particular lexical information; the default
implementation simply calls handle_starttag and handle_endtag.
HTMLParser.handle_endtag(tag)~
This method is called to handle the end tag of an element. It is intended to be
overridden by a derived class; the base class implementation does nothing. The
{tag} argument is the name of the tag converted to lower case.
HTMLParser.handle_data(data)~
This method is called to process arbitrary data. It is intended to be
overridden by a derived class; the base class implementation does nothing.
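A minimal sketch of overriding handle_data to pull the text out of a document.
``TextExtractor`` and its ``text`` method are hypothetical names; the input
deliberately contains no character or entity references.

```python
try:
    from HTMLParser import HTMLParser      # Python 2
except ImportError:
    from html.parser import HTMLParser     # Python 3


class TextExtractor(HTMLParser):
    """Hypothetical subclass that keeps character data and drops the tags."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.chunks = []

    def handle_data(self, data):
        # Data may arrive in several pieces, so collect and join later.
        self.chunks.append(data)

    def text(self):
        return "".join(self.chunks)


extractor = TextExtractor()
extractor.feed("<p>Hello <b>world</b></p>")
extractor.close()
plain = extractor.text()
```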
HTMLParser.handle_charref(name)~
This method is called to process a character reference of the form ``&#ref;``.
It is intended to be overridden by a derived class; the base class
implementation does nothing.
HTMLParser.handle_entityref(name)~
This method is called to process a general entity reference of the form
``&name;`` where {name} is a general entity reference. It is intended to be
overridden by a derived class; the base class implementation does nothing.
HTMLParser.handle_comment(data)~
This method is called when a comment is encountered. The {data} argument is
a string containing the text between the ``--`` and ``--`` delimiters, but not
the delimiters themselves. For example, the comment ``<!--text-->`` will cause
this method to be called with the argument ``'text'``. It is intended to be
overridden by a derived class; the base class implementation does nothing.
HTMLParser.handle_decl(decl)~
Method called when an SGML declaration is read by the parser. The {decl}
parameter will be the entire contents of the declaration inside the ``<!...>``
markup. It is intended to be overridden by a derived class; the base
class implementation does nothing.
HTMLParser.handle_pi(data)~
Method called when a processing instruction is encountered. The {data}
parameter will contain the entire processing instruction. For example, for the
processing instruction ``<?proc color='red'>``, this method would be called as
``handle_pi("proc color='red'")``. It is intended to be overridden by a derived
class; the base class implementation does nothing.
.. note:: >
The HTMLParser (|py2stdlib-htmlparser|) class uses the SGML syntactic rules for processing
instructions. An XHTML processing instruction using the trailing ``'?'`` will
cause the ``'?'`` to be included in {data}.
<
Example HTML Parser Application
-------------------------------
As a basic example, below is a simple HTML parser that uses the
HTMLParser (|py2stdlib-htmlparser|) class to print out tags as they are encountered:: >
    from HTMLParser import HTMLParser

    class MyHTMLParser(HTMLParser):
        def handle_starttag(self, tag, attrs):
            print "Encountered the beginning of a %s tag" % tag

        def handle_endtag(self, tag):
            print "Encountered the end of a %s tag" % tag
<
==============================================================================
*py2stdlib-httplib*
httplib~
:synopsis: HTTP and HTTPS protocol client (requires sockets).
.. note::
The httplib (|py2stdlib-httplib|) module has been renamed to http.client in Python
3.0. The 2to3 tool will automatically adapt imports when converting
your sources to 3.0.
.. index::
pair: HTTP; protocol
single: HTTP; httplib (standard module)
.. index:: module: urllib
This module defines classes which implement the client side of the HTTP and
HTTPS protocols. It is normally not used directly --- the module urllib (|py2stdlib-urllib|)
uses it to handle URLs that use HTTP and HTTPS.
.. note::
HTTPS support is only available if the socket (|py2stdlib-socket|) module was compiled with
SSL support.
.. note::
The public interface for this module changed substantially in Python 2.0. The
HTTP class is retained only for backward compatibility with 1.5.2. It
should not be used in new code. Refer to the online docstrings for usage.
The module provides the following classes:
HTTPConnection(host[, port[, strict[, timeout[, source_address]]]])~
An HTTPConnection instance represents one transaction with an HTTP
server. It should be instantiated passing it a host and optional port
number. If no port number is passed, the port is extracted from the host
string if it has the form ``host:port``, else the default HTTP port (80) is
used. When True, the optional parameter {strict} (which defaults to a false
value) causes ``BadStatusLine`` to be raised if the status line can't be
parsed as a valid HTTP/1.0 or 1.1 status line. If the optional {timeout}
parameter is given, blocking operations (like connection attempts) will time
out after that many seconds (if it is not given, the global default timeout
setting is used). The optional {source_address} parameter may be a
(host, port) tuple to use as the source address the HTTP connection is made
from.
For example, the following calls all create instances that connect to the server
at the same host and port:: >
>>> h1 = httplib.HTTPConnection('www.cwi.nl')
>>> h2 = httplib.HTTPConnection('www.cwi.nl:80')
>>> h3 = httplib.HTTPConnection('www.cwi.nl', 80)
>>> h4 = httplib.HTTPConnection('www.cwi.nl', 80, timeout=10)
<
.. versionadded:: 2.0
.. versionchanged:: 2.6
{timeout} was added.
.. versionchanged:: 2.7
{source_address} was added.
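The host and port rules above can be checked without any network traffic,
since constructing the object does not open a connection. The sketch assumes
the ``host`` and ``port`` instance attributes, which CPython sets in the
constructor but which this section does not document; the fallback import uses
the http.client name from the note at the top of this section.

```python
try:
    import httplib                       # Python 2
except ImportError:
    import http.client as httplib        # Python 3 name of the module

# Three spellings of the same target; nothing is sent at this point.
h1 = httplib.HTTPConnection('www.cwi.nl')
h2 = httplib.HTTPConnection('www.cwi.nl:80')
h3 = httplib.HTTPConnection('www.cwi.nl', 80)

# host and port are implementation attributes (an assumption, see above).
targets = [(h.host, h.port) for h in (h1, h2, h3)]
all_same = len(set(targets)) == 1
```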
HTTPSConnection(host[, port[, key_file[, cert_file[, strict[, timeout[, source_address]]]]]])~
A subclass of HTTPConnection that uses SSL for communication with
secure servers. Default port is ``443``. {key_file} is the name of a PEM
formatted file that contains your private key. {cert_file} is a PEM formatted
certificate chain file.
.. note:: >
This does not do any certificate verification.
<
.. versionadded:: 2.0
.. versionchanged:: 2.6
{timeout} was added.
.. versionchanged:: 2.7
{source_address} was added.
HTTPResponse(sock[, debuglevel=0][, strict=0])~
Class whose instances are returned upon successful connection. Not instantiated
directly by user.
.. versionadded:: 2.0
HTTPMessage~
An HTTPMessage instance is used to hold the headers from an HTTP
response. It is implemented using the mimetools.Message class and
provides utility functions to deal with HTTP Headers. It is not directly
instantiated by the users.
The following exceptions are raised as appropriate:
HTTPException~
The base class of the other exceptions in this module. It is a subclass of
Exception.
.. versionadded:: 2.0
NotConnected~
A subclass of HTTPException.
.. versionadded:: 2.0
InvalidURL~
A subclass of HTTPException, raised if a port is given and is either
non-numeric or empty.
.. versionadded:: 2.3
UnknownProtocol~
A subclass of HTTPException.
.. versionadded:: 2.0
UnknownTransferEncoding~
A subclass of HTTPException.
.. versionadded:: 2.0
UnimplementedFileMode~
A subclass of HTTPException.
.. versionadded:: 2.0
IncompleteRead~
A subclass of HTTPException.
.. versionadded:: 2.0
ImproperConnectionState~
A subclass of HTTPException.
.. versionadded:: 2.0
CannotSendRequest~
A subclass of ImproperConnectionState.
.. versionadded:: 2.0
CannotSendHeader~
A subclass of ImproperConnectionState.
.. versionadded:: 2.0
ResponseNotReady~
A subclass of ImproperConnectionState.
.. versionadded:: 2.0
BadStatusLine~
A subclass of HTTPException. Raised if a server responds with an HTTP
status code that we don't understand.
.. versionadded:: 2.0
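Because the exceptions above share a base class, a single ``except`` clause on
HTTPException covers all of them. The sketch below triggers InvalidURL with a
non-numeric port, which needs no network access; the helper name ``classify``
is hypothetical.

```python
try:
    import httplib                       # Python 2
except ImportError:
    import http.client as httplib        # Python 3 name of the module


def classify(host):
    """Return 'ok' or the name of the httplib exception the host triggers."""
    try:
        httplib.HTTPConnection(host)     # the port is parsed here, no I/O
    except httplib.HTTPException as exc:
        return type(exc).__name__
    return "ok"


bad = classify('www.cwi.nl:http')        # non-numeric port
good = classify('www.cwi.nl:8080')

# The hierarchy described in the list above.
hierarchy_ok = (
    issubclass(httplib.InvalidURL, httplib.HTTPException)
    and issubclass(httplib.ImproperConnectionState, httplib.HTTPException)
    and issubclass(httplib.CannotSendRequest, httplib.ImproperConnectionState)
    and issubclass(httplib.HTTPException, Exception)
)
```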
The constants defined in this module are:
HTTP_PORT~
The default port for the HTTP protocol (always ``80``).
HTTPS_PORT~
The default port for the HTTPS protocol (always ``443``).
and also the following constants for integer status codes:
+------------------------------------------+---------+-----------------------------------------------------------------------+
| Constant | Value | Definition |
+==========================================+=========+=======================================================================+
| CONTINUE | ``100`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.1.1 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.1.1>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| SWITCHING_PROTOCOLS | ``101`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.1.2 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.1.2>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| PROCESSING | ``102`` | WEBDAV, `RFC 2518, Section 10.1 |
| | | <http://www.webdav.org/specs/rfc2518.html#STATUS_102>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| OK | ``200`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.2.1 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| CREATED | ``201`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.2.2 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.2>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| ACCEPTED | ``202`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.2.3 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.3>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| NON_AUTHORITATIVE_INFORMATION | ``203`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.2.4 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.4>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| NO_CONTENT | ``204`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.2.5 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| RESET_CONTENT | ``205`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.2.6 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.6>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| PARTIAL_CONTENT | ``206`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.2.7 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.7>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| MULTI_STATUS | ``207`` | WEBDAV `RFC 2518, Section 10.2 |
| | | <http://www.webdav.org/specs/rfc2518.html#STATUS_207>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| IM_USED | ``226`` | Delta encoding in HTTP, |
| | | RFC 3229, Section 10.4.1 |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| MULTIPLE_CHOICES | ``300`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.3.1 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.1>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| MOVED_PERMANENTLY | ``301`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.3.2 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.2>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| FOUND | ``302`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.3.3 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.3>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| SEE_OTHER | ``303`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.3.4 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.4>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| NOT_MODIFIED | ``304`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.3.5 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.5>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| USE_PROXY | ``305`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.3.6 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.6>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| TEMPORARY_REDIRECT | ``307`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.3.8 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.8>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| BAD_REQUEST | ``400`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.1 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| UNAUTHORIZED | ``401`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.2 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| PAYMENT_REQUIRED | ``402`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.3 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.3>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| FORBIDDEN | ``403`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.4 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| NOT_FOUND | ``404`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.5 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.5>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| METHOD_NOT_ALLOWED | ``405`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.6 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.6>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| NOT_ACCEPTABLE | ``406`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.7 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.7>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| PROXY_AUTHENTICATION_REQUIRED | ``407`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.8 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.8>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| REQUEST_TIMEOUT | ``408`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.9 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.9>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| CONFLICT | ``409`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.10 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.10>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| GONE | ``410`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.11 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.11>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| LENGTH_REQUIRED | ``411`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.12 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.12>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| PRECONDITION_FAILED | ``412`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.13 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.13>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| REQUEST_ENTITY_TOO_LARGE | ``413`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.14 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.14>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| REQUEST_URI_TOO_LONG | ``414`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.15 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.15>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| UNSUPPORTED_MEDIA_TYPE | ``415`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.16 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.16>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| REQUESTED_RANGE_NOT_SATISFIABLE | ``416`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.17 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.17>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| EXPECTATION_FAILED | ``417`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.4.18 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.18>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| UNPROCESSABLE_ENTITY | ``422`` | WEBDAV, `RFC 2518, Section 10.3 |
| | | <http://www.webdav.org/specs/rfc2518.html#STATUS_422>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| LOCKED | ``423`` | WEBDAV, `RFC 2518, Section 10.4 |
| | | <http://www.webdav.org/specs/rfc2518.html#STATUS_423>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| FAILED_DEPENDENCY | ``424`` | WEBDAV, `RFC 2518, Section 10.5 |
| | | <http://www.webdav.org/specs/rfc2518.html#STATUS_424>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| UPGRADE_REQUIRED | ``426`` | HTTP Upgrade to TLS, |
| | | RFC 2817, Section 6 |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| INTERNAL_SERVER_ERROR | ``500`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.5.1 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.1>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| NOT_IMPLEMENTED | ``501`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.5.2 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.2>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| BAD_GATEWAY | ``502`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.5.3 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.3>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| SERVICE_UNAVAILABLE | ``503`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.5.4 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.4>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| GATEWAY_TIMEOUT | ``504`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.5.5 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.5>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| HTTP_VERSION_NOT_SUPPORTED | ``505`` | HTTP/1.1, `RFC 2616, Section |
| | | 10.5.6 |
| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.6>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| INSUFFICIENT_STORAGE | ``507`` | WEBDAV, `RFC 2518, Section 10.6 |
| | | <http://www.webdav.org/specs/rfc2518.html#STATUS_507>`_ |
+------------------------------------------+---------+-----------------------------------------------------------------------+
| NOT_EXTENDED | ``510`` | An HTTP Extension Framework, |
| | | RFC 2774, Section 7 |
+------------------------------------------+---------+-----------------------------------------------------------------------+
responses~
This dictionary maps the HTTP 1.1 status codes to the W3C names.
Example: ``httplib.responses[httplib.NOT_FOUND]`` is ``'Not Found'``.
.. versionadded:: 2.5
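As a small illustration of the ``responses`` dictionary, the sketch below turns a
numeric status code into a printable status string. The helper name ``describe``
and the ``http.client`` import fallback (for interpreters where ``httplib`` is not
available) are assumptions made for this example: >

```python
try:
    import httplib                   # Python 2 name, as used in this document
except ImportError:
    import http.client as httplib    # assumption: same module under Python 3


def describe(code):
    """Return 'CODE Reason' for a status code, or 'CODE Unknown'."""
    # responses maps integer status codes to their reason phrases.
    return "%d %s" % (code, httplib.responses.get(code, "Unknown"))


summary = describe(httplib.NOT_FOUND)
```
<
Using ``dict.get`` with a fallback avoids a ``KeyError`` for codes that are not
in the table.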
HTTPConnection Objects
----------------------
HTTPConnection instances have the following methods:
HTTPConnection.request(method, url[, body[, headers]])~
This will send a request to the server using the HTTP request method {method}
and the selector {url}. If the {body} argument is present, it should be a
string of data to send after the headers are finished. Alternatively, it may
be an open file object, in which case the contents of the file are sent; this
file object should support ``fileno()`` and ``read()`` methods. The header
Content-Length is automatically set to the correct value. The {headers}
argument should be a mapping of extra HTTP headers to send with the request.
.. versionchanged:: 2.6
{body} can be a file object.
HTTPConnection.getresponse()~
Should be called after a request is sent to get the response from the server.
Returns an HTTPResponse instance.
.. note:: >
Note that you must have read the whole response before you can send a new
request to the server.
<
HTTPConnection.set_debuglevel(level)~
Set the debugging level (the amount of debugging output printed). The default
debug level is ``0``, meaning no debugging output is printed.
HTTPConnection.set_tunnel(host, port=None, headers=None)~
Set the host and the port for HTTP CONNECT tunnelling. Normally used when
an HTTPS connection must be made through a proxy server.
The headers argument should be a mapping of extra HTTP headers to send
with the CONNECT request.
.. versionadded:: 2.7
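A minimal sketch of how ``set_tunnel`` might be used: the connection object is
created for the proxy, and the tunnel target is the server you actually want to
reach. The function name and the host names in the usage note are hypothetical,
and nothing is sent until a request is made: >

```python
try:
    import httplib                   # Python 2 name, as used in this document
except ImportError:
    import http.client as httplib    # assumption: same module under Python 3


def tunnelled_connection(proxy_host, proxy_port, target_host, target_port=443):
    """Prepare an HTTPS connection to target_host routed through a proxy."""
    # The connection itself is opened to the proxy...
    conn = httplib.HTTPSConnection(proxy_host, proxy_port)
    # ...and a CONNECT request for the real target is issued on first use.
    conn.set_tunnel(target_host, target_port)
    return conn
```
<
For example, ``tunnelled_connection("proxy.example.com", 3128, "www.python.org")``
returns a connection on which ``request("HEAD", "/")`` would be sent to
www.python.org through the proxy.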
HTTPConnection.connect()~
Connect to the server specified when the object was created.
HTTPConnection.close()~
Close the connection to the server.
As an alternative to using the request method described above, you can
also send your request step by step, by using the four functions below.
HTTPConnection.putrequest(request, selector[, skip_host[, skip_accept_encoding]])~
This should be the first call after the connection to the server has been made.
It sends a line to the server consisting of the {request} string, the {selector}
string, and the HTTP version (``HTTP/1.1``). To disable automatic sending of
``Host:`` or ``Accept-Encoding:`` headers (for example to accept additional
content encodings), specify {skip_host} or {skip_accept_encoding} with non-False
values.
.. versionchanged:: 2.4
{skip_accept_encoding} argument added.
HTTPConnection.putheader(header, argument[, ...])~
Send an RFC 822-style header to the server. It sends a line to the server
consisting of the header, a colon and a space, and the first argument. If more
arguments are given, continuation lines are sent, each consisting of a tab and
an argument.
HTTPConnection.endheaders()~
Send a blank line to the server, signalling the end of the headers.
HTTPConnection.send(data)~
Send data to the server. This should be used directly only after the
endheaders method has been called and before getresponse is
called.
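The step-by-step calls can be sketched as follows. To show the bytes that would
go on the wire without needing a server, this example assigns a recording
stand-in to ``conn.sock``; that shortcut, the ``RecordingSocket`` class and the
``build_post`` helper are assumptions for illustration only. In real use you
would leave the socket alone and call ``getresponse()`` after ``send()``: >

```python
try:
    import httplib                   # Python 2 name, as used in this document
except ImportError:
    import http.client as httplib    # assumption: same module under Python 3


class RecordingSocket(object):
    """Hypothetical stand-in for a socket: stores what would be sent."""

    def __init__(self):
        self.chunks = []

    def sendall(self, data):
        self.chunks.append(data)

    def close(self):
        pass


def build_post(selector, body):
    """Send a POST step by step and return the raw bytes produced."""
    conn = httplib.HTTPConnection("example.invalid")
    conn.sock = RecordingSocket()    # assumption: bypasses connect() for the demo
    # 1. Request line; suppress the automatic Host and Accept-Encoding headers.
    conn.putrequest("POST", selector, skip_host=True, skip_accept_encoding=True)
    # 2. Headers, one call each.
    conn.putheader("Content-Type", "text/plain")
    conn.putheader("Content-Length", str(len(body)))
    # 3. Blank line that ends the header block.
    conn.endheaders()
    # 4. Body, sent only after endheaders().
    conn.send(body)
    return b"".join(conn.sock.chunks)


wire = build_post("/echo", b"spam")
```
<
Unlike ``request()``, this path does not compute Content-Length for you, so the
example sets it explicitly.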
HTTPResponse Objects
--------------------
HTTPResponse instances have the following methods and attributes:
HTTPResponse.read([amt])~
Reads and returns the response body, or up to the next {amt} bytes.
HTTPResponse.getheader(name[, default])~
Get the contents of the header {name}, or {default} if there is no matching
header.
HTTPResponse.getheaders()~
Return a list of (header, value) tuples.
.. versionadded:: 2.4
HTTPResponse.msg~
A mimetools.Message instance containing the response headers.
HTTPResponse.version~
HTTP protocol version used by server. 10 for HTTP/1.0, 11 for HTTP/1.1.
HTTPResponse.status~
Status code returned by server.
HTTPResponse.reason~
Reason phrase returned by server.
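The ``version``, ``status`` and ``reason`` attributes together reconstruct the
status line sent by the server. The sketch below shows one way to format them;
``FakeResponse`` is a hypothetical stand-in with the same attribute names, used
so the example runs without a network connection: >

```python
from collections import namedtuple

# Hypothetical stand-in carrying the same attribute names as HTTPResponse.
FakeResponse = namedtuple("FakeResponse", "version status reason")


def status_line(response):
    """Format a response's version, status and reason as a status line."""
    # version is 10 for HTTP/1.0 and 11 for HTTP/1.1.
    major, minor = divmod(response.version, 10)
    return "HTTP/%d.%d %d %s" % (major, minor, response.status, response.reason)


line = status_line(FakeResponse(version=11, status=404, reason="Not Found"))
```
<
The same function should work on the object returned by ``getresponse()``,
since it only reads the three attributes described above.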
Examples
--------
Here is an example session that uses the ``GET`` method:: >
>>> import httplib
>>> conn = httplib.HTTPConnection("www.python.org")
>>> conn.request("GET", "/index.html")
>>> r1 = conn.getresponse()
>>> print r1.status, r1.reason
200 OK
>>> data1 = r1.read()
>>> conn.request("GET", "/parrot.spam")
>>> r2 = conn.getresponse()
>>> print r2.status, r2.reason
404 Not Found
>>> data2 = r2.read()
>>> conn.close()
<
Here is an example session that uses the ``HEAD`` method. Note that the
``HEAD`` method never returns any data. :: >
>>> import httplib
>>> conn = httplib.HTTPConnection("www.python.org")
>>> conn.request("HEAD","/index.html")
>>> res = conn.getresponse()
>>> print res.status, res.reason
200 OK
>>> data = res.read()
>>> print len(data)
0
>>> data == ''
True
<
Here is an example session that shows how to ``POST`` requests:: >
>>> import httplib, urllib
>>> params = urllib.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
>>> headers = {"Content-type": "application/x-www-form-urlencoded",
... "Accept": "text/plain"}
>>> conn = httplib.HTTPConnection("musi-cal.mojam.com:80")
>>> conn.request("POST", "/cgi-bin/query", params, headers)
>>> response = conn.getresponse()
>>> print response.status, response.reason
200 OK
>>> data = response.read()
>>> conn.close()
==============================================================================
*py2stdlib-ic*
ic~
:platform: Mac
:synopsis: Access to the Mac OS X Internet Config.
:deprecated:
This module provides access to various internet-related preferences set through
System Preferences or the Finder.
.. note::
This module has been removed in Python 3.x.
.. index:: module: icglue
There is a low-level companion module icglue which provides the basic
Internet Config access functionality. This low-level module is not documented,
but the docstrings of the routines document the parameters and the routine names
are the same as for the Pascal or C API to Internet Config, so the standard IC
programmers' documentation can be used if this module is needed.
The ic (|py2stdlib-ic|) module defines the error exception and symbolic names for
all error codes Internet Config can produce; see the source for details.
error~
Exception raised on errors in the ic (|py2stdlib-ic|) module.
The ic (|py2stdlib-ic|) module defines the following class and function:
IC([signature[, ic]])~
Create an Internet Config object. The signature is a 4-character creator code of
the current application (default ``'Pyth'``) which may influence some of IC's
settings. The optional {ic} argument is a low-level ``icglue.icinstance``
created beforehand; this may be useful if you want to get preferences from a
different config file, etc.
launchurl(url[, hint])~
parseurl(data[, start[, end[, hint]]])
mapfile(file)
maptypecreator(type, creator[, filename])
settypecreator(file)
These functions are "shortcuts" to the methods of the same name, described
below.
IC Objects
----------
IC objects have a mapping interface, hence to obtain the mail address
you simply get ``ic['MailAddress']``. Assignment also works, and changes the
option in the configuration file.
The module knows about various datatypes, and converts the internal IC
representation to a "logical" Python data structure. Running the ic (|py2stdlib-ic|)
module standalone will run a test program that lists all keys and values in your
IC database; this will have to serve as documentation.
If the module does not know how to represent the data it returns an instance of
the ``ICOpaqueData`` type, with the raw data in its data attribute.
Objects of this type are also acceptable values for assignment.
Besides the dictionary interface, IC objects have the following
methods:
IC.launchurl(url[, hint])~
Parse the given URL, launch the correct application and pass it the URL. The
optional {hint} can be a scheme name such as ``'mailto:'``, in which case
incomplete URLs are completed with this scheme. If {hint} is not provided,
incomplete URLs are invalid.
IC.parseurl(data[, start[, end[, hint]]])~
Find a URL somewhere in {data} and return start position, end position and the
URL. The optional {start} and {end} can be used to limit the search, so for
instance if a user clicks in a long text field you can pass the whole text field
and the click-position in {start} and this routine will return the whole URL in
which the user clicked. As above, {hint} is an optional scheme used to complete
incomplete URLs.
IC.mapfile(file)~
Return the mapping entry for the given {file}, which can be passed as either a
filename or an FSSpec result, and which need not exist.
The mapping entry is returned as a tuple ``(version, type, creator, postcreator,
flags, extension, appname, postappname, mimetype, entryname)``, where {version}
is the entry version number, {type} is the 4-character filetype, {creator} is
the 4-character creator type, {postcreator} is the 4-character creator code of
an optional application to post-process the file after downloading, {flags} are
various bits specifying whether to transfer in binary or ascii and such,
{extension} is the filename extension for this file type, {appname} is the
printable name of the application to which this file belongs, {postappname} is
the name of the postprocessing application, {mimetype} is the MIME type of this
file and {entryname} is the name of this entry.
IC.maptypecreator(type, creator[, filename])~
Return the mapping entry for files with given 4-character {type} and {creator}
codes. The optional {filename} may be specified to further help finding the
correct entry (if the creator code is ``'????'``, for instance).
The mapping entry is returned in the same format as for {mapfile}.
IC.settypecreator(file)~
Given an existing {file}, specified either as a filename or as an FSSpec
result, set its creator and type correctly based on its extension. The Finder
is told about the change, so the Finder icon will be updated quickly.
==============================================================================
*py2stdlib-imageop*
imageop~
:synopsis: Manipulate raw image data.
:deprecated:
.. deprecated:: 2.6
The imageop (|py2stdlib-imageop|) module has been removed in Python 3.0.
The imageop (|py2stdlib-imageop|) module contains some useful operations on images. It operates
on images consisting of 8 or 32 bit pixels stored in Python strings. This is
the same format as used by gl.lrectwrite and the imgfile (|py2stdlib-imgfile|) module.
The module defines the following variables and functions:
error~
This exception is raised on all errors, such as unknown number of bits per
pixel, etc.
crop(image, psize, width, height, x0, y0, x1, y1)~
Return the selected part of {image}, which should be {width} by {height} in size
and consist of pixels of {psize} bytes. {x0}, {y0}, {x1} and {y1} are like the
gl.lrectread parameters, i.e. the boundary is included in the new image.
The new boundaries need not be inside the picture. Pixels that fall outside the
old image will have their value set to zero. If {x0} is bigger than {x1} the
new image is mirrored. The same holds for the y coordinates.
scale(image, psize, width, height, newwidth, newheight)~
Return {image} scaled to size {newwidth} by {newheight}. No interpolation is
done, scaling is done by simple-minded pixel duplication or removal. Therefore,
computer-generated images or dithered images will not look nice after scaling.
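To make "pixel duplication or removal" concrete, here is a pure-Python sketch of
nearest-neighbour scaling for images with one byte per pixel. It illustrates the
idea only and is not the module's actual implementation; the function name is
hypothetical and the {psize} argument is dropped for brevity: >

```python
def scale_nearest(image, width, height, newwidth, newheight):
    """Scale a 1-byte-per-pixel image by duplicating or dropping pixels."""
    src = bytearray(image)
    out = bytearray()
    for y in range(newheight):
        sy = y * height // newheight      # source row for this output row
        for x in range(newwidth):
            sx = x * width // newwidth    # source column for this output column
            out.append(src[sy * width + sx])
    return bytes(out)


# A 2x2 image doubled to 4x4: every source pixel becomes a 2x2 block.
doubled = scale_nearest(b"\x01\x02\x03\x04", 2, 2, 4, 4)
```
<
Because each output pixel is copied from exactly one source pixel, no new
intermediate values appear, which is why dithered images scale poorly.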
tovideo(image, psize, width, height)~
Run a vertical low-pass filter over an image. It does so by computing each
destination pixel as the average of two vertically-aligned source pixels. The
main use of this routine is to forestall excessive flicker if the image is
displayed on a video device that uses interlacing, hence the name.
grey2mono(image, width, height, threshold)~
Convert an 8-bit deep greyscale image to a 1-bit deep image by thresholding all
the pixels. The resulting image is tightly packed and is probably only useful
as an argument to mono2grey.
dither2mono(image, width, height)~
Convert an 8-bit greyscale image to a 1-bit monochrome image using a
(simple-minded) dithering algorithm.
mono2grey(image, width, height, p0, p1)~
Convert a 1-bit monochrome image to an 8-bit greyscale or color image. All
pixels that are zero-valued on input get value {p0} on output and all one-valued
input pixels get value {p1} on output. To convert a monochrome black-and-white
image to greyscale pass the values ``0`` and ``255`` respectively.
grey2grey4(image, width, height)~
Convert an 8-bit greyscale image to a 4-bit greyscale image without dithering.
grey2grey2(image, width, height)~
Convert an 8-bit greyscale image to a 2-bit greyscale image without dithering.
dither2grey2(image, width, height)~
Convert an 8-bit greyscale image to a 2-bit greyscale image with dithering. As
for dither2mono, the dithering algorithm is currently very simple.
grey42grey(image, width, height)~
Convert a 4-bit greyscale image to an 8-bit greyscale image.
grey22grey(image, width, height)~
Convert a 2-bit greyscale image to an 8-bit greyscale image.
backward_compatible~
If set to 0, the functions in this module use a non-backward compatible way
of representing multi-byte pixels on little-endian systems. The SGI for
which this module was originally written is a big-endian system, so setting
this variable will have no effect. However, the code wasn't originally
intended to run on anything else, so it made assumptions about byte order
which are not universal. Setting this variable to 0 will cause the byte
order to be reversed on little-endian systems, so that it then is the same as
on big-endian systems.
==============================================================================
*py2stdlib-imaplib*
imaplib~
:synopsis: IMAP4 protocol client (requires sockets).
.. revised by ESR, January 2000
.. changes for IMAP4_SSL by Tino Lange <Tino.Lange@isg.de>, March 2002
.. changes for IMAP4_stream by Piers Lauder <piers@communitysolutions.com.au>,
November 2002
.. index::
pair: IMAP4; protocol
pair: IMAP4_SSL; protocol
pair: IMAP4_stream; protocol
This module defines three classes, IMAP4, IMAP4_SSL and
IMAP4_stream, which encapsulate a connection to an IMAP4 server and
implement a large subset of the IMAP4rev1 client protocol as defined in
RFC 2060. It is backward compatible with IMAP4 (RFC 1730) servers, but
note that the ``STATUS`` command is not supported in IMAP4.
Three classes are provided by the imaplib (|py2stdlib-imaplib|) module; IMAP4 is the
base class:
IMAP4([host[, port]])~
This class implements the actual IMAP4 protocol. The connection is created and
protocol version (IMAP4 or IMAP4rev1) is determined when the instance is
initialized. If {host} is not specified, ``''`` (the local host) is used. If
{port} is omitted, the standard IMAP4 port (143) is used.
Three exceptions are defined as attributes of the IMAP4 class:
IMAP4.error~
Exception raised on any errors. The reason for the exception is passed to the
constructor as a string.
IMAP4.abort~
IMAP4 server errors cause this exception to be raised. This is a sub-class of
IMAP4.error. Note that closing the instance and instantiating a new one
will usually allow recovery from this exception.
IMAP4.readonly~
This exception is raised when a writable mailbox has its status changed by the
server. This is a sub-class of IMAP4.error. Some other client now has
write permission, and the mailbox will need to be re-opened to re-obtain write
permission.
There's also a subclass for secure connections:
IMAP4_SSL([host[, port[, keyfile[, certfile]]]])~
This is a subclass derived from IMAP4 that connects over an SSL
encrypted socket (to use this class you need a socket module that was compiled
with SSL support). If {host} is not specified, ``''`` (the local host) is used.
If {port} is omitted, the standard IMAP4-over-SSL port (993) is used. {keyfile}
and {certfile} are also optional - they can contain a PEM formatted private key
and certificate chain file for the SSL connection.
The second subclass allows for connections created by a child process:
IMAP4_stream(command)~
This is a subclass derived from IMAP4 that connects to the
``stdin/stdout`` file descriptors created by passing {command} to
``os.popen2()``.
.. versionadded:: 2.3
The following utility functions are defined:
Internaldate2tuple(datestr)~
Converts an IMAP4 INTERNALDATE string to Coordinated Universal Time. Returns a
time (|py2stdlib-time|) module tuple.
Int2AP(num)~
Converts an integer into a string representation using characters from the set
[``A`` .. ``P``].
ParseFlags(flagstr)~
Converts an IMAP4 ``FLAGS`` response to a tuple of individual flags.
Time2Internaldate(date_time)~
Converts a time (|py2stdlib-time|) module tuple to an IMAP4 ``INTERNALDATE`` representation.
Returns a string in the form: ``"DD-Mmm-YYYY HH:MM:SS +HHMM"`` (including
double-quotes).
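A short sketch of two of these helpers. The sample response string is made up
for illustration, and byte-string literals are used so the same lines run on
interpreters where imaplib works with bytes: >

```python
import imaplib
import time

# Split a FLAGS response into individual flags.
flags = imaplib.ParseFlags(b"1 (FLAGS (\\Seen \\Answered))")

# Format a time tuple as an INTERNALDATE string, including the double quotes.
stamp = imaplib.Time2Internaldate(time.localtime(0))
```
<
The exact text of ``stamp`` depends on the local time zone, so it is not shown
here.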
Note that IMAP4 message numbers change as the mailbox changes; in particular,
after an ``EXPUNGE`` command performs deletions the remaining messages are
renumbered. So it is highly advisable to use UIDs instead, with the UID command.
At the end of the module, there is a test section that contains a more extensive
example of usage.
.. seealso::
Documents describing the protocol, and sources and binaries for servers
implementing it, can all be found at the University of Washington's *IMAP
Information Center* (http://www.washington.edu/imap/).
IMAP4 Objects
-------------
All IMAP4rev1 commands are represented by methods of the same name, either
upper-case or lower-case.
All arguments to commands are converted to strings, except for ``AUTHENTICATE``,
and the last argument to ``APPEND`` which is passed as an IMAP4 literal. If
necessary (the string contains IMAP4 protocol-sensitive characters and isn't
enclosed with either parentheses or double quotes) each string is quoted.
However, the {password} argument to the ``LOGIN`` command is always quoted. If
you want to avoid having an argument string quoted (eg: the {flags} argument to
``STORE``) then enclose the string in parentheses (eg: ``r'(\Deleted)'``).
Each command returns a tuple: ``(type, [data, ...])`` where {type} is usually
``'OK'`` or ``'NO'``, and {data} is either the text from the command response,
or mandated results from the command. Each {data} is either a string, or a
tuple. If a tuple, then the first part is the header of the response, and the
second part contains the data (ie: 'literal' value).
The {message_set} option to the commands below is a string specifying one or more
messages to be acted upon. It may be a simple message number (``'1'``), a range
of message numbers (``'2:4'``), or a group of non-contiguous ranges separated by
commas (``'1:3,6:9'``). A range can contain an asterisk to indicate an infinite
upper bound (``'3:*'``).
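Message set strings are easy to get wrong by hand, so a helper can build them
from a list of message numbers. The function below is a hypothetical convenience
written for this example and is not part of imaplib: >

```python
def make_message_set(numbers):
    """Collapse message numbers into an IMAP4 message set string."""
    ranges = []
    for n in sorted(set(numbers)):
        if ranges and n == ranges[-1][1] + 1:
            ranges[-1][1] = n            # extend the current contiguous range
        else:
            ranges.append([n, n])        # start a new range
    return ",".join(
        str(lo) if lo == hi else "%d:%d" % (lo, hi) for lo, hi in ranges
    )


message_set = make_message_set([1, 2, 3, 6, 7, 8, 9])
```
<
The result can be passed wherever a {message_set} argument is expected, for
example ``M.fetch(message_set, '(FLAGS)')``.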
An IMAP4 instance has the following methods:
IMAP4.append(mailbox, flags, date_time, message)~
Append {message} to named mailbox.
IMAP4.authenticate(mechanism, authobject)~
Authenticate command --- requires response processing.
{mechanism} specifies which authentication mechanism is to be used - it should
appear in the instance variable ``capabilities`` in the form ``AUTH=mechanism``.
{authobject} must be a callable object:: >
data = authobject(response)
<
It will be called to process server continuation responses. It should return
``data`` that will be encoded and sent to server. It should return ``None`` if
the client abort response ``*`` should be sent instead.
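As an illustration of the {authobject} protocol, the sketch below builds a
callable for the SASL ``PLAIN`` mechanism, whose single message is an empty
authorization identity, the user name and the password separated by NUL bytes.
The factory name is hypothetical, and the server is assumed to advertise
``AUTH=PLAIN``: >

```python
def plain_authobject(user, password):
    """Return a callable suitable as the authobject for AUTHENTICATE PLAIN."""
    def authobject(response):
        # response is the decoded server challenge; PLAIN does not use it.
        # imaplib encodes the returned data before sending it.
        return ("\0%s\0%s" % (user, password)).encode("utf-8")
    return authobject


auth = plain_authobject("alice", "secret")
payload = auth(b"")
```
<
With a connected instance ``M`` this would be used as
``M.authenticate('PLAIN', auth)``. PLAIN sends the password unencrypted apart
from transport security, so it is normally combined with IMAP4_SSL.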
IMAP4.check()~
Checkpoint mailbox on server.
IMAP4.close()~
Close currently selected mailbox. Deleted messages are removed from writable
mailbox. This is the recommended command before ``LOGOUT``.
IMAP4.copy(message_set, new_mailbox)~
Copy {message_set} messages onto end of {new_mailbox}.
IMAP4.create(mailbox)~
Create new mailbox named {mailbox}.
IMAP4.delete(mailbox)~
Delete old mailbox named {mailbox}.
IMAP4.deleteacl(mailbox, who)~
Delete the ACLs (remove any rights) set for who on mailbox.
.. versionadded:: 2.4
IMAP4.expunge()~
Permanently remove deleted items from selected mailbox. Generates an ``EXPUNGE``
response for each deleted message. Returned data contains a list of ``EXPUNGE``
message numbers in order received.
IMAP4.fetch(message_set, message_parts)~
Fetch (parts of) messages. {message_parts} should be a string of message part
names enclosed within parentheses, eg: ``"(UID BODY[TEXT])"``. Returned data
are tuples of message part envelope and data.
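The data returned by ``fetch`` mixes tuples (envelope, literal) with plain
strings such as the closing parenthesis. A small helper can pick out just the
literals. The helper name and the sample data are made up for this example: >

```python
def message_literals(data):
    """Return the literal payloads from the data part of a FETCH result."""
    # Tuples are (envelope, literal); bare strings carry no message data.
    return [item[1] for item in data if isinstance(item, tuple)]


# Shape of the data typically returned for one message fetched as RFC822.
sample = [(b"1 (RFC822 {5}", b"hello"), b")"]
bodies = message_literals(sample)
```
<
With a connected instance this might be called as
``message_literals(M.fetch('1', '(RFC822)')[1])``.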
IMAP4.getacl(mailbox)~
Get the ``ACL``s for {mailbox}. The method is non-standard, but is supported
by the ``Cyrus`` server.
IMAP4.getannotation(mailbox, entry, attribute)~
Retrieve the specified ``ANNOTATION``s for {mailbox}. The method is
non-standard, but is supported by the ``Cyrus`` server.
.. versionadded:: 2.5
IMAP4.getquota(root)~
Get the ``quota`` {root}'s resource usage and limits. This method is part of the
IMAP4 QUOTA extension defined in rfc2087.
.. versionadded:: 2.3
IMAP4.getquotaroot(mailbox)~
Get the list of ``quota`` ``roots`` for the named {mailbox}. This method is part
of the IMAP4 QUOTA extension defined in rfc2087.
.. versionadded:: 2.3
IMAP4.list([directory[, pattern]])~
List mailbox names in {directory} matching {pattern}. {directory} defaults to
the top-level mail folder, and {pattern} defaults to match anything. Returned
data contains a list of ``LIST`` responses.
IMAP4.login(user, password)~
Identify the client using a plaintext password. The {password} will be quoted.
IMAP4.login_cram_md5(user, password)~
Force use of ``CRAM-MD5`` authentication when identifying the client to protect
the password. Will only work if the server ``CAPABILITY`` response includes the
phrase ``AUTH=CRAM-MD5``.
.. versionadded:: 2.3
IMAP4.logout()~
Shutdown connection to server. Returns server ``BYE`` response.
IMAP4.lsub([directory[, pattern]])~
List subscribed mailbox names in directory matching pattern. {directory}
defaults to the top level directory and {pattern} defaults to match any mailbox.
Returned data are tuples of message part envelope and data.
IMAP4.myrights(mailbox)~
Show my ACLs for a mailbox (i.e. the rights that I have on mailbox).
.. versionadded:: 2.4
IMAP4.namespace()~
Returns IMAP namespaces as defined in RFC2342.
.. versionadded:: 2.3
IMAP4.noop()~
Send ``NOOP`` to server.
IMAP4.open(host, port)~
Opens socket to {port} at {host}. The connection objects established by this
method will be used in the ``read``, ``readline``, ``send``, and ``shutdown``
methods. You may override this method.
IMAP4.partial(message_num, message_part, start, length)~
Fetch truncated part of a message. Returned data is a tuple of message part
envelope and data.
IMAP4.proxyauth(user)~
Assume authentication as {user}. Allows an authorised administrator to proxy
into any user's mailbox.
.. versionadded:: 2.3
IMAP4.read(size)~
Reads {size} bytes from the remote server. You may override this method.
IMAP4.readline()~
Reads one line from the remote server. You may override this method.
IMAP4.recent()~
Prompt server for an update. Returned data is ``None`` if no new messages, else
value of ``RECENT`` response.
IMAP4.rename(oldmailbox, newmailbox)~
Rename mailbox named {oldmailbox} to {newmailbox}.
IMAP4.response(code)~
Return data for response {code} if received, or ``None``. Returns the given
code, instead of the usual type.
IMAP4.search(charset, criterion[, ...])~
Search mailbox for matching messages. {charset} may be ``None``, in which case
no ``CHARSET`` will be specified in the request to the server. The IMAP
protocol requires that at least one criterion be specified; an exception will be
raised when the server returns an error.
Example:: >
# M is a connected IMAP4 instance...
typ, msgnums = M.search(None, 'FROM', '"LDJ"')
# or:
typ, msgnums = M.search(None, '(FROM "LDJ")')
<
IMAP4.select([mailbox[, readonly]])~
Select a mailbox. Returned data is the count of messages in {mailbox}
(``EXISTS`` response). The default {mailbox} is ``'INBOX'``. If the {readonly}
flag is set, modifications to the mailbox are not allowed.
IMAP4.send(data)~
Sends ``data`` to the remote server. You may override this method.
IMAP4.setacl(mailbox, who, what)~
Set an ``ACL`` for {mailbox}. The method is non-standard, but is supported by
the ``Cyrus`` server.
IMAP4.setannotation(mailbox, entry, attribute[, ...])~
Set ``ANNOTATION``s for {mailbox}. The method is non-standard, but is
supported by the ``Cyrus`` server.
.. versionadded:: 2.5
IMAP4.setquota(root, limits)~
Set the ``quota`` {root}'s resource {limits}. This method is part of the IMAP4
QUOTA extension defined in rfc2087.
.. versionadded:: 2.3
IMAP4.shutdown()~
Close connection established in ``open``. You may override this method.
IMAP4.socket()~
Returns socket instance used to connect to server.
IMAP4.sort(sort_criteria, charset, search_criterion[, ...])~
The ``sort`` command is a variant of ``search`` with sorting semantics for the
results. Returned data contains a space separated list of matching message
numbers.
Sort has two arguments before the {search_criterion} argument(s); a
parenthesized list of {sort_criteria}, and the searching {charset}. Note that
unlike ``search``, the searching {charset} argument is mandatory. There is also
a ``uid sort`` command which corresponds to ``sort`` the way that ``uid search``
corresponds to ``search``. The ``sort`` command first searches the mailbox for
messages that match the given searching criteria using the charset argument for
the interpretation of strings in the searching criteria. It then returns the
numbers of matching messages.
This is an ``IMAP4rev1`` extension command.
IMAP4.status(mailbox, names)~
Request named status conditions for {mailbox}.
IMAP4.store(message_set, command, flag_list)~
Alters flag dispositions for messages in mailbox. {command} is specified by
section 6.4.6 of RFC 2060 as being one of "FLAGS", "+FLAGS", or "-FLAGS",
optionally with a suffix of ".SILENT".
For example, to set the delete flag on all messages:: >
typ, data = M.search(None, 'ALL')
for num in data[0].split():
M.store(num, '+FLAGS', '\\Deleted')
M.expunge()
<
IMAP4.subscribe(mailbox)~
Subscribe to new mailbox.
IMAP4.thread(threading_algorithm, charset, search_criterion[, ...])~
The ``thread`` command is a variant of ``search`` with threading semantics for
the results. Returned data contains a space separated list of thread members.
Thread members consist of zero or more message numbers, delimited by spaces,
indicating successive parent and child.
``thread`` takes two arguments before the {search_criterion} argument(s): a
{threading_algorithm}, and the searching {charset}. Note that unlike
``search``, the searching {charset} argument is mandatory. There is also a
``uid thread`` command which corresponds to ``thread`` the way that ``uid
search`` corresponds to ``search``. The ``thread`` command first searches the
mailbox for messages that match the given searching criteria using the charset
argument for the interpretation of strings in the searching criteria. It then
returns the matching messages threaded according to the specified threading
algorithm.
This is an ``IMAP4rev1`` extension command.
.. versionadded:: 2.4
IMAP4.uid(command, arg[, ...])~
Execute command args with messages identified by UID, rather than message
number. Returns response appropriate to command. At least one argument must be
supplied; if none are provided, the server will return an error and an exception
will be raised.
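As a rough illustration, the ``uid`` method can drive both the search and the
fetch so that messages are addressed by UID throughout. This sketch assumes
{M} is a logged-in IMAP4 instance with a mailbox selected; the helper name is
hypothetical:

```python
def fetch_all_by_uid(M):
    """Return a list of (uid, raw_message) pairs for the selected mailbox.

    M is assumed to be a connected imaplib.IMAP4 instance (hypothetical
    setup: already logged in, mailbox selected).
    """
    # UID SEARCH returns UIDs instead of message sequence numbers.
    typ, data = M.uid('SEARCH', 'ALL')
    if typ != 'OK':
        raise RuntimeError('UID SEARCH failed: %r' % (data,))
    messages = []
    for uid in data[0].split():
        # UID FETCH addresses the message by UID rather than by number.
        typ, parts = M.uid('FETCH', uid, '(RFC822)')
        if typ == 'OK':
            messages.append((uid, parts[0][1]))
    return messages
```

UIDs stay stable while other messages are expunged, which is the usual reason
to prefer them over message numbers in longer sessions.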
IMAP4.unsubscribe(mailbox)~
Unsubscribe from old mailbox.
IMAP4.xatom(name[, arg[, ...]])~
Allow simple extension commands notified by server in ``CAPABILITY`` response.
Instances of IMAP4_SSL have just one additional method:
IMAP4_SSL.ssl()~
Returns SSLObject instance used for the secure connection with the server.
The following attributes are defined on instances of IMAP4:
IMAP4.PROTOCOL_VERSION~
The most recent supported protocol in the ``CAPABILITY`` response from the
server.
IMAP4.debug~
Integer value to control debugging output. The initial value is taken from
the module variable ``Debug``. Values greater than three trace each command.
IMAP4 Example
-------------
Here is a minimal example (without error checking) that opens a mailbox and
retrieves and prints all messages:: >
import getpass, imaplib
M = imaplib.IMAP4()
M.login(getpass.getuser(), getpass.getpass())
M.select()
typ, data = M.search(None, 'ALL')
for num in data[0].split():
    typ, data = M.fetch(num, '(RFC822)')
    print 'Message %s\n%s\n' % (num, data[0][1])
M.close()
M.logout()
==============================================================================
*py2stdlib-imgfile*
imgfile~
:platform: IRIX
:synopsis: Support for SGI imglib files.
:deprecated:
2.6~
The imgfile (|py2stdlib-imgfile|) module has been deprecated for removal in Python 3.0.
The imgfile (|py2stdlib-imgfile|) module allows Python programs to access SGI imglib image
files (also known as .rgb files). The module is far from complete, but
is provided anyway since the functionality it does offer is enough in some cases.
Currently, colormap files are not supported.
The module defines the following variables and functions:
error~
This exception is raised on all errors, such as unsupported file type, etc.
getsizes(file)~
This function returns a tuple ``(x, y, z)`` where {x} and {y} are the size of
the image in pixels and {z} is the number of bytes per pixel. Only 3 byte RGB
pixels and 1 byte greyscale pixels are currently supported.
read(file)~
This function reads and decodes the image on the specified file, and returns it
as a Python string. The string has either 1 byte greyscale pixels or 4 byte RGBA
pixels. The bottom left pixel is the first in the string. This format is
suitable to pass to gl.lrectwrite, for instance.
readscaled(file, x, y, filter[, blur])~
This function is identical to read but it returns an image that is scaled to the
given {x} and {y} sizes. If the {filter} and {blur} parameters are omitted
scaling is done by simply dropping or duplicating pixels, so the result will be
less than perfect, especially for computer-generated images.
Alternatively, you can specify a filter to use to smooth the image after
scaling. The filter forms supported are ``'impulse'``, ``'box'``,
``'triangle'``, ``'quadratic'`` and ``'gaussian'``. If a filter is specified
{blur} is an optional parameter specifying the blurriness of the filter. It
defaults to ``1.0``.
readscaled makes no attempt to keep the aspect ratio correct, so that is
the user's responsibility.
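Since the caller has to preserve the aspect ratio, the target size can be
computed before calling readscaled. The helper below is a hypothetical sketch
in plain Python (it is not part of imgfile) and the imgfile calls are shown
only as comments because the module exists only on IRIX:

```python
def fit_within(width, height, max_width, max_height):
    """Return (x, y) that fits inside the box and keeps the aspect ratio."""
    # Use the smaller of the two scale factors so both sides fit.
    scale = min(float(max_width) / width, float(max_height) / height)
    # Avoid a zero dimension for very thin images.
    x = max(1, int(round(width * scale)))
    y = max(1, int(round(height * scale)))
    return x, y

# Intended use with imgfile (IRIX only, so left as comments):
#   x, y, z = imgfile.getsizes(path)
#   new_x, new_y = fit_within(x, y, 320, 320)
#   data = imgfile.readscaled(path, new_x, new_y, 'box')
print(fit_within(640, 480, 320, 320))
```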
ttob(flag)~
This function sets a global flag which defines whether the scan lines of the
image are read or written from bottom to top (flag is zero, compatible with SGI
GL) or from top to bottom (flag is one, compatible with X). The default is zero.
write(file, data, x, y, z)~
This function writes the RGB or greyscale data in {data} to image file {file}.
{x} and {y} give the size of the image, {z} is 1 for 1 byte greyscale images or
3 for RGB images (which are stored as 4 byte values of which only the lower
three bytes are used). These are the formats returned by gl.lrectread.
==============================================================================
*py2stdlib-imghdr*
imghdr~
:synopsis: Determine the type of image contained in a file or byte stream.
The imghdr (|py2stdlib-imghdr|) module determines the type of image contained in a file or
byte stream.
The imghdr (|py2stdlib-imghdr|) module defines the following function:
what(filename[, h])~
Tests the image data contained in the file named by {filename}, and returns a
string describing the image type. If optional {h} is provided, the {filename}
is ignored and {h} is assumed to contain the byte stream to test.
The following image types are recognized; each is listed with the value
returned by what():
+------------+-----------------------------------+
| Value      | Image format                      |
+============+===================================+
| ``'rgb'``  | SGI ImgLib Files                  |
+------------+-----------------------------------+
| ``'gif'``  | GIF 87a and 89a Files             |
+------------+-----------------------------------+
| ``'pbm'``  | Portable Bitmap Files             |
+------------+-----------------------------------+
| ``'pgm'``  | Portable Graymap Files            |
+------------+-----------------------------------+
| ``'ppm'``  | Portable Pixmap Files             |
+------------+-----------------------------------+
| ``'tiff'`` | TIFF Files                        |
+------------+-----------------------------------+
| ``'rast'`` | Sun Raster Files                  |
+------------+-----------------------------------+
| ``'xbm'``  | X Bitmap Files                    |
+------------+-----------------------------------+
| ``'jpeg'`` | JPEG data in JFIF or Exif formats |
+------------+-----------------------------------+
| ``'bmp'``  | BMP files                         |
+------------+-----------------------------------+
| ``'png'``  | Portable Network Graphics         |
+------------+-----------------------------------+
.. versionadded:: 2.5
Exif detection.
You can extend the list of file types imghdr (|py2stdlib-imghdr|) can recognize by appending
to this variable:
tests~
A list of functions performing the individual tests. Each function takes two
arguments: the byte-stream and an open file-like object. When what() is
called with a byte-stream, the file-like object will be ``None``.
The test function should return a string describing the image type if the test
succeeded, or ``None`` if it failed.
Example:: >
>>> import imghdr
>>> imghdr.what('/tmp/bass.gif')
'gif'
==============================================================================
*py2stdlib-imp*
imp~
:synopsis: Access the implementation of the import statement.
This module provides an interface to the mechanisms used to implement the
import statement. It defines the following constants and functions:
get_magic()~
Return the magic string value used to recognize byte-compiled code files
(.pyc files). (This value may be different for each Python version.)
get_suffixes()~
Return a list of 3-element tuples, each describing a particular type of
module. Each triple has the form ``(suffix, mode, type)``, where {suffix} is
a string to be appended to the module name to form the filename to search
for, {mode} is the mode string to pass to the built-in open function
to open the file (this can be ``'r'`` for text files or ``'rb'`` for binary
files), and {type} is the file type, which has one of the values
PY_SOURCE, PY_COMPILED, or C_EXTENSION, described
below.
find_module(name[, path])~
Try to find the module {name}. If {path} is omitted or ``None``, the list of
directory names given by ``sys.path`` is searched, but first a few special
places are searched: the function tries to find a built-in module with the
given name (C_BUILTIN), then a frozen module (PY_FROZEN),
and on some systems some other places are looked in as well (on Windows, it
looks in the registry which may point to a specific file).
Otherwise, {path} must be a list of directory names; each directory is
searched for files with any of the suffixes returned by get_suffixes
above. Invalid names in the list are silently ignored (but all list items
must be strings).
If search is successful, the return value is a 3-element tuple ``(file,
pathname, description)``:
{file} is an open file object positioned at the beginning, {pathname} is the
pathname of the file found, and {description} is a 3-element tuple as
contained in the list returned by get_suffixes describing the kind of
module found.
If the module does not live in a file, the returned {file} is ``None``,
{pathname} is the empty string, and the {description} tuple contains empty
strings for its suffix and mode; the module type is indicated as given in
parentheses above. If the search is unsuccessful, ImportError is
raised. Other exceptions indicate problems with the arguments or
environment.
If the module is a package, {file} is ``None``, {pathname} is the package
path and the last item in the {description} tuple is PKG_DIRECTORY.
This function does not handle hierarchical module names (names containing
dots). In order to find {P}.{M}, that is, submodule {M} of package {P}, use
find_module and load_module to find and load package {P}, and
then use find_module with the {path} argument set to ``P.__path__``.
When {P} itself has a dotted name, apply this recipe recursively.
load_module(name, file, pathname, description)~
Load a module that was previously found by find_module (or by an
otherwise conducted search yielding compatible results). This function does
more than importing the module: if the module was already imported, it is
equivalent to a reload! The {name} argument indicates the full
module name (including the package name, if this is a submodule of a
package). The {file} argument is an open file, and {pathname} is the
corresponding file name; these can be ``None`` and ``''``, respectively, when
the module is a package or not being loaded from a file. The {description}
argument is a tuple, as would be returned by get_suffixes, describing
what kind of module must be loaded.
If the load is successful, the return value is the module object; otherwise,
an exception (usually ImportError) is raised.
{Important:} the caller is responsible for closing the {file} argument, if
it was not ``None``, even when an exception is raised. This is best done
using a try ... finally statement.
new_module(name)~
Return a new empty module object called {name}. This object is {not} inserted
in ``sys.modules``.
lock_held()~
Return ``True`` if the import lock is currently held, else ``False``. On
platforms without threads, always return ``False``.
On platforms with threads, a thread executing an import holds an internal lock
until the import is complete. This lock blocks other threads from doing an
import until the original import completes, which in turn prevents other threads
from seeing incomplete module objects constructed by the original thread while
in the process of completing its import (and the imports, if any, triggered by
that).
acquire_lock()~
Acquire the interpreter's import lock for the current thread. This lock should
be used by import hooks to ensure thread-safety when importing modules. On
platforms without threads, this function does nothing.
Once a thread has acquired the import lock, the same thread may acquire it
again without blocking; the thread must release it once for each time it has
acquired it.
.. versionadded:: 2.3
release_lock()~
Release the interpreter's import lock. On platforms without threads, this
function does nothing.
.. versionadded:: 2.3
The following constants with integer values, defined in this module, are used to
indicate the search result of find_module.
PY_SOURCE~
The module was found as a source file.
PY_COMPILED~
The module was found as a compiled code object file.
C_EXTENSION~
The module was found as a dynamically loadable shared library.
PKG_DIRECTORY~
The module was found as a package directory.
C_BUILTIN~
The module was found as a built-in module.
PY_FROZEN~
The module was found as a frozen module (see init_frozen).
The following constant and functions are obsolete; their functionality is
available through find_module or load_module. They are kept
around for backward compatibility:
SEARCH_ERROR~
Unused.
init_builtin(name)~
Initialize the built-in module called {name} and return its module object along
with storing it in ``sys.modules``. If the module was already initialized, it
will be initialized {again}. Re-initialization involves the copying of the
built-in module's ``__dict__`` from the cached module over the module's entry in
``sys.modules``. If there is no built-in module called {name}, ``None`` is
returned.
init_frozen(name)~
Initialize the frozen module called {name} and return its module object. If
the module was already initialized, it will be initialized {again}. If there
is no frozen module called {name}, ``None`` is returned. (Frozen modules are
modules written in Python whose compiled byte-code object is incorporated
into a custom-built Python interpreter by Python's freeze
utility. See Tools/freeze/ for now.)
is_builtin(name)~
Return ``1`` if there is a built-in module called {name} which can be
initialized again. Return ``-1`` if there is a built-in module called {name}
which cannot be initialized again (see init_builtin). Return ``0`` if
there is no built-in module called {name}.
is_frozen(name)~
Return ``True`` if there is a frozen module (see init_frozen) called
{name}, or ``False`` if there is no such module.
load_compiled(name, pathname[, file])~
Load and initialize a module implemented as a byte-compiled code file and return
its module object. If the module was already initialized, it will be
initialized {again}. The {name} argument is used to create or access a module
object. The {pathname} argument points to the byte-compiled code file. The
{file} argument is the byte-compiled code file, open for reading in binary mode,
from the beginning. It must currently be a real file object, not a user-defined
class emulating a file.
load_dynamic(name, pathname[, file])~
Load and initialize a module implemented as a dynamically loadable shared
library and return its module object. If the module was already initialized, it
will be initialized {again}. Re-initialization involves copying the ``__dict__``
attribute of the cached instance of the module over the value used in the module
cached in ``sys.modules``. The {pathname} argument must point to the shared
library. The {name} argument is used to construct the name of the
initialization function: an external C function called ``initname()`` in the
shared library is called. The optional {file} argument is ignored. (Note:
using shared libraries is highly system dependent, and not all systems support
it.)
load_source(name, pathname[, file])~
Load and initialize a module implemented as a Python source file and return its
module object. If the module was already initialized, it will be initialized
{again}. The {name} argument is used to create or access a module object. The
{pathname} argument points to the source file. The {file} argument is the
source file, open for reading as text, from the beginning. It must currently be
a real file object, not a user-defined class emulating a file. Note that if a
properly matching byte-compiled file (with suffix .pyc or .pyo)
exists, it will be used instead of parsing the given source file.
NullImporter(path_string)~
The NullImporter type is a PEP 302 import hook that handles
non-directory path strings by failing to find any modules. Calling this type
with an existing directory or empty string raises ImportError.
Otherwise, a NullImporter instance is returned.
Python adds instances of this type to ``sys.path_importer_cache`` for any path
entries that are not directories and are not handled by any other path hooks on
``sys.path_hooks``. Instances have only one method:
NullImporter.find_module(fullname [, path])~
This method always returns ``None``, indicating that the requested module could
not be found.
.. versionadded:: 2.5
Examples
--------
The following function emulates what was the standard import statement up to
Python 1.4 (no hierarchical module names). (This {implementation} wouldn't work
in that version, since find_module has been extended and
load_module has been added in 1.4.) :: >
import imp
import sys
def __import__(name, globals=None, locals=None, fromlist=None):
    # Fast path: see if the module has already been imported.
    try:
        return sys.modules[name]
    except KeyError:
        pass
    # If any of the following calls raises an exception,
    # there's a problem we can't handle -- let the caller handle it.
    fp, pathname, description = imp.find_module(name)
    try:
        return imp.load_module(name, fp, pathname, description)
    finally:
        # Since we may exit via an exception, close fp explicitly.
        if fp:
            fp.close()
<
A more complete example that implements hierarchical module names and includes a
reload function can be found in the module knee. The knee
module can be found in Demo/imputil/ in the Python source distribution.
==============================================================================
*py2stdlib-importlib*
importlib~
:synopsis: Convenience wrappers for __import__
.. versionadded:: 2.7
This module is a small subset of the more full-featured package of the same
name in Python 3.1, which provides a complete implementation of import.
What is here has been provided to ease the transition from 2.7 to 3.1.
import_module(name, package=None)~
Import a module. The {name} argument specifies what module to
import in absolute or relative terms
(e.g. either ``pkg.mod`` or ``..mod``). If the name is
specified in relative terms, then the {package} argument must be set to the
package that acts as the anchor for resolving the relative name
(e.g. ``import_module('..mod', 'pkg.subpkg')`` will import
``pkg.mod``). The specified module will be inserted into
sys.modules and returned.
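A short sketch of both calling styles, using the standard json package purely
as a convenient example of a package with submodules:

```python
import importlib

# Absolute name: has the same effect as "import json.decoder", but the
# submodule itself is returned rather than the top-level package.
decoder = importlib.import_module('json.decoder')

# Relative name: '.decoder' is resolved against the anchor package 'json'.
same_decoder = importlib.import_module('.decoder', package='json')

print(decoder.__name__)
print(decoder is same_decoder)
```

Both calls return the same module object because the second call finds the
module already present in sys.modules.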
==============================================================================
*py2stdlib-imputil*
imputil~
:synopsis: Manage and augment the import process.
:deprecated:
2.6~
The imputil (|py2stdlib-imputil|) module has been removed in Python 3.0.
This module provides a very handy and useful mechanism for custom
import hooks. Compared to the older ihooks module,
imputil (|py2stdlib-imputil|) takes a dramatically simpler and more straight-forward
approach to custom import functions.
ImportManager([fs_imp])~
Manage the import process.
ImportManager.install([namespace])~
Install this ImportManager into the specified namespace.
ImportManager.uninstall()~
Restore the previous import mechanism.
ImportManager.add_suffix(suffix, importFunc)~
Undocumented.
Importer()~
Base class for replacing standard import functions.
Importer.import_top(name)~
Import a top-level module.
Importer.get_code(parent, modname, fqname)~
Find and retrieve the code for the given module.
{parent} specifies a parent module to define a context for importing.
It may be ``None``, indicating no particular context for the search.
{modname} specifies a single module (not dotted) within the parent.
{fqname} specifies the fully-qualified module name. This is a
(potentially) dotted name from the "root" of the module namespace
down to the modname.
If there is no parent, then modname==fqname.
This method should return ``None``, or a 3-tuple.
* If the module was not found, then ``None`` should be returned.
* The first item of the tuple should be the integer 0 or 1,
specifying whether the module that was found is a package or not.
* The second item is the code object for the module (it will be
executed within the new module's namespace). This item can also
be a fully-loaded module object (e.g. loaded from a shared lib).
* The third item is a dictionary of name/value pairs that will be
inserted into new module before the code object is executed. This
is provided in case the module's code expects certain values (such
as where the module was found). When the second item is a module
object, then these names/values will be inserted {after} the module
has been loaded/initialized.
BuiltinImporter()~
Emulate the import mechanism for built-in and frozen modules. This is a
sub-class of the Importer class.
BuiltinImporter.get_code(parent, modname, fqname)~
Undocumented.
py_suffix_importer(filename, finfo, fqname)~
Undocumented.
DynLoadSuffixImporter([desc])~
Undocumented.
DynLoadSuffixImporter.import_file(filename, finfo, fqname)~
Undocumented.
Examples
--------
This is a re-implementation of hierarchical module import.
This code is intended to be read, not executed. However, it does work
-- all you need to do to enable it is "import knee".
(The name is a pun on the clunkier predecessor of this module, "ni".)
:: >
import sys, imp, __builtin__
# Replacement for __import__()
def import_hook(name, globals=None, locals=None, fromlist=None):
    parent = determine_parent(globals)
    q, tail = find_head_package(parent, name)
    m = load_tail(q, tail)
    if not fromlist:
        return q
    if hasattr(m, "__path__"):
        ensure_fromlist(m, fromlist)
    return m
def determine_parent(globals):
    if not globals or not globals.has_key("__name__"):
        return None
    pname = globals['__name__']
    if globals.has_key("__path__"):
        parent = sys.modules[pname]
        assert globals is parent.__dict__
        return parent
    if '.' in pname:
        i = pname.rfind('.')
        pname = pname[:i]
        parent = sys.modules[pname]
        assert parent.__name__ == pname
        return parent
    return None
def find_head_package(parent, name):
    if '.' in name:
        i = name.find('.')
        head = name[:i]
        tail = name[i+1:]
    else:
        head = name
        tail = ""
    if parent:
        qname = "%s.%s" % (parent.__name__, head)
    else:
        qname = head
    q = import_module(head, qname, parent)
    if q: return q, tail
    if parent:
        qname = head
        parent = None
        q = import_module(head, qname, parent)
        if q: return q, tail
    raise ImportError("No module named " + qname)
def load_tail(q, tail):
    m = q
    while tail:
        i = tail.find('.')
        if i < 0: i = len(tail)
        head, tail = tail[:i], tail[i+1:]
        mname = "%s.%s" % (m.__name__, head)
        m = import_module(head, mname, m)
        if not m:
            raise ImportError("No module named " + mname)
    return m
def ensure_fromlist(m, fromlist, recursive=0):
    for sub in fromlist:
        if sub == "*":
            if not recursive:
                try:
                    all = m.__all__
                except AttributeError:
                    pass
                else:
                    ensure_fromlist(m, all, 1)
            continue
        if sub != "*" and not hasattr(m, sub):
            subname = "%s.%s" % (m.__name__, sub)
            submod = import_module(sub, subname, m)
            if not submod:
                raise ImportError("No module named " + subname)
def import_module(partname, fqname, parent):
    try:
        return sys.modules[fqname]
    except KeyError:
        pass
    try:
        fp, pathname, stuff = imp.find_module(partname,
                                              parent and parent.__path__)
    except ImportError:
        return None
    try:
        m = imp.load_module(fqname, fp, pathname, stuff)
    finally:
        if fp: fp.close()
    if parent:
        setattr(parent, partname, m)
    return m
# Replacement for reload()
def reload_hook(module):
    name = module.__name__
    if '.' not in name:
        return import_module(name, name, None)
    i = name.rfind('.')
    pname = name[:i]
    parent = sys.modules[pname]
    return import_module(name[i+1:], name, parent)
# Save the original hooks
original_import = __builtin__.__import__
original_reload = __builtin__.reload
# Now install our hooks
__builtin__.__import__ = import_hook
__builtin__.reload = reload_hook
<
Also see the importers module (which can be found
in Demo/imputil/ in the Python source distribution) for additional
examples.
==============================================================================
*py2stdlib-inspect*
inspect~
:synopsis: Extract information and source code from live objects.
.. versionadded:: 2.1
The inspect (|py2stdlib-inspect|) module provides several useful functions to help get
information about live objects such as modules, classes, methods, functions,
tracebacks, frame objects, and code objects. For example, it can help you
examine the contents of a class, retrieve the source code of a method, extract
and format the argument list for a function, or get all the information you need
to display a detailed traceback.
There are four main kinds of services provided by this module: type checking,
getting source code, inspecting classes and functions, and examining the
interpreter stack.
Types and members
-----------------
The getmembers function retrieves the members of an object such as a
class or module. The sixteen functions whose names begin with "is" are mainly
provided as convenient choices for the second argument to getmembers.
They also help you determine when you can expect to find the following special
attributes:
+-----------+-----------------+---------------------------+-------+
| Type | Attribute | Description | Notes |
+===========+=================+===========================+=======+
| module | __doc__ | documentation string | |
+-----------+-----------------+---------------------------+-------+
| | __file__ | filename (missing for | |
| | | built-in modules) | |
+-----------+-----------------+---------------------------+-------+
| class | __doc__ | documentation string | |
+-----------+-----------------+---------------------------+-------+
| | __module__ | name of module in which | |
| | | this class was defined | |
+-----------+-----------------+---------------------------+-------+
| method | __doc__ | documentation string | |
+-----------+-----------------+---------------------------+-------+
| | __name__ | name with which this | |
| | | method was defined | |
+-----------+-----------------+---------------------------+-------+
| | im_class | class object that asked | \(1) |
| | | for this method | |
+-----------+-----------------+---------------------------+-------+
| | im_func or | function object | |
| | __func__ | containing implementation | |
| | | of method | |
+-----------+-----------------+---------------------------+-------+
| | im_self or | instance to which this | |
| | __self__ | method is bound, or | |
| | | ``None`` | |
+-----------+-----------------+---------------------------+-------+
| function | __doc__ | documentation string | |
+-----------+-----------------+---------------------------+-------+
| | __name__ | name with which this | |
| | | function was defined | |
+-----------+-----------------+---------------------------+-------+
| | func_code | code object containing | |
| | | compiled function | |
| | | bytecode | |
+-----------+-----------------+---------------------------+-------+
| | func_defaults | tuple of any default | |
| | | values for arguments | |
+-----------+-----------------+---------------------------+-------+
| | func_doc | (same as __doc__) | |
+-----------+-----------------+---------------------------+-------+
| | func_globals | global namespace in which | |
| | | this function was defined | |
+-----------+-----------------+---------------------------+-------+
| | func_name | (same as __name__) | |
+-----------+-----------------+---------------------------+-------+
| generator | __iter__ | defined to support | |
| | | iteration over container | |
+-----------+-----------------+---------------------------+-------+
| | close | raises new GeneratorExit | |
| | | exception inside the | |
| | | generator to terminate | |
| | | the iteration | |
+-----------+-----------------+---------------------------+-------+
| | gi_code | code object | |
+-----------+-----------------+---------------------------+-------+
| | gi_frame | frame object or possibly | |
| | | None once the generator | |
| | | has been exhausted | |
+-----------+-----------------+---------------------------+-------+
| | gi_running | set to 1 when generator | |
| | | is executing, 0 otherwise | |
+-----------+-----------------+---------------------------+-------+
| | next | return the next item from | |
| | | the container | |
+-----------+-----------------+---------------------------+-------+
| | send | resumes the generator and | |
| | | "sends" a value that | |
| | | becomes the result of the | |
| | | current yield-expression | |
+-----------+-----------------+---------------------------+-------+
| | throw | used to raise an | |
| | | exception inside the | |
| | | generator | |
+-----------+-----------------+---------------------------+-------+
| traceback | tb_frame | frame object at this | |
| | | level | |
+-----------+-----------------+---------------------------+-------+
| | tb_lasti | index of last attempted | |
| | | instruction in bytecode | |
+-----------+-----------------+---------------------------+-------+
| | tb_lineno | current line number in | |
| | | Python source code | |
+-----------+-----------------+---------------------------+-------+
| | tb_next | next inner traceback | |
| | | object (called by this | |
| | | level) | |
+-----------+-----------------+---------------------------+-------+
| frame | f_back | next outer frame object | |
| | | (this frame's caller) | |
+-----------+-----------------+---------------------------+-------+
| | f_builtins | builtins namespace seen | |
| | | by this frame | |
+-----------+-----------------+---------------------------+-------+
| | f_code | code object being | |
| | | executed in this frame | |
+-----------+-----------------+---------------------------+-------+
| | f_exc_traceback | traceback if raised in | |
| | | this frame, or ``None`` | |
+-----------+-----------------+---------------------------+-------+
| | f_exc_type | exception type if raised | |
| | | in this frame, or | |
| | | ``None`` | |
+-----------+-----------------+---------------------------+-------+
| | f_exc_value | exception value if raised | |
| | | in this frame, or | |
| | | ``None`` | |
+-----------+-----------------+---------------------------+-------+
| | f_globals | global namespace seen by | |
| | | this frame | |
+-----------+-----------------+---------------------------+-------+
| | f_lasti | index of last attempted | |
| | | instruction in bytecode | |
+-----------+-----------------+---------------------------+-------+
| | f_lineno | current line number in | |
| | | Python source code | |
+-----------+-----------------+---------------------------+-------+
| | f_locals | local namespace seen by | |
| | | this frame | |
+-----------+-----------------+---------------------------+-------+
| | f_restricted | 0 or 1 if frame is in | |
| | | restricted execution mode | |
+-----------+-----------------+---------------------------+-------+
| | f_trace | tracing function for this | |
| | | frame, or ``None`` | |
+-----------+-----------------+---------------------------+-------+
| code | co_argcount | number of arguments (not | |
| | | including \{ or \}\* | |
| | | args) | |
+-----------+-----------------+---------------------------+-------+
| | co_code | string of raw compiled | |
| | | bytecode | |
+-----------+-----------------+---------------------------+-------+
| | co_consts | tuple of constants used | |
| | | in the bytecode | |
+-----------+-----------------+---------------------------+-------+
| | co_filename | name of file in which | |
| | | this code object was | |
| | | created | |
+-----------+-----------------+---------------------------+-------+
| | co_firstlineno | number of first line in | |
| | | Python source code | |
+-----------+-----------------+---------------------------+-------+
| | co_flags | bitmap: 1=optimized ``|`` | |
| | | 2=newlocals ``|`` 4=\*arg | |
| | | ``|`` 8=\{\}arg | |
+-----------+-----------------+---------------------------+-------+
| | co_lnotab | encoded mapping of line | |
| | | numbers to bytecode | |
| | | indices | |
+-----------+-----------------+---------------------------+-------+
| | co_name | name with which this code | |
| | | object was defined | |
+-----------+-----------------+---------------------------+-------+
| | co_names | tuple of names other than | |
| | | arguments and function | |
| | | locals | |
+-----------+-----------------+---------------------------+-------+
| | co_nlocals | number of local variables | |
+-----------+-----------------+---------------------------+-------+
| | co_stacksize | virtual machine stack | |
| | | space required | |
+-----------+-----------------+---------------------------+-------+
| | co_varnames | tuple of names of | |
| | | arguments and local | |
| | | variables | |
+-----------+-----------------+---------------------------+-------+
| builtin | __doc__ | documentation string | |
+-----------+-----------------+---------------------------+-------+
| | __name__ | original name of this | |
| | | function or method | |
+-----------+-----------------+---------------------------+-------+
| | __self__ | instance to which a | |
| | | method is bound, or | |
| | | ``None`` | |
+-----------+-----------------+---------------------------+-------+
Note:
(1)
.. versionchanged:: 2.2
im_class used to refer to the class that defined the method.
getmembers(object[, predicate])~
Return all the members of an object in a list of (name, value) pairs sorted by
name. If the optional {predicate} argument is supplied, only members for which
the predicate returns a true value are included.
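A minimal sketch of filtering members with a predicate. The class ``Greeter`` is hypothetical and exists only for this illustration:

```python
import inspect


class Greeter(object):
    """Hypothetical class used only for this illustration."""
    greeting = "hello"

    def greet(self, name):
        return "%s, %s" % (self.greeting, name)


# Keep only routines (functions and methods), then drop the special names.
routines = [name
            for name, value in inspect.getmembers(Greeter, inspect.isroutine)
            if not name.startswith("__")]

# Without a predicate every attribute is listed, sorted by name.
all_names = [name for name, value in inspect.getmembers(Greeter)]
```

Here ``routines`` contains only ``'greet'``, while ``all_names`` also includes ``'greeting'`` and the inherited special attributes.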
.. note:: >
getmembers does not return metaclass attributes when the argument
is a class (this behavior is inherited from the dir function).
<
getmoduleinfo(path)~
Return a tuple of values that describe how Python will interpret the file
identified by {path} if it is a module, or ``None`` if it would not be
identified as a module. The return tuple is ``(name, suffix, mode, mtype)``,
where {name} is the name of the module without the name of any enclosing
package, {suffix} is the trailing part of the file name (which may not be a
dot-delimited extension), {mode} is the open mode that would be used
(``'r'`` or ``'rb'``), and {mtype} is an integer giving the type of the
module. {mtype} will have a value which can be compared to the constants
defined in the imp (|py2stdlib-imp|) module; see the documentation for that module for
more information on module types.
.. versionchanged:: 2.6
Returns a named tuple ``ModuleInfo(name, suffix, mode,
module_type)``.
getmodulename(path)~
Return the name of the module named by the file {path}, without including the
names of enclosing packages. This uses the same algorithm as the interpreter
uses when searching for modules. If the name cannot be matched according to the
interpreter's rules, ``None`` is returned.
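For illustration, assuming two made-up paths (the files do not need to exist, because only the file name is examined):

```python
import inspect

# Both paths are hypothetical; getmodulename only inspects the name.
module_name = inspect.getmodulename("/hypothetical/pkg/spam.py")
not_a_module = inspect.getmodulename("/hypothetical/pkg/notes.txt")
```

``module_name`` is ``'spam'`` and ``not_a_module`` is ``None``, since ``.txt`` is not a module suffix.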
ismodule(object)~
Return true if the object is a module.
isclass(object)~
Return true if the object is a class.
ismethod(object)~
Return true if the object is a method.
isfunction(object)~
Return true if the object is a Python function or unnamed (lambda) function.
isgeneratorfunction(object)~
Return true if the object is a Python generator function.
.. versionadded:: 2.6
isgenerator(object)~
Return true if the object is a generator.
.. versionadded:: 2.6
istraceback(object)~
Return true if the object is a traceback.
isframe(object)~
Return true if the object is a frame.
iscode(object)~
Return true if the object is a code object.
isbuiltin(object)~
Return true if the object is a built-in function.
isroutine(object)~
Return true if the object is a user-defined or built-in function or method.
isabstract(object)~
Return true if the object is an abstract base class.
.. versionadded:: 2.6
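The following sketch applies several of the type-checking predicates above to a few objects. The generator function ``countdown`` is a hypothetical helper:

```python
import inspect


def countdown(n):
    """Hypothetical generator function used for illustration."""
    while n > 0:
        yield n
        n -= 1


results = {
    "isfunction(lambda)": inspect.isfunction(lambda: None),
    "isfunction(len)": inspect.isfunction(len),      # built-in, not Python
    "isbuiltin(len)": inspect.isbuiltin(len),
    "isroutine(len)": inspect.isroutine(len),
    "isgeneratorfunction(countdown)": inspect.isgeneratorfunction(countdown),
    "isgenerator(countdown(3))": inspect.isgenerator(countdown(3)),
    "isclass(int)": inspect.isclass(int),
    "ismodule(inspect)": inspect.ismodule(inspect),
}
```

Every entry is true except ``isfunction(len)``, because ``len`` is implemented in C rather than in Python.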
ismethoddescriptor(object)~
Return true if the object is a method descriptor, but not if ismethod
or isclass or isfunction are true.
This is new as of Python 2.2, and, for example, is true of
``int.__add__``. An object passing this test has a __get__ attribute
but not a __set__ attribute, but beyond that the set of attributes
varies. __name__ is usually sensible, and __doc__ often is.
Methods implemented via descriptors that also pass one of the other tests
return false from the ismethoddescriptor test, simply because the
other tests promise more -- you can, e.g., count on having the
im_func attribute (etc) when an object passes ismethod.
isdatadescriptor(object)~
Return true if the object is a data descriptor.
Data descriptors have both a __get__ and a __set__ attribute.
Examples are properties (defined in Python), getsets, and members. The
latter two are defined in C and there are more specific tests available for
those types, which are robust across Python implementations. Typically, data
descriptors will also have __name__ and __doc__ attributes
(properties, getsets, and members have both of these attributes), but this is
not guaranteed.
.. versionadded:: 2.3
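A short sketch contrasting the two descriptor tests. ``Temperature`` is a hypothetical class; the descriptors are taken from the class dictionary so that the property object itself is examined rather than its value:

```python
import inspect


class Temperature(object):
    """Hypothetical class with one property and one method."""

    def __init__(self):
        self._celsius = 0.0

    @property
    def celsius(self):
        return self._celsius

    def reset(self):
        self._celsius = 0.0


prop = vars(Temperature)["celsius"]   # the property object itself
func = vars(Temperature)["reset"]     # a plain Python function

property_is_data = inspect.isdatadescriptor(prop)
slot_is_method_descriptor = inspect.ismethoddescriptor(int.__add__)
# A Python function passes isfunction, so ismethoddescriptor rejects it.
function_is_method_descriptor = inspect.ismethoddescriptor(func)
```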
isgetsetdescriptor(object)~
Return true if the object is a getset descriptor.
.. impl-detail:: >
getsets are attributes defined in extension modules via
PyGetSetDef structures. For Python implementations without such
types, this method will always return ``False``.
<
.. versionadded:: 2.5
ismemberdescriptor(object)~
Return true if the object is a member descriptor.
.. impl-detail:: >
Member descriptors are attributes defined in extension modules via
PyMemberDef structures. For Python implementations without such
types, this method will always return ``False``.
<
.. versionadded:: 2.5
Retrieving source code
----------------------
getdoc(object)~
Get the documentation string for an object, cleaned up with cleandoc.
getcomments(object)~
Return in a single string any lines of comments immediately preceding the
object's source code (for a class, function, or method), or at the top of the
Python source file (if the object is a module).
getfile(object)~
Return the name of the (text or binary) file in which an object was defined.
This will fail with a TypeError if the object is a built-in module,
class, or function.
getmodule(object)~
Try to guess which module an object was defined in.
getsourcefile(object)~
Return the name of the Python source file in which an object was defined. This
will fail with a TypeError if the object is a built-in module, class, or
function.
getsourcelines(object)~
Return a list of source lines and starting line number for an object. The
argument may be a module, class, method, function, traceback, frame, or code
object. The source code is returned as a list of the lines corresponding to the
object and the line number indicates where in the original source file the first
line of code was found. An IOError is raised if the source code cannot
be retrieved.
getsource(object)~
Return the text of the source code for an object. The argument may be a module,
class, method, function, traceback, frame, or code object. The source code is
returned as a single string. An IOError is raised if the source code
cannot be retrieved.
cleandoc(doc)~
Clean up indentation from docstrings that are indented to line up with blocks
of code. Any whitespace that can be uniformly removed from the second line
onwards is removed. Also, all tabs are expanded to spaces.
.. versionadded:: 2.6
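As an illustration, the docstring below is written with explicit ``\n`` escapes so that its indentation is unambiguous; the text itself is made up:

```python
import inspect

raw = ("Summary line.\n"
       "\n"
       "        Detail line.\n"
       "            Nested detail.\n"
       "    ")

# The common 8-space margin of the later lines is removed and the
# trailing blank line is dropped; relative indentation is preserved.
cleaned = inspect.cleandoc(raw)
```

The result is ``'Summary line.\n\nDetail line.\n    Nested detail.'``.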
Classes and functions
---------------------
getclasstree(classes[, unique])~
Arrange the given list of classes into a hierarchy of nested lists. Where a
nested list appears, it contains classes derived from the class whose entry
immediately precedes the list. Each entry is a 2-tuple containing a class and a
tuple of its base classes. If the {unique} argument is true, exactly one entry
appears in the returned structure for each class in the given list. Otherwise,
classes using multiple inheritance and their descendants will appear multiple
times.
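A small sketch of the nested structure, using three hypothetical classes. Note that ``object`` appears as the root even though it was not in the input list, because it is a base class of one of the given classes:

```python
import inspect


class Base(object):
    """Hypothetical root of a tiny hierarchy."""


class Left(Base):
    """Hypothetical subclass."""


class Right(Base):
    """Hypothetical subclass."""


tree = inspect.getclasstree([Base, Left, Right])

# Each class entry is (class, bases); a following list holds its subclasses.
expected = [
    (object, ()),
    [
        (Base, (object,)),
        [
            (Left, (Base,)),
            (Right, (Base,)),
        ],
    ],
]
```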
getargspec(func)~
Get the names and default values of a Python function's arguments. A tuple of four
things is returned: ``(args, varargs, varkw, defaults)``. {args} is a list of
the argument names (it may contain nested lists). {varargs} and {varkw} are the
names of the ``*`` and ``**`` arguments or ``None``. {defaults} is a tuple of
default argument values or ``None`` if there are no default arguments; if this tuple
has {n} elements, they correspond to the last {n} elements listed in {args}.
.. versionchanged:: 2.6
Returns a named tuple ``ArgSpec(args, varargs, keywords,
defaults)``.
getargvalues(frame)~
Get information about arguments passed into a particular frame. A tuple of four
things is returned: ``(args, varargs, varkw, locals)``. {args} is a list of the
argument names (it may contain nested lists). {varargs} and {varkw} are the
names of the ``*`` and ``**`` arguments or ``None``. {locals} is the locals
dictionary of the given frame.
.. versionchanged:: 2.6
Returns a named tuple ``ArgInfo(args, varargs, keywords,
locals)``.
formatargspec(args[, varargs, varkw, defaults, formatarg, formatvarargs, formatvarkw, formatvalue, join])~
Format a pretty argument spec from the four values returned by
getargspec. The format\* arguments are the corresponding optional
formatting functions that are called to turn names and values into strings.
formatargvalues(args[, varargs, varkw, locals, formatarg, formatvarargs, formatvarkw, formatvalue, join])~
Format a pretty argument spec from the four values returned by
getargvalues. The format\* arguments are the corresponding optional
formatting functions that are called to turn names and values into strings.
getmro(cls)~
Return a tuple of class cls's base classes, including cls, in method resolution
order. No class appears more than once in this tuple. Note that the method
resolution order depends on cls's type. Unless a very peculiar user-defined
metatype is in use, cls will be the first element of the tuple.
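For example, with a hypothetical diamond-shaped hierarchy of new-style classes, the shared base appears only once, after both of its subclasses:

```python
import inspect


class A(object):
    """Hypothetical common base."""


class B(A):
    """Hypothetical left branch."""


class C(A):
    """Hypothetical right branch."""


class D(B, C):
    """Hypothetical class inheriting from both branches."""


mro = inspect.getmro(D)
```

``mro`` is ``(D, B, C, A, object)``.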
getcallargs(func[, *args][, **kwds])~
Bind the {args} and {kwds} to the argument names of the Python function or
method {func}, as if it was called with them. For bound methods, bind also the
first argument (typically named ``self``) to the associated instance. A dict
is returned, mapping the argument names (including the names of the ``*`` and
``**`` arguments, if any) to their values from {args} and {kwds}. In case of
invoking {func} incorrectly, i.e. whenever ``func(*args, **kwds)`` would raise
an exception because of incompatible signature, an exception of the same type
and the same or similar message is raised. For example:: >
>>> from inspect import getcallargs
>>> def f(a, b=1, *pos, **named):
... pass
>>> getcallargs(f, 1, 2, 3)
{'a': 1, 'named': {}, 'b': 2, 'pos': (3,)}
>>> getcallargs(f, a=2, x=4)
{'a': 2, 'named': {'x': 4}, 'b': 1, 'pos': ()}
>>> getcallargs(f)
Traceback (most recent call last):
...
TypeError: f() takes at least 1 argument (0 given)
<
.. versionadded:: 2.7
The interpreter stack
---------------------
When the following functions return "frame records," each record is a tuple of
six items: the frame object, the filename, the line number of the current line,
the function name, a list of lines of context from the source code, and the
index of the current line within that list.
.. note::
Keeping references to frame objects, as found in the first element of the frame
records these functions return, can cause your program to create reference
cycles. Once a reference cycle has been created, the lifespan of all objects
which can be accessed from the objects which form the cycle can become much
longer even if Python's optional cycle detector is enabled. If such cycles must
be created, it is important to ensure they are explicitly broken to avoid the
delayed destruction of objects and increased memory consumption which occurs.
Though the cycle detector will catch these, destruction of the frames (and local
variables) can be made deterministic by removing the cycle in a
finally clause. This is also important if the cycle detector was
disabled when Python was compiled or using gc.disable. For example:: >
    import inspect

    def handle_stackframe_without_leak():
        frame = inspect.currentframe()
        try:
            # do something with the frame
            pass
        finally:
            # remove the reference so that no cycle keeps the frame alive
            del frame
<
The optional {context} argument supported by most of these functions specifies
the number of lines of context to return, which are centered around the current
line.
getframeinfo(frame[, context])~
Get information about a frame or traceback object. A 5-tuple is returned, the
last five elements of the frame's frame record.
.. versionchanged:: 2.6
Returns a named tuple ``Traceback(filename, lineno, function,
code_context, index)``.
getouterframes(frame[, context])~
Get a list of frame records for a frame and all outer frames. These frames
represent the calls that lead to the creation of {frame}. The first entry in the
returned list represents {frame}; the last entry represents the outermost call
on {frame}'s stack.
getinnerframes(traceback[, context])~
Get a list of frame records for a traceback's frame and all inner frames. These
frames represent calls made as a consequence of {frame}. The first entry in the
list represents {traceback}; the last entry represents where the exception was
raised.
currentframe()~
Return the frame object for the caller's stack frame.
.. impl-detail:: >
This function relies on Python stack frame support in the interpreter,
which isn't guaranteed to exist in all implementations of Python. If
running in an implementation without Python stack frame support this
function returns ``None``.
<
stack([context])~
Return a list of frame records for the caller's stack. The first entry in the
returned list represents the caller; the last entry represents the outermost
call on the stack.
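A minimal sketch of reading function names out of the frame records. The functions ``inner`` and ``outer`` are hypothetical, and ``context`` is set to 0 so that no source lines are looked up:

```python
import inspect


def inner():
    records = inspect.stack(0)   # context=0: do not read source lines
    try:
        # Index 3 of each frame record is the function name.
        return [record[3] for record in records[:2]]
    finally:
        del records              # release the frame references promptly


def outer():
    return inner()


names = outer()
```

The first record describes the caller of ``stack`` itself, so ``names`` is ``['inner', 'outer']``.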
trace([context])~
Return a list of frame records for the stack between the current frame and the
frame in which an exception currently being handled was raised. The first
entry in the list represents the caller; the last entry represents where the
exception was raised.
==============================================================================
*py2stdlib-io*
io~
:synopsis: Core tools for working with streams.
The io (|py2stdlib-io|) module provides the Python interfaces to stream handling.
Under Python 2.x, this is proposed as an alternative to the built-in
file object, but in Python 3.x it is the default interface to
access files and streams.
.. note::
Since this module has been designed primarily for Python 3.x, you have to
be aware that all uses of "bytes" in this document refer to the
str type (of which bytes is an alias), and all uses
of "text" refer to the unicode type. Furthermore, those two
types are not interchangeable in the io (|py2stdlib-io|) APIs.
At the top of the I/O hierarchy is the abstract base class IOBase. It
defines the basic interface to a stream. Note, however, that there is no
separation between reading and writing to streams; implementations are allowed
to throw an IOError if they do not support a given operation.
Extending IOBase is RawIOBase which deals simply with the
reading and writing of raw bytes to a stream. FileIO subclasses
RawIOBase to provide an interface to files in the machine's
file system.
BufferedIOBase deals with buffering on a raw byte stream
(RawIOBase). Its subclasses, BufferedWriter,
BufferedReader, and BufferedRWPair, buffer streams that are
writable, readable, and both readable and writable, respectively.
BufferedRandom provides a buffered interface to random access
streams. BytesIO is a simple stream of in-memory bytes.
Another IOBase subclass, TextIOBase, deals with
streams whose bytes represent text, and handles encoding and decoding
from and to unicode strings. TextIOWrapper, which extends
it, is a buffered text interface to a buffered raw stream
(BufferedIOBase). Finally, StringIO (|py2stdlib-stringio|) is an in-memory
stream for unicode text.
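The relationships described above can be checked with ``issubclass``. This is only a sketch that restates the hierarchy as (subclass, base) pairs:

```python
import io

# (subclass, base class) pairs taken from the description above.
relations = [
    (io.RawIOBase, io.IOBase),
    (io.FileIO, io.RawIOBase),
    (io.BufferedIOBase, io.IOBase),
    (io.BufferedReader, io.BufferedIOBase),
    (io.BufferedWriter, io.BufferedIOBase),
    (io.BufferedRWPair, io.BufferedIOBase),
    (io.BufferedRandom, io.BufferedIOBase),
    (io.BytesIO, io.BufferedIOBase),
    (io.TextIOBase, io.IOBase),
    (io.TextIOWrapper, io.TextIOBase),
    (io.StringIO, io.TextIOBase),
]

hierarchy_holds = all(issubclass(child, parent)
                      for child, parent in relations)
```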
Argument names are not part of the specification, and only the arguments of
.open are intended to be used as keyword arguments.
Module Interface
----------------
DEFAULT_BUFFER_SIZE~
An int containing the default buffer size used by the module's buffered I/O
classes. .open uses the file's blksize (as obtained by
os.stat) if possible.
open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True)~
Open {file} and return a corresponding stream. If the file cannot be opened,
an IOError is raised.
{file} is either a string giving the name (and the path if the file isn't
in the current working directory) of the file to be opened or an integer
file descriptor of the file to be wrapped. (If a file descriptor is given,
for example, from os.open, it is closed when the returned I/O
object is closed, unless {closefd} is set to ``False``.)
{mode} is an optional string that specifies the mode in which the file is
opened. It defaults to ``'r'`` which means open for reading in text mode.
Other common values are ``'w'`` for writing (truncating the file if it
already exists), and ``'a'`` for appending (which on {some} Unix systems
means that {all} writes append to the end of the file regardless of the
current seek position). In text mode, if {encoding} is not specified the
encoding used is platform dependent. (For reading and writing raw bytes use
binary mode and leave {encoding} unspecified.) The available modes are:
========= ===============================================================
Character Meaning
--------- ---------------------------------------------------------------
``'r'`` open for reading (default)
``'w'`` open for writing, truncating the file first
``'a'`` open for writing, appending to the end of the file if it exists
``'b'`` binary mode
``'t'`` text mode (default)
``'+'`` open a disk file for updating (reading and writing)
``'U'`` universal newline mode (for backwards compatibility; should
not be used in new code)
========= ===============================================================
The default mode is ``'rt'`` (open for reading text). For binary random
access, the mode ``'w+b'`` opens and truncates the file to 0 bytes, while
``'r+b'`` opens the file without truncation.
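A sketch of the difference between ``'w+b'`` and ``'r+b'``, assuming a scratch file created with ``tempfile`` (the file name is generated, not significant):

```python
import io
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # 'w+b' truncates, then allows both writing and reading.
    with io.open(path, "w+b") as f:
        f.write(b"0123456789")
        f.seek(0)
        first = f.read()

    # 'r+b' keeps the existing contents and overwrites in place.
    with io.open(path, "r+b") as f:
        f.seek(2)
        f.write(b"XY")
        f.seek(0)
        patched = f.read()

    # Opening with 'w+b' again discards everything.
    with io.open(path, "w+b") as f:
        truncated = f.read()
finally:
    os.remove(path)
```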
Python distinguishes between files opened in binary and text modes, even when
the underlying operating system doesn't. Files opened in binary mode
(including ``'b'`` in the {mode} argument) return contents as bytes
objects without any decoding. In text mode (the default, or when ``'t'`` is
included in the {mode} argument), the contents of the file are returned as
unicode strings, the bytes having been first decoded using a
platform-dependent encoding or using the specified {encoding} if given.
{buffering} is an optional integer used to set the buffering policy.
Pass 0 to switch buffering off (only allowed in binary mode), 1 to select
line buffering (only usable in text mode), and an integer > 1 to indicate
the size of a fixed-size chunk buffer. When no {buffering} argument is
given, the default buffering policy works as follows:
* Binary files are buffered in fixed-size chunks; the size of the buffer
is chosen using a heuristic trying to determine the underlying device's
"block size" and falling back on DEFAULT_BUFFER_SIZE.
On many systems, the buffer will typically be 4096 or 8192 bytes long.
* "Interactive" text files (files for which isatty returns True)
use line buffering. Other text files use the policy described above
for binary files.
{encoding} is the name of the encoding used to decode or encode the file.
This should only be used in text mode. The default encoding is platform
dependent (whatever locale.getpreferredencoding returns), but any
encoding supported by Python can be used. See the codecs (|py2stdlib-codecs|) module for
the list of supported encodings.
{errors} is an optional string that specifies how encoding and decoding
errors are to be handled--this cannot be used in binary mode. Pass
``'strict'`` to raise a ValueError exception if there is an encoding
error (the default of ``None`` has the same effect), or pass ``'ignore'`` to
ignore errors. (Note that ignoring encoding errors can lead to data loss.)
``'replace'`` causes a replacement marker (such as ``'?'``) to be inserted
where there is malformed data. When writing, ``'xmlcharrefreplace'``
(replace with the appropriate XML character reference) or
``'backslashreplace'`` (replace with backslashed escape sequences) can be
used. Any other error handling name that has been registered with
codecs.register_error is also valid.
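The effect of the {errors} argument when decoding can be sketched as follows. The sample bytes are Latin-1 encoded and deliberately not valid UTF-8; the scratch file is an assumption of the example:

```python
import io
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with io.open(path, "wb") as f:
        f.write(b"caf\xe9 au lait")   # Latin-1 bytes, invalid as UTF-8

    try:
        with io.open(path, "r", encoding="utf-8") as f:
            f.read()
        strict_failed = False
    except ValueError:                # UnicodeDecodeError is a ValueError
        strict_failed = True

    with io.open(path, "r", encoding="utf-8", errors="replace") as f:
        replaced = f.read()
    with io.open(path, "r", encoding="utf-8", errors="ignore") as f:
        ignored = f.read()
finally:
    os.remove(path)
```

With ``'replace'`` the offending byte becomes U+FFFD; with ``'ignore'`` it is silently dropped, which is the data loss mentioned above.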
{newline} controls how universal newlines mode works (it only applies to text
mode). It can be ``None``, ``''``, ``'\n'``, ``'\r'``, or ``'\r\n'``. It
works as follows:
* On input, if {newline} is ``None``, universal newlines mode is enabled.
Lines in the input can end in ``'\n'``, ``'\r'``, or ``'\r\n'``, and these
are translated into ``'\n'`` before being returned to the caller. If it is
``''``, universal newline mode is enabled, but line endings are returned to
the caller untranslated. If it has any of the other legal values, input
lines are only terminated by the given string, and the line ending is
returned to the caller untranslated.
* On output, if {newline} is ``None``, any ``'\n'`` characters written are
translated to the system default line separator, os.linesep. If
{newline} is ``''``, no translation takes place. If {newline} is any of
the other legal values, any ``'\n'`` characters written are translated to
the given string.
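The newline rules above can be sketched with a scratch file containing all three line endings. The file and its contents are assumptions of the example:

```python
import io
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with io.open(path, "wb") as f:
        f.write(b"one\r\ntwo\rthree\n")

    # None: every line ending is recognized and translated to '\n'.
    with io.open(path, "r", encoding="ascii", newline=None) as f:
        translated = f.readlines()
    # '': every line ending is recognized but returned untranslated.
    with io.open(path, "r", encoding="ascii", newline="") as f:
        untranslated = f.readlines()
    # '\r\n': only that exact string terminates a line.
    with io.open(path, "r", encoding="ascii", newline="\r\n") as f:
        crlf_only = f.readlines()

    # On output, '\n' is translated to the given string.
    with io.open(path, "w", encoding="ascii", newline="\r\n") as f:
        f.write(u"x\ny\n")
    with io.open(path, "rb") as f:
        written = f.read()
finally:
    os.remove(path)
```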
If {closefd} is ``False`` and a file descriptor rather than a filename was
given, the underlying file descriptor will be kept open when the file is
closed. If a filename is given {closefd} has no effect and must be ``True``
(the default).
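A sketch of wrapping an existing descriptor with ``closefd=False``. The descriptor comes from ``tempfile.mkstemp`` here, which is just one possible source:

```python
import io
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"payload")
    os.lseek(fd, 0, os.SEEK_SET)

    with io.open(fd, "rb", closefd=False) as stream:
        data = stream.read()
    stream_closed = stream.closed

    # The stream object is closed, but the descriptor is still usable.
    size_after_close = os.fstat(fd).st_size
finally:
    os.close(fd)
    os.remove(path)
```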
The type of file object returned by the .open function depends on the
mode. When .open is used to open a file in a text mode (``'w'``,
``'r'``, ``'wt'``, ``'rt'``, etc.), it returns a subclass of
TextIOBase (specifically TextIOWrapper). When used to open
a file in a binary mode with buffering, the returned class is a subclass of
BufferedIOBase. The exact class varies: in read binary mode, it
returns a BufferedReader; in write binary and append binary modes,
it returns a BufferedWriter, and in read/write mode, it returns a
BufferedRandom. When buffering is disabled, the raw stream, a
subclass of RawIOBase, FileIO, is returned.
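The mapping from mode and buffering to the returned class can be observed directly. This sketch records the class name for a few combinations, using a scratch file:

```python
import io
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
kinds = {}
try:
    combinations = [("r", -1), ("rb", -1), ("wb", -1),
                    ("ab", -1), ("r+b", -1), ("rb", 0)]
    for mode, buffering in combinations:
        with io.open(path, mode, buffering=buffering) as stream:
            kinds[(mode, buffering)] = type(stream).__name__
finally:
    os.remove(path)
```

Only the last combination, with buffering disabled, yields the raw ``FileIO`` object.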
It is also possible to use a unicode or bytes string
as a file for both reading and writing. For unicode strings
StringIO (|py2stdlib-stringio|) can be used like a file opened in text mode,
and for bytes a BytesIO can be used like a
file opened in a binary mode.
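A brief sketch of both in-memory streams. It also shows that the text stream rejects bytes, in line with the rule that the two types are not interchangeable:

```python
import io

text_stream = io.StringIO(u"first\nsecond\n")
first_line = text_stream.readline()

byte_stream = io.BytesIO()
byte_stream.write(b"\x00\x01")
byte_stream.write(b"\x02")
payload = byte_stream.getvalue()

# Writing bytes to a text stream is a type error.
try:
    io.StringIO().write(b"bytes")
    rejected = False
except TypeError:
    rejected = True
```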
BlockingIOError~
Error raised when blocking would occur on a non-blocking stream. It inherits
IOError.
In addition to those of IOError, BlockingIOError has one
attribute:
characters_written~
An integer containing the number of characters written to the stream
before it blocked.
UnsupportedOperation~
An exception inheriting IOError and ValueError that is raised
when an unsupported operation is called on a stream.
I/O Base Classes
----------------
IOBase~
The abstract base class for all I/O classes, acting on streams of bytes.
There is no public constructor.
This class provides empty abstract implementations for many methods
that derived classes can override selectively; the default
implementations represent a file that cannot be read, written or
seeked.
Even though IOBase does not declare read, readinto,
or write because their signatures will vary, implementations and
clients should consider those methods part of the interface. Also,
implementations may raise an IOError when operations they do not
support are called.
The basic type used for binary data read from or written to a file is
bytes (also known as str). bytearray objects are
accepted too, and in some cases (such as readinto) required.
Text I/O classes work with unicode data.
Note that calling any method (even inquiries) on a closed stream is
undefined. Implementations may raise IOError in this case.
IOBase (and its subclasses) support the iterator protocol, meaning that an
IOBase object can be iterated over yielding the lines in a stream.
Lines are defined slightly differently depending on whether the stream is
a binary stream (yielding bytes), or a text stream (yielding
unicode strings). See the readline method below.
IOBase is also a context manager and therefore supports the
with statement. In this example, {file} is closed after the
with statement's suite is finished---even if an exception occurs:: >
with io.open('spam.txt', 'w') as file:
file.write(u'Spam and eggs!')
<
IOBase provides these data attributes and methods:
close()~
Flush and close this stream. This method has no effect if the file is
already closed. Once the file is closed, any operation on the file
(e.g. reading or writing) will raise a ValueError.
As a convenience, it is allowed to call this method more than once;
only the first call, however, will have an effect.
closed~
True if the stream is closed.
fileno()~
Return the underlying file descriptor (an integer) of the stream if it
exists. An IOError is raised if the IO object does not use a file
descriptor.
flush()~
Flush the write buffers of the stream if applicable. This does nothing
for read-only and non-blocking streams.
isatty()~
Return ``True`` if the stream is interactive (i.e., connected to
a terminal/tty device).
readable()~
Return ``True`` if the stream can be read from. If ``False``, read
will raise IOError.
readline(limit=-1)~
Read and return one line from the stream. If {limit} is specified, at
most {limit} bytes will be read.
The line terminator is always ``b'\n'`` for binary files; for text files,
the {newline} argument to .open can be used to select the line
terminator(s) recognized.
readlines(hint=-1)~
Read and return a list of lines from the stream. {hint} can be specified
to control the number of lines read: no more lines will be read if the
total size (in bytes/characters) of all lines so far exceeds {hint}.
seek(offset, whence=SEEK_SET)~
Change the stream position to the given byte {offset}. {offset} is
interpreted relative to the position indicated by {whence}. Values for
{whence} are:
* SEEK_SET or ``0`` -- start of the stream (the default);
{offset} should be zero or positive
* SEEK_CUR or ``1`` -- current stream position; {offset} may
be negative
* SEEK_END or ``2`` -- end of the stream; {offset} is usually
negative
Return the new absolute position.
.. versionadded:: 2.7
The ``SEEK_*`` constants
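The three {whence} values can be illustrated on an in-memory stream; ``BytesIO`` is used here only because it needs no file:

```python
import io

stream = io.BytesIO(b"0123456789")

pos_set = stream.seek(2)                 # from the start: position 2
pos_cur = stream.seek(3, io.SEEK_CUR)    # 3 past the current position: 5
pos_end = stream.seek(-4, io.SEEK_END)   # 4 before the end: 6
tail = stream.read()                     # b"6789"
```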
seekable()~
Return ``True`` if the stream supports random access. If ``False``,
seek, tell and truncate will raise IOError.
tell()~
Return the current stream position.
truncate(size=None)~
Resize the stream to the given {size} in bytes (or the current position
if {size} is not specified). The current stream position isn't changed.
This resizing can extend or reduce the current file size. In case of
extension, the contents of the new file area depend on the platform
(on most systems, additional bytes are zero-filled, on Windows they're
undetermined). The new file size is returned.
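A sketch of truncation on an in-memory stream, showing that the stream position is left where it was:

```python
import io

stream = io.BytesIO(b"0123456789")
stream.seek(4)

size_at_position = stream.truncate()     # no size given: cut at position 4
contents_after_first = stream.getvalue()

size_explicit = stream.truncate(2)       # explicit size
position_after = stream.tell()           # still 4, now past the end
contents_after_second = stream.getvalue()
```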
writable()~
Return ``True`` if the stream supports writing. If ``False``,
write and truncate will raise IOError.
writelines(lines)~
Write a list of lines to the stream. Line separators are not added, so it
is usual for each of the lines provided to have a line separator at the
end.
RawIOBase~
Base class for raw binary I/O. It inherits IOBase. There is no
public constructor.
Raw binary I/O typically provides low-level access to an underlying OS
device or API, and does not try to encapsulate it in high-level primitives
(this is left to Buffered I/O and Text I/O, described later in this page).
In addition to the attributes and methods from IOBase,
RawIOBase provides the following methods:
read(n=-1)~
Read up to {n} bytes from the object and return them. As a convenience,
if {n} is unspecified or -1, readall is called. Otherwise,
only one system call is ever made. Fewer than {n} bytes may be
returned if the operating system call returns fewer than {n} bytes.
If 0 bytes are returned, and {n} was not 0, this indicates end of file.
If the object is in non-blocking mode and no bytes are available,
``None`` is returned.
readall()~
Read and return all the bytes from the stream until EOF, using multiple
calls to the stream if necessary.
readinto(b)~
Read up to len(b) bytes into bytearray {b} and return the number of bytes
read.
write(b)~
Write the given bytes or bytearray object, {b}, to the underlying raw
stream and return the number of bytes written. This can be less than
``len(b)``, depending on specifics of the underlying raw stream, and
especially if it is in non-blocking mode. ``None`` is returned if the
raw stream is set not to block and no single byte could be readily
written to it.
BufferedIOBase~
Base class for binary streams that support some kind of buffering.
It inherits IOBase. There is no public constructor.
The main difference from RawIOBase is that the methods read,
readinto and write will try (respectively) to read as much
input as requested or to consume all given output, at the expense of
making perhaps more than one system call.
In addition, those methods can raise BlockingIOError if the
underlying raw stream is in non-blocking mode and cannot take or give
enough data; unlike their RawIOBase counterparts, they will
never return ``None``.
Besides, the read method does not have a default
implementation that defers to readinto.
A typical BufferedIOBase implementation should not inherit from a
RawIOBase implementation, but wrap one, like
BufferedWriter and BufferedReader do.
BufferedIOBase provides or overrides these members in addition to
those from IOBase:
raw~
The underlying raw stream (a RawIOBase instance) that
BufferedIOBase deals with. This is not part of the
BufferedIOBase API and may not exist on some implementations.
detach()~
Separate the underlying raw stream from the buffer and return it.
After the raw stream has been detached, the buffer is in an unusable
state.
Some buffers, like BytesIO, do not have the concept of a single
raw stream to return from this method. They raise
UnsupportedOperation.
.. versionadded:: 2.7
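A sketch of ``detach``. A ``BytesIO`` object stands in for the raw stream here, which is a convenience of the example rather than a typical setup:

```python
import io

raw = io.BytesIO(b"spam")            # stands in for a raw stream
reader = io.BufferedReader(raw)

detached = reader.detach()
same_object = detached is raw

# The buffer is unusable once its raw stream is gone.
try:
    reader.read()
    unusable = False
except ValueError:
    unusable = True

# BytesIO has no single raw stream to hand back.
try:
    io.BytesIO().detach()
    unsupported = False
except io.UnsupportedOperation:
    unsupported = True
```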
read(n=-1)~
Read and return up to {n} bytes. If the argument is omitted, ``None``, or
negative, data is read and returned until EOF is reached. An empty bytes
object is returned if the stream is already at EOF.
If the argument is positive, and the underlying raw stream is not
interactive, multiple raw reads may be issued to satisfy the byte count
(unless EOF is reached first). But for interactive raw streams, at most
one raw read will be issued, and a short result does not imply that EOF is
imminent.
A BlockingIOError is raised if the underlying raw stream is in
non-blocking mode, and has no data available at the moment.
read1(n=-1)~
Read and return up to {n} bytes, with at most one call to the underlying
raw stream's RawIOBase.read method. This can be useful if you
are implementing your own buffering on top of a BufferedIOBase
object.
readinto(b)~
Read up to len(b) bytes into bytearray {b} and return the number of bytes
read.
Like read, multiple reads may be issued to the underlying raw
stream, unless the latter is 'interactive'.
A BlockingIOError is raised if the underlying raw stream is in
non-blocking mode, and has no data available at the moment.
write(b)~
Write the given bytes or bytearray object, {b}, and return the number
of bytes written (never less than ``len(b)``, since if the write fails
an IOError will be raised). Depending on the actual
implementation, these bytes may be readily written to the underlying
stream, or held in a buffer for performance and latency reasons.
When in non-blocking mode, a BlockingIOError is raised if the
data needed to be written to the raw stream but it couldn't accept
all the data without blocking.
Raw File I/O
------------
FileIO(name, mode='r', closefd=True)~
FileIO represents an OS-level file containing bytes data.
It implements the RawIOBase interface (and therefore the
IOBase interface, too).
The {name} can be one of two things:
* a string representing the path to the file which will be opened;
* an integer representing the number of an existing OS-level file descriptor
to which the resulting FileIO object will give access.
The {mode} can be ``'r'``, ``'w'`` or ``'a'`` for reading (default), writing,
or appending. The file will be created if it doesn't exist when opened for
writing or appending; it will be truncated when opened for writing. Add a
``'+'`` to the mode to allow simultaneous reading and writing.
The read (when called with a positive argument), readinto
and write methods on this class will only make one system call.
In addition to the attributes and methods from IOBase and
RawIOBase, FileIO provides the following data
attributes and methods:
mode~
The mode as given in the constructor.
name~
The file name. This is the file descriptor of the file when no name is
given in the constructor.
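The two forms of {name} can be sketched as follows. This is a minimal illustration, assuming a scratch file created with tempfile; the path and variable names are hypothetical and not part of the io API.

```python
import io
import os
import tempfile

# Hypothetical scratch file; any writable path would do.
fd, path = tempfile.mkstemp()
os.close(fd)

# Form 1: open by name. Mode 'w' creates or truncates the file.
raw = io.FileIO(path, 'w')
written = raw.write(b'raw bytes')
raw.close()

# Form 2: wrap an existing OS-level file descriptor.
fd = os.open(path, os.O_RDONLY)
raw = io.FileIO(fd, 'r')
name = raw.name          # the descriptor, since no file name was given
data = raw.read()        # no argument: read until EOF
raw.close()              # closefd defaults to True, so fd is closed too

os.remove(path)
```

Because closefd is left at its default, closing the FileIO object also closes the descriptor it wraps.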
Buffered Streams
----------------
In many situations, buffered I/O streams will provide higher performance
(bandwidth and latency) than raw I/O streams. Their API is also more usable.
BytesIO([initial_bytes])~
A stream implementation using an in-memory bytes buffer. It inherits
BufferedIOBase.
The optional argument {initial_bytes} is a bytes object that supplies the
initial contents of the buffer.
BytesIO provides or overrides these methods in addition to those
from BufferedIOBase and IOBase:
getvalue()~
Return ``bytes`` containing the entire contents of the buffer.
read1()~
In BytesIO, this is the same as read.
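A short sketch of the BytesIO methods listed above; the variable names are illustrative only.

```python
import io

buf = io.BytesIO(b'abc')     # initial_bytes seeds the buffer
buf.seek(0, 2)               # whence=2: move to the end before appending
buf.write(b'def')
contents = buf.getvalue()    # whole buffer, regardless of position

buf.seek(0)
head = buf.read1(2)          # for BytesIO this behaves like read(2)
tail = buf.read()            # the remainder, up to EOF
```

Note that getvalue() does not depend on the current stream position, whereas read() does.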
BufferedReader(raw, buffer_size=DEFAULT_BUFFER_SIZE)~
A buffer providing higher-level access to a readable, sequential
RawIOBase object. It inherits BufferedIOBase.
When reading data from this object, a larger amount of data may be
requested from the underlying raw stream, and kept in an internal buffer.
The buffered data can then be returned directly on subsequent reads.
The constructor creates a BufferedReader for the given readable
{raw} stream and {buffer_size}. If {buffer_size} is omitted,
DEFAULT_BUFFER_SIZE is used.
BufferedReader provides or overrides these methods in addition to
those from BufferedIOBase and IOBase:
peek([n])~
Return bytes from the stream without advancing the position. At most one
single read on the raw stream is done to satisfy the call. The number of
bytes returned may be less or more than requested.
read([n])~
Read and return {n} bytes. If {n} is not given or is negative, read until
EOF or until the read call would block in non-blocking mode.
read1(n)~
Read and return up to {n} bytes with only one call on the raw stream. If
at least one byte is buffered, only buffered bytes are returned.
Otherwise, one raw stream read call is made.
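The difference between peek and read can be sketched as below. As an assumption for the sake of a self-contained example, a BytesIO object stands in for the raw stream; a real program would more likely pass a FileIO object.

```python
import io

raw = io.BytesIO(b'hello world')     # stand-in for a raw stream (assumption)
reader = io.BufferedReader(raw, buffer_size=8)

peeked = reader.peek(1)              # may return more than one byte
first = reader.read(5)               # the position only advances here
rest = reader.read()                 # no argument: read until EOF
```

Since peek may return more or fewer bytes than requested, callers that need an exact prefix should slice the result, for example `peeked[:1]`.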
BufferedWriter(raw, buffer_size=DEFAULT_BUFFER_SIZE)~
A buffer providing higher-level access to a writeable, sequential
RawIOBase object. It inherits BufferedIOBase.
When writing to this object, data is normally held in an internal
buffer. The buffer will be written out to the underlying RawIOBase
object under various conditions, including:
* when the buffer gets too small for all pending data;
* when flush() is called;
* when a seek() is requested (for BufferedRandom objects);
* when the BufferedWriter object is closed or destroyed.
The constructor creates a BufferedWriter for the given writeable
{raw} stream. If the {buffer_size} is not given, it defaults to
DEFAULT_BUFFER_SIZE.
A third argument, {max_buffer_size}, is supported, but unused and deprecated.
BufferedWriter provides or overrides these methods in addition to
those from BufferedIOBase and IOBase:
flush()~
Force bytes held in the buffer into the raw stream. A
BlockingIOError should be raised if the raw stream blocks.
write(b)~
Write the bytes or bytearray object, {b}, and return the number of bytes
written. When in non-blocking mode, a BlockingIOError is raised
if the buffer needs to be written out but the raw stream blocks.
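The buffering behaviour described above can be observed with a small sketch. It assumes a BytesIO object standing in for the raw stream, so that the bytes that have actually reached it can be inspected.

```python
import io

raw = io.BytesIO()                   # stand-in for a raw stream (assumption)
writer = io.BufferedWriter(raw, buffer_size=16)

count = writer.write(b'abc')         # small write: held in the buffer
before_flush = raw.getvalue()        # nothing has reached raw yet
writer.flush()                       # force the buffered bytes out
after_flush = raw.getvalue()
```

Until flush() is called (or one of the other listed conditions occurs), the raw stream does not see the three bytes.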
BufferedRWPair(reader, writer, buffer_size=DEFAULT_BUFFER_SIZE)~
A buffered I/O object giving a combined, higher-level access to two
sequential RawIOBase objects: one readable, the other writeable.
It is useful for pairs of unidirectional communication channels
(pipes, for instance). It inherits BufferedIOBase.
{reader} and {writer} are RawIOBase objects that are readable and
writeable respectively. If the {buffer_size} is omitted it defaults to
DEFAULT_BUFFER_SIZE.
A fourth argument, {max_buffer_size}, is supported, but unused and
deprecated.
BufferedRWPair implements all of BufferedIOBase's methods
except for BufferedIOBase.detach, which raises
UnsupportedOperation.
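As a hedged illustration of the pipe use case mentioned above, the sketch below joins the two ends of a single os.pipe() into one BufferedRWPair. Connecting both ends of the same pipe is only a convenience for a self-contained example; in practice the two ends usually belong to different channels.

```python
import io
import os

read_fd, write_fd = os.pipe()        # one readable end, one writeable end
pair = io.BufferedRWPair(io.FileIO(read_fd, 'r'),
                         io.FileIO(write_fd, 'w'))

pair.write(b'ping')
pair.flush()                         # move the bytes into the pipe
reply = pair.read(4)                 # comes back through the reader side
pair.close()                         # closes both raw objects
```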
BufferedRandom(raw, buffer_size=DEFAULT_BUFFER_SIZE)~
A buffered interface to random access streams. It inherits
BufferedReader and BufferedWriter, and further supports
seek and tell functionality.
The constructor creates a reader and writer for a seekable raw stream, given
in the first argument. If the {buffer_size} is omitted it defaults to
DEFAULT_BUFFER_SIZE.
A third argument, {max_buffer_size}, is supported, but unused and deprecated.
BufferedRandom is capable of anything BufferedReader or
BufferedWriter can do.
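A minimal sketch of mixed reading, writing, and seeking. It assumes a seekable BytesIO object in place of a raw file; calling io.open() with a mode such as 'r+b' would normally build the BufferedRandom object instead.

```python
import io

raw = io.BytesIO()                   # seekable stand-in for a raw stream
f = io.BufferedRandom(raw)

f.write(b'abcdef')
f.seek(2)                            # pending writes are flushed first
middle = f.read(2)

f.seek(0)
f.write(b'XY')                       # overwrite the first two bytes
f.flush()
final = raw.getvalue()
```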
Text I/O
--------
TextIOBase~
Base class for text streams. This class provides a unicode character
and line based interface to stream I/O. There is no readinto
method because Python's unicode strings are immutable.
It inherits IOBase. There is no public constructor.
TextIOBase provides or overrides these data attributes and
methods in addition to those from IOBase:
encoding~
The name of the encoding used to decode the stream's bytes into
strings, and to encode strings into bytes.
errors~
The error setting of the decoder or encoder.
newlines~
A string, a tuple of strings, or ``None``, indicating the newlines
translated so far. Depending on the implementation and the initial
constructor flags, this may not be available.
buffer~
The underlying binary buffer (a BufferedIOBase instance) that
TextIOBase deals with. This is not part of the
TextIOBase API and may not exist on some implementations.
detach()~
Separate the underlying binary buffer from the TextIOBase and
return it.
After the underlying buffer has been detached, the TextIOBase is
in an unusable state.
Some TextIOBase implementations, like StringIO, may not
have the concept of an underlying buffer and calling this method will
raise UnsupportedOperation.
.. versionadded:: 2.7
read(n)~
Read and return at most {n} characters from the stream as a single
unicode. If {n} is negative or ``None``, reads until EOF.
readline()~
Read until newline or EOF and return a single ``unicode``. If the
stream is already at EOF, an empty string is returned.
write(s)~
Write the unicode string {s} to the stream and return the
number of characters written.
TextIOWrapper(buffer, encoding=None, errors=None, newline=None, line_buffering=False)~
A buffered text stream over a BufferedIOBase binary stream.
It inherits TextIOBase.
{encoding} gives the name of the encoding that the stream will be decoded or
encoded with. It defaults to locale.getpreferredencoding.
{errors} is an optional string that specifies how encoding and decoding
errors are to be handled. Pass ``'strict'`` to raise a ValueError
exception if there is an encoding error (the default of ``None`` has the same
effect), or pass ``'ignore'`` to ignore errors. (Note that ignoring encoding
errors can lead to data loss.) ``'replace'`` causes a replacement marker
(such as ``'?'``) to be inserted where there is malformed data. When
writing, ``'xmlcharrefreplace'`` (replace with the appropriate XML character
reference) or ``'backslashreplace'`` (replace with backslashed escape
sequences) can be used. Any other error handling name that has been
registered with codecs.register_error is also valid.
{newline} can be ``None``, ``''``, ``'\n'``, ``'\r'``, or ``'\r\n'``. It
controls the handling of line endings. If it is ``None``, universal newlines
is enabled. With this enabled, on input, the line endings ``'\n'``,
``'\r'``, or ``'\r\n'`` are translated to ``'\n'`` before being returned to
the caller. Conversely, on output, ``'\n'`` is translated to the system
default line separator, os.linesep. If {newline} is any other of its
legal values, that newline becomes the newline when the file is read and it
is returned untranslated. On output, ``'\n'`` is converted to the {newline}.
If {line_buffering} is ``True``, flush is implied when a call to
write contains a newline character.
TextIOWrapper provides one attribute in addition to those of
TextIOBase and its parents:
line_buffering~
Whether line buffering is enabled.
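The effect of {encoding}, {errors} and {newline} can be sketched as follows. BytesIO objects are used as the underlying binary streams purely so that the example is self-contained.

```python
import io

# Writing: '\n' is converted to the requested newline, then encoded.
raw = io.BytesIO()
out = io.TextIOWrapper(raw, encoding='utf-8', newline='\r\n')
out.write(u'caf\xe9\n')
out.flush()
encoded = raw.getvalue()

# Reading with newline=None: universal newlines are translated to '\n'.
mixed = io.BytesIO(b'a\r\nb\rc\n')
decoded = io.TextIOWrapper(mixed, encoding='ascii').read()

# errors='replace': malformed input becomes a replacement marker.
bad = io.BytesIO(b'ab\xffcd')
replaced = io.TextIOWrapper(bad, encoding='ascii', errors='replace').read()
```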
StringIO(initial_value=u'', newline=None)~
An in-memory stream for unicode text. It inherits TextIOWrapper.
The initial value of the buffer (an empty unicode string by default) can
be set by providing {initial_value}. The {newline} argument works like
that of TextIOWrapper. The default is to do no newline
translation.
StringIO provides this method in addition to those from
TextIOWrapper and its parents:
getvalue()~
Return a ``unicode`` containing the entire contents of the buffer at any
time before the StringIO object's close method is
called.
Example usage:: >
import io
output = io.StringIO()
output.write(u'First line.\n')
output.write(u'Second line.\n')
# Retrieve file contents -- this will be
# u'First line.\nSecond line.\n'
contents = output.getvalue()
# Close object and discard memory buffer --
# .getvalue() will now raise an exception.
output.close()
<
IncrementalNewlineDecoder~
A helper codec that decodes newlines for universal newlines mode. It
inherits codecs.IncrementalDecoder.
==============================================================================
*py2stdlib-itertools*
itertools~
:synopsis: Functions creating iterators for efficient looping.
.. versionadded:: 2.3
This module implements a number of iterator building blocks inspired
by constructs from APL, Haskell, and SML. Each has been recast in a form
suitable for Python.
The module standardizes a core set of fast, memory efficient tools that are
useful by themselves or in combination. Together, they form an "iterator
algebra" making it possible to construct specialized tools succinctly and
efficiently in pure Python.
For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
sequence ``f(0), f(1), ...``. The same effect can be achieved in Python
by combining imap and count to form ``imap(f, count())``.
These tools and their built-in counterparts also work well with the high-speed
functions in the operator (|py2stdlib-operator|) module. For example, the multiplication
operator can be mapped across two vectors to form an efficient dot-product:
``sum(imap(operator.mul, vector1, vector2))``.
Infinite Iterators:
==================  =================  =================================================  =========================================
Iterator            Arguments          Results                                            Example
==================  =================  =================================================  =========================================
count               start, [step]      start, start+step, start+2*step, ...               ``count(10) --> 10 11 12 13 14 ...``
cycle               p                  p0, p1, ... plast, p0, p1, ...                     ``cycle('ABCD') --> A B C D A B C D ...``
repeat              elem [,n]          elem, elem, elem, ... endlessly or up to n times   ``repeat(10, 3) --> 10 10 10``
==================  =================  =================================================  =========================================
Iterators terminating on the shortest input sequence:
====================  ============================  =================================================  =============================================================
Iterator              Arguments                     Results                                            Example
====================  ============================  =================================================  =============================================================
chain                 p, q, ...                     p0, p1, ... plast, q0, q1, ...                     ``chain('ABC', 'DEF') --> A B C D E F``
compress              data, selectors               (d[0] if s[0]), (d[1] if s[1]), ...                ``compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F``
dropwhile             pred, seq                     seq[n], seq[n+1], starting when pred fails         ``dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1``
groupby               iterable[, keyfunc]           sub-iterators grouped by value of keyfunc(v)
ifilter               pred, seq                     elements of seq where pred(elem) is True           ``ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9``
ifilterfalse          pred, seq                     elements of seq where pred(elem) is False          ``ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8``
islice                seq, [start,] stop [, step]   elements from seq[start:stop:step]                 ``islice('ABCDEFG', 2, None) --> C D E F G``
imap                  func, p, q, ...               func(p0, q0), func(p1, q1), ...                    ``imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000``
starmap               func, seq                     func(*seq[0]), func(*seq[1]), ...                  ``starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000``
tee                   it, n                         it1, it2, ... itn (splits one iterator into n)
takewhile             pred, seq                     seq[0], seq[1], until pred fails                   ``takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4``
izip                  p, q, ...                     (p[0], q[0]), (p[1], q[1]), ...                    ``izip('ABCD', 'xy') --> Ax By``
izip_longest          p, q, ...                     (p[0], q[0]), (p[1], q[1]), ...                    ``izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-``
====================  ============================  =================================================  =============================================================
Combinatoric generators:
==============================================  ====================  =============================================================
Iterator                                        Arguments             Results
==============================================  ====================  =============================================================
product                                         p, q, ... [repeat=1]  cartesian product, equivalent to a nested for-loop
permutations                                    p[, r]                r-length tuples, all possible orderings, no repeated elements
combinations                                    p, r                  r-length tuples, in sorted order, no repeated elements
combinations_with_replacement                   p, r                  r-length tuples, in sorted order, with repeated elements
|
``product('ABCD', repeat=2)``                                         ``AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD``
``permutations('ABCD', 2)``                                           ``AB AC AD BA BC BD CA CB CD DA DB DC``
``combinations('ABCD', 2)``                                           ``AB AC AD BC BD CD``
``combinations_with_replacement('ABCD', 2)``                          ``AA AB AC AD BB BC BD CC CD DD``
==============================================  ====================  =============================================================
Itertool functions
------------------
The following module functions all construct and return iterators. Some provide
streams of infinite length, so they should only be accessed by functions or
loops that truncate the stream.
chain(*iterables)~
Make an iterator that returns elements from the first iterable until it is
exhausted, then proceeds to the next iterable, until all of the iterables are
exhausted. Used for treating consecutive sequences as a single sequence.
Equivalent to:: >
    def chain(*iterables):
        # chain('ABC', 'DEF') --> A B C D E F
        for it in iterables:
            for element in it:
                yield element
<
itertools.chain.from_iterable(iterable)~
Alternate constructor for chain. Gets chained inputs from a
single iterable argument that is evaluated lazily. Equivalent to:: >
    def from_iterable(iterables):
        # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
        for it in iterables:
            for element in it:
                yield element
<
.. versionadded:: 2.6
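The practical difference between the two constructors can be sketched as follows. The endless_rows generator is a hypothetical helper used only to show that from_iterable consumes its argument lazily.

```python
from itertools import chain, islice

rows = [[1, 2], [3], [], [4, 5]]
flat_args = list(chain(*rows))                # rows is unpacked up front
flat_lazy = list(chain.from_iterable(rows))   # rows is consumed lazily

def endless_rows():
    """Hypothetical infinite source of two-element rows."""
    n = 0
    while True:
        yield (n, n + 1)
        n += 2

# chain(*endless_rows()) would never return, because argument unpacking
# must exhaust the generator; from_iterable reads one row at a time.
first_five = list(islice(chain.from_iterable(endless_rows()), 5))
```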
combinations(iterable, r)~
Return {r} length subsequences of elements from the input {iterable}.
Combinations are emitted in lexicographic sort order. So, if the
input {iterable} is sorted, the combination tuples will be produced
in sorted order.
Elements are treated as unique based on their position, not on their
value. So if the input elements are unique, there will be no repeat
values in each combination.
Equivalent to:: >
    def combinations(iterable, r):
        # combinations('ABCD', 2) --> AB AC AD BC BD CD
        # combinations(range(4), 3) --> 012 013 023 123
        pool = tuple(iterable)
        n = len(pool)
        if r > n:
            return
        indices = range(r)
        yield tuple(pool[i] for i in indices)
        while True:
            for i in reversed(range(r)):
                if indices[i] != i + n - r:
                    break
            else:
                return
            indices[i] += 1
            for j in range(i+1, r):
                indices[j] = indices[j-1] + 1
            yield tuple(pool[i] for i in indices)
<
The code for combinations can also be expressed as a subsequence
of permutations after filtering entries where the elements are not
in sorted order (according to their position in the input pool):: >
    def combinations(iterable, r):
        pool = tuple(iterable)
        n = len(pool)
        for indices in permutations(range(n), r):
            if sorted(indices) == list(indices):
                yield tuple(pool[i] for i in indices)
<
The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
or zero when ``r > n``.
.. versionadded:: 2.6
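The counting formula and the position-based treatment of elements can be checked with a short sketch; the sample pools are arbitrary.

```python
from itertools import combinations
from math import factorial

pool = 'ABCDE'
n, r = len(pool), 3
triples = list(combinations(pool, r))
expected = factorial(n) // factorial(r) // factorial(n - r)

# Elements are distinguished by position, not value, so equal values
# in the input can yield equal-looking tuples in the output.
repeated = list(combinations('AAB', 2))

# r > n produces no tuples at all.
too_long = list(combinations('AB', 3))
```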
combinations_with_replacement(iterable, r)~
Return {r} length subsequences of elements from the input {iterable}
allowing individual elements to be repeated more than once.
Combinations are emitted in lexicographic sort order. So, if the
input {iterable} is sorted, the combination tuples will be produced
in sorted order.
Elements are treated as unique based on their position, not on their
value. So if the input elements are unique, the generated combinations
will also be unique.
Equivalent to:: >
    def combinations_with_replacement(iterable, r):
        # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
        pool = tuple(iterable)
        n = len(pool)
        if not n and r:
            return
        indices = [0] * r
        yield tuple(pool[i] for i in indices)
        while True:
            for i in reversed(range(r)):
                if indices[i] != n - 1:
                    break
            else:
                return
            indices[i:] = [indices[i] + 1] * (r - i)
            yield tuple(pool[i] for i in indices)
<
The code for combinations_with_replacement can also be expressed as
a subsequence of product after filtering entries where the elements
are not in sorted order (according to their position in the input pool):: >
    def combinations_with_replacement(iterable, r):
        pool = tuple(iterable)
        n = len(pool)
        for indices in product(range(n), repeat=r):
            if sorted(indices) == list(indices):
                yield tuple(pool[i] for i in indices)
<
The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
.. versionadded:: 2.7
compress(data, selectors)~
Make an iterator that filters elements from {data} returning only those that
have a corresponding element in {selectors} that evaluates to ``True``.
Stops when either the {data} or {selectors} iterables has been exhausted.
Equivalent to:: >
    def compress(data, selectors):
        # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
        return (d for d, s in izip(data, selectors) if s)
<
.. versionadded:: 2.7
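A brief sketch of the stopping rule described above: when {selectors} is shorter than {data}, the remaining data elements are never examined. The sample lists are arbitrary.

```python
from itertools import compress

data = ['mon', 'tue', 'wed', 'thu', 'fri']
selectors = [True, False, True]          # shorter than data

picked = list(compress(data, selectors)) # 'thu' and 'fri' are not considered
```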
count(start=0, step=1)~
Make an iterator that returns evenly spaced values starting with {start}. Often
used as an argument to imap to generate consecutive data points.
It is also used with izip to add sequence numbers. Equivalent to:: >
    def count(start=0, step=1):
        # count(10) --> 10 11 12 13 14 ...
        # count(2.5, 0.5) --> 2.5 3.0 3.5 ...
        n = start
        while True:
            yield n
            n += step
<
When counting with floating point numbers, better accuracy can sometimes be
achieved by substituting multiplicative code such as: ``(start + step * i
for i in count())``.
.. versionchanged:: 2.7
added {step} argument and allowed non-integer arguments.
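The floating point remark can be illustrated with a small sketch. It compares a value obtained by stepping count with the same value computed multiplicatively; the choice of 0.1 as the step is only an example.

```python
from itertools import count, islice

first = list(islice(count(2.5, 0.5), 3))

# Tenth step reached through count's own stepping, which may
# accumulate rounding error along the way.
additive = next(islice(count(0.0, 0.1), 10, None))

# Same position computed with a single multiplication per item.
multiplicative = next(islice((0.0 + 0.1 * i for i in count()), 10, None))
```

The two results agree to within rounding error, but only the multiplicative form avoids carrying error forward from one item to the next.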
cycle(iterable)~
Make an iterator returning elements from the iterable and saving a copy of each.
When the iterable is exhausted, return elements from the saved copy. Repeats
indefinitely. Equivalent to:: >
    def cycle(iterable):
        # cycle('ABCD') --> A B C D A B C D A B C D ...
        saved = []
        for element in iterable:
            yield element
            saved.append(element)
        while saved:
            for element in saved:
                yield element
<
Note, this member of the toolkit may require significant auxiliary storage
(depending on the length of the iterable).
dropwhile(predicate, iterable)~
Make an iterator that drops elements from the iterable as long as the predicate
is true; afterwards, returns every element. Note, the iterator does not produce
{any} output until the predicate first becomes false, so it may have a lengthy
start-up time. Equivalent to:: >
    def dropwhile(predicate, iterable):
        # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
        iterable = iter(iterable)
        for x in iterable:
            if not predicate(x):
                yield x
                break
        for x in iterable:
            yield x
<
groupby(iterable[, key])~
Make an iterator that returns consecutive keys and groups from the {iterable}.
The {key} is a function computing a key value for each element. If not
specified or is ``None``, {key} defaults to an identity function and returns
the element unchanged. Generally, the iterable needs to already be sorted on
the same key function.
The operation of groupby is similar to the ``uniq`` filter in Unix. It
generates a break or new group every time the value of the key function changes
(which is why it is usually necessary to have sorted the data using the same key
function). That behavior differs from SQL's GROUP BY which aggregates common
elements regardless of their input order.
The returned group is itself an iterator that shares the underlying iterable
with groupby. Because the source is shared, when the groupby
object is advanced, the previous group is no longer visible. So, if that data
is needed later, it should be stored as a list:: >
    groups = []
    uniquekeys = []
    data = sorted(data, key=keyfunc)
    for k, g in groupby(data, keyfunc):
        groups.append(list(g))      # Store group iterator as a list
        uniquekeys.append(k)
<
groupby is equivalent to:: >
    class groupby(object):
        # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
        # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
        def __init__(self, iterable, key=None):
            if key is None:
                key = lambda x: x
            self.keyfunc = key
            self.it = iter(iterable)
            self.tgtkey = self.currkey = self.currvalue = object()
        def __iter__(self):
            return self
        def next(self):
            while self.currkey == self.tgtkey:
                self.currvalue = next(self.it)    # Exit on StopIteration
                self.currkey = self.keyfunc(self.currvalue)
            self.tgtkey = self.currkey
            return (self.currkey, self._grouper(self.tgtkey))
        def _grouper(self, tgtkey):
            while self.currkey == tgtkey:
                yield self.currvalue
                self.currvalue = next(self.it)    # Exit on StopIteration
                self.currkey = self.keyfunc(self.currvalue)
<
.. versionadded:: 2.4
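The need to sort first, and the ``uniq``-like behaviour on unsorted input, can be seen in a small sketch. The word list and the first_letter key function are illustrative only.

```python
from itertools import groupby

words = ['apple', 'avocado', 'banana', 'apricot', 'blueberry']

def first_letter(word):
    """Hypothetical key function: group words by their initial."""
    return word[0]

# Unsorted input: a new group starts every time the key changes.
unsorted_keys = [k for k, g in groupby(words, first_letter)]

# Sorted on the same key: one group per distinct key value.
ordered = sorted(words, key=first_letter)
grouped = dict((k, list(g)) for k, g in groupby(ordered, first_letter))
```

Each group is converted to a list immediately, because it becomes unavailable once groupby advances to the next key.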
ifilter(predicate, iterable)~
Make an iterator that filters elements from iterable returning only those for
which the predicate is ``True``. If {predicate} is ``None``, return the items
that are true. Equivalent to:: >
    def ifilter(predicate, iterable):
        # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
        if predicate is None:
            predicate = bool
        for x in iterable:
            if predicate(x):
                yield x
<
ifilterfalse(predicate, iterable)~
Make an iterator that filters elements from iterable returning only those for
which the predicate is ``False``. If {predicate} is ``None``, return the items
that are false. Equivalent to:: >
    def ifilterfalse(predicate, iterable):
        # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
        if predicate is None:
            predicate = bool
        for x in iterable:
            if not predicate(x):
                yield x
<
imap(function, *iterables)~
Make an iterator that computes the function using arguments from each of the
iterables. If {function} is set to ``None``, then imap returns the
arguments as a tuple. Like map but stops when the shortest iterable is
exhausted instead of filling in ``None`` for shorter iterables. The reason for
the difference is that infinite iterator arguments are typically an error for
map (because the output is fully evaluated) but represent a common and
useful way of supplying arguments to imap. Equivalent to:: >
    def imap(function, *iterables):
        # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
        iterables = map(iter, iterables)
        while True:
            args = [next(it) for it in iterables]
            if function is None:
                yield tuple(args)
            else:
                yield function(*args)
<
islice(iterable, [start,] stop [, step])~
Make an iterator that returns selected elements from the iterable. If {start} is
non-zero, then elements from the iterable are skipped until start is reached.
Afterward, elements are returned consecutively unless {step} is set higher than
one which results in items being skipped. If {stop} is ``None``, then iteration
continues until the iterator is exhausted, if at all; otherwise, it stops at the
specified position. Unlike regular slicing, islice does not support
negative values for {start}, {stop}, or {step}. Can be used to extract related
fields from data where the internal structure has been flattened (for example, a
multi-line report may list a name field on every third line). Equivalent to:: >
    def islice(iterable, *args):
        # islice('ABCDEFG', 2) --> A B
        # islice('ABCDEFG', 2, 4) --> C D
        # islice('ABCDEFG', 2, None) --> C D E F G
        # islice('ABCDEFG', 0, None, 2) --> A C E G
        s = slice(*args)
        it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
        nexti = next(it)
        for i, element in enumerate(iterable):
            if i == nexti:
                yield element
                nexti = next(it)
<
If {start} is ``None``, then iteration starts at zero. If {step} is ``None``,
then the step defaults to one.
.. versionchanged:: 2.5
accept ``None`` values for default {start} and {step}.
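The flattened-report use case mentioned above can be sketched as follows; the report contents are made up for the example.

```python
from itertools import islice

# Hypothetical flattened report: three lines per record.
report = ['name: ann', 'age: 31', 'city: oslo',
          'name: bob', 'age: 27', 'city: rome']

names = list(islice(report, 0, None, 3))    # every third line from line 0
ages = list(islice(report, 1, None, 3))     # every third line from line 1
first_two = list(islice(report, 2))         # a single argument is stop
```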
izip(*iterables)~
Make an iterator that aggregates elements from each of the iterables. Like
zip except that it returns an iterator instead of a list. Used for
lock-step iteration over several iterables at a time. Equivalent to:: >
    def izip(*iterables):
        # izip('ABCD', 'xy') --> Ax By
        iterables = map(iter, iterables)
        while iterables:
            yield tuple(map(next, iterables))
<
.. versionchanged:: 2.4
When no iterables are specified, returns a zero length iterator instead of
raising a TypeError exception.
The left-to-right evaluation order of the iterables is guaranteed. This
makes possible an idiom for clustering a data series into n-length groups
using ``izip(*[iter(s)]*n)``.
izip should only be used with unequal length inputs when you don't
care about trailing, unmatched values from the longer iterables. If those
values are important, use izip_longest instead.
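The clustering idiom, and the loss of trailing values, can be sketched as below. The import fallback is an assumption added so that the sketch also runs on interpreters where these functions are named zip and zip_longest; it is not part of the module described here.

```python
try:
    from itertools import izip, izip_longest
except ImportError:
    # Assumption: on newer interpreters the equivalents are zip/zip_longest.
    izip = zip
    from itertools import zip_longest as izip_longest

s = 'ABCDEFGH'
n = 3

# The same iterator object appears n times, so each output tuple pulls
# n consecutive items from it (left-to-right evaluation is guaranteed).
clusters = list(izip(*[iter(s)] * n))

# izip drops the unmatched 'G' and 'H'; izip_longest pads them instead.
padded = list(izip_longest(*[iter(s)] * n, fillvalue='-'))
```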
izip_longest(*iterables[, fillvalue])~
Make an iterator that aggregates elements from each of the iterables. If the
iterables are of uneven length, missing values are filled-in with {fillvalue}.
Iteration continues until the longest iterable is exhausted. Equivalent to:: >
    def izip_longest(*args, **kwds):
        # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
        fillvalue = kwds.get('fillvalue')
        def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
            yield counter()         # yields the fillvalue, or raises IndexError
        fillers = repeat(fillvalue)
        iters = [chain(it, sentinel(), fillers) for it in args]
        try:
            for tup in izip(*iters):
                yield tup
        except IndexError:
            pass
<
If one of the iterables is potentially infinite, then the
izip_longest function should be wrapped with something that limits
the number of calls (for example islice or takewhile). If
not specified, {fillvalue} defaults to ``None``.
.. versionadded:: 2.6
permutations(iterable[, r])~
Return successive {r} length permutations of elements in the {iterable}.
If {r} is not specified or is ``None``, then {r} defaults to the length
of the {iterable} and all possible full-length permutations
are generated.
Permutations are emitted in lexicographic sort order. So, if the
input {iterable} is sorted, the permutation tuples will be produced
in sorted order.
Elements are treated as unique based on their position, not on their
value. So if the input elements are unique, there will be no repeat
values in each permutation.
Equivalent to:: >
    def permutations(iterable, r=None):
        # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
        # permutations(range(3)) --> 012 021 102 120 201 210
        pool = tuple(iterable)
        n = len(pool)
        r = n if r is None else r
        if r > n:
            return
        indices = range(n)
        cycles = range(n, n-r, -1)
        yield tuple(pool[i] for i in indices[:r])
        while n:
            for i in reversed(range(r)):
                cycles[i] -= 1
                if cycles[i] == 0:
                    indices[i:] = indices[i+1:] + indices[i:i+1]
                    cycles[i] = n - i
                else:
                    j = cycles[i]
                    indices[i], indices[-j] = indices[-j], indices[i]
                    yield tuple(pool[i] for i in indices[:r])
                    break
            else:
                return
<
The code for permutations can also be expressed as a subsequence of
product, filtered to exclude entries with repeated elements (those
from the same position in the input pool):: >
    def permutations(iterable, r=None):
        pool = tuple(iterable)
        n = len(pool)
        r = n if r is None else r
        for indices in product(range(n), repeat=r):
            if len(set(indices)) == r:
                yield tuple(pool[i] for i in indices)
<
The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
or zero when ``r > n``.
.. versionadded:: 2.6
product(*iterables[, repeat])~
Cartesian product of input iterables.
Equivalent to nested for-loops in a generator expression. For example,
``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
The nested loops cycle like an odometer with the rightmost element advancing
on every iteration. This pattern creates a lexicographic ordering so that if
the input's iterables are sorted, the product tuples are emitted in sorted
order.
To compute the product of an iterable with itself, specify the number of
repetitions with the optional {repeat} keyword argument. For example,
``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
This function is equivalent to the following code, except that the
actual implementation does not build up intermediate results in memory:: >
    def product(*args, **kwds):
        # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
        # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
        pools = map(tuple, args) * kwds.get('repeat', 1)
        result = [[]]
        for pool in pools:
            result = [x+[y] for x in result for y in pool]
        for prod in result:
            yield tuple(prod)
<
.. versionadded:: 2.6
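The equivalence with nested for-loops and the odometer-like ordering can be checked with a short sketch; the inputs are arbitrary.

```python
from itertools import product

nested = [(x, y) for x in 'AB' for y in (1, 2)]
via_product = list(product('AB', (1, 2)))

# repeat=3 is the same as passing the iterable three times; the
# rightmost position advances fastest, like an odometer.
bits = list(product(range(2), repeat=3))
same = list(product(range(2), range(2), range(2)))
```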
repeat(object[, times])~
Make an iterator that returns {object} over and over again. Runs indefinitely
unless the {times} argument is specified. Used as argument to imap for
invariant function parameters. Also used with izip to create constant
fields in a tuple record. Equivalent to:: >
    def repeat(object, times=None):
        # repeat(10, 3) --> 10 10 10
        if times is None:
            while True:
                yield object
        else:
            for i in xrange(times):
                yield object
<
starmap(function, iterable)~
Make an iterator that computes the function using arguments obtained from
the iterable. Used instead of imap when argument parameters are already
grouped in tuples from a single iterable (the data has been "pre-zipped"). The
difference between imap and starmap parallels the distinction
between ``function(a,b)`` and ``function(*c)``. Equivalent to:: >
    def starmap(function, iterable):
        # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
        for args in iterable:
            yield function(*args)
<
.. versionchanged:: 2.6
Previously, starmap required the function arguments to be tuples.
Now, any iterable is allowed.
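A minimal sketch of the "pre-zipped" case: each item of the input already holds the complete argument list for one call. The sample data is arbitrary.

```python
from itertools import starmap

pairs = [(2, 5), (3, 2), (10, 3)]        # arguments already grouped
powers = list(starmap(pow, pairs))       # calls pow(2, 5), pow(3, 2), ...

# Items need not be tuples; any iterable of arguments is accepted.
from_lists = list(starmap(max, [[1, 7, 3], [4, 2]]))
```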
takewhile(predicate, iterable)~
Make an iterator that returns elements from the iterable as long as the
predicate is true. Equivalent to:: >
    def takewhile(predicate, iterable):
        # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
        for x in iterable:
            if predicate(x):
                yield x
            else:
                break
<
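One consequence of the equivalent code above is worth a sketch: the first element that fails the predicate has already been taken from the underlying iterator, so it is not available afterwards. The input values are arbitrary.

```python
from itertools import takewhile

it = iter([1, 4, 6, 4, 1])
head = list(takewhile(lambda x: x < 5, it))

# The 6 was fetched in order to test it, so it is gone from 'it'.
rest = list(it)
```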
tee(iterable[, n=2])~
Return {n} independent iterators from a single iterable. Equivalent to:: >
    def tee(iterable, n=2):
        it = iter(iterable)
        deques = [collections.deque() for i in range(n)]
        def gen(mydeque):
            while True:
                if not mydeque:             # when the local deque is empty
                    newval = next(it)       # fetch a new value and
                    for d in deques:        # load it to all the deques
                        d.append(newval)
                yield mydeque.popleft()
        return tuple(gen(d) for d in deques)
<
Once tee has made a split, the original {iterable} should not be
used anywhere else; otherwise, the {iterable} could get advanced without
the tee objects being informed.
This itertool may require significant auxiliary storage (depending on how
much temporary data needs to be stored). In general, if one iterator uses
most or all of the data before another iterator starts, it is faster to use
list instead of tee.
.. versionadded:: 2.4
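A short sketch of two tee iterators advancing at different rates over one source. After the split, only the tee iterators are used, as advised above.

```python
from itertools import tee

source = iter([1, 2, 3, 4])
a, b = tee(source)       # from here on, 'source' itself is left alone

next(b)                  # advance b one step ahead of a
pairs = list(zip(a, b))  # a still starts from the first element
```

Here the two iterators stay one element apart, so tee only has to hold a single pending value at any time.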
Recipes
-------
This section shows recipes for creating an extended toolset using the existing
itertools as building blocks.
The extended tools offer the same high performance as the underlying toolset.
The superior memory performance is kept by processing elements one at a time
rather than bringing the whole iterable into memory all at once. Code volume is
kept small by linking the tools together in a functional style which helps
eliminate temporary variables. High speed is retained by preferring
"vectorized" building blocks over the use of for-loops and generators
which incur interpreter overhead.
.. testcode::
    def take(n, iterable):
        "Return first n items of the iterable as a list"
        return list(islice(iterable, n))

    def tabulate(function, start=0):
        "Return function(0), function(1), ..."
        return imap(function, count(start))

    def consume(iterator, n):
        "Advance the iterator n-steps ahead. If n is None, consume entirely."
        # Use functions that consume iterators at C speed.
        if n is None:
            # feed the entire iterator into a zero-length deque
            collections.deque(iterator, maxlen=0)
        else:
            # advance to the empty slice starting at position n
            next(islice(iterator, n, n), None)

    def nth(iterable, n, default=None):
        "Returns the nth item or a default value"
        return next(islice(iterable, n, None), default)

    def quantify(iterable, pred=bool):
        "Count how many times the predicate is true"
        return sum(imap(pred, iterable))

    def padnone(iterable):
        """Returns the sequence elements and then returns None indefinitely.

        Useful for emulating the behavior of the built-in map() function.
        """
        return chain(iterable, repeat(None))

    def ncycles(iterable, n):
        "Returns the sequence elements n times"
        return chain.from_iterable(repeat(tuple(iterable), n))

    def dotproduct(vec1, vec2):
        return sum(imap(operator.mul, vec1, vec2))

    def flatten(listOfLists):
        "Flatten one level of nesting"
        return chain.from_iterable(listOfLists)

    def repeatfunc(func, times=None, *args):
        """Repeat calls to func with specified arguments.

        Example:  repeatfunc(random.random)
        """
        if times is None:
            return starmap(func, repeat(args))
        return starmap(func, repeat(args, times))

    def pairwise(iterable):
        "s -> (s0,s1), (s1,s2), (s2, s3), ..."
        a, b = tee(iterable)
        next(b, None)
        return izip(a, b)

    def grouper(n, iterable, fillvalue=None):
        "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
        args = [iter(iterable)] * n
        return izip_longest(fillvalue=fillvalue, *args)

    def roundrobin(*iterables):
        "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
        # Recipe credited to George Sakkis
        pending = len(iterables)
        nexts = cycle(iter(it).next for it in iterables)
        while pending:
            try:
                for next in nexts:
                    yield next()
            except StopIteration:
                pending -= 1
                nexts = cycle(islice(nexts, pending))

    def powerset(iterable):
        "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
        s = list(iterable)
        return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

    def unique_everseen(iterable, key=None):
        "List unique elements, preserving order. Remember all elements ever seen."
        # unique_everseen('AAAABBBCCDAABBB') --> A B C D
        # unique_everseen('ABBCcAD', str.lower) --> A B C D
        seen = set()
        seen_add = seen.add
        if key is None:
            for element in ifilterfalse(seen.__contains__, iterable):
                seen_add(element)
                yield element
        else:
            for element in iterable:
                k = key(element)
                if k not in seen:
                    seen_add(k)
                    yield element

    def unique_justseen(iterable, key=None):
        "List unique elements, preserving order. Remember only the element just seen."
        # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
        # unique_justseen('ABBCcAD', str.lower) --> A B C A D
        return imap(next, imap(itemgetter(1), groupby(iterable, key)))

    def iter_except(func, exception, first=None):
        """ Call a function repeatedly until an exception is raised.

        Converts a call-until-exception interface to an iterator interface.
        Like __builtin__.iter(func, sentinel) but uses an exception instead
        of a sentinel to end the loop.

        Examples:
            bsddbiter = iter_except(db.next, bsddb.error, db.first)
            heapiter = iter_except(functools.partial(heappop, h), IndexError)
            dictiter = iter_except(d.popitem, KeyError)
            dequeiter = iter_except(d.popleft, IndexError)
            queueiter = iter_except(q.get_nowait, Queue.Empty)
            setiter = iter_except(s.pop, KeyError)
        """
        try:
            if first is not None:
                yield first()
            while 1:
                yield func()
        except exception:
            pass

    def random_product(*args, **kwds):
        "Random selection from itertools.product(*args, **kwds)"
        pools = map(tuple, args) * kwds.get('repeat', 1)
        return tuple(random.choice(pool) for pool in pools)

    def random_permutation(iterable, r=None):
        "Random selection from itertools.permutations(iterable, r)"
        pool = tuple(iterable)
        r = len(pool) if r is None else r
        return tuple(random.sample(pool, r))

    def random_combination(iterable, r):
        "Random selection from itertools.combinations(iterable, r)"
        pool = tuple(iterable)
        n = len(pool)
        indices = sorted(random.sample(xrange(n), r))
        return tuple(pool[i] for i in indices)

    def random_combination_with_replacement(iterable, r):
        "Random selection from itertools.combinations_with_replacement(iterable, r)"
        pool = tuple(iterable)
        n = len(pool)
        indices = sorted(random.randrange(n) for i in xrange(r))
        return tuple(pool[i] for i in indices)
Note, many of the above recipes can be optimized by replacing global lookups
with local variables defined as default values. For example, the
{dotproduct} recipe can be written as:: >
    def dotproduct(vec1, vec2, sum=sum, imap=imap, mul=operator.mul):
        return sum(imap(mul, vec1, vec2))
==============================================================================
*py2stdlib-icopen*
icopen~
:platform: Mac
:synopsis: Internet Config replacement for open().
:deprecated:
Importing icopen (|py2stdlib-icopen|) will replace the built-in open with a version
that uses Internet Config to set file type and creator for new files.
2.6~
macerrors (|py2stdlib-macerrors|) --- Mac OS Errors
----------------------------------
==============================================================================
*py2stdlib-jpeg*
jpeg~
:platform: IRIX
:synopsis: Read and write image files in compressed JPEG format.
:deprecated:
2.6~
The jpeg (|py2stdlib-jpeg|) module has been deprecated for removal in Python 3.0.
The module jpeg (|py2stdlib-jpeg|) provides access to the jpeg compressor and decompressor
written by the Independent JPEG Group (IJG). JPEG is a standard for compressing
pictures; it is defined in ISO 10918. For details on JPEG or the Independent
JPEG Group software refer to the JPEG standard or the documentation provided
with the software.
A portable interface to JPEG image files is available with the Python Imaging
Library (PIL) by Fredrik Lundh. Information on PIL is available at
http://www.pythonware.com/products/pil/.
The jpeg (|py2stdlib-jpeg|) module defines an exception and some functions.
error~
Exception raised by compress and decompress in case of errors.
compress(data, w, h, b)~
Treat data as a pixmap of width {w} and height {h}, with {b} bytes per pixel.
The data is in SGI GL order, so the first pixel is in the lower-left corner.
This means that data returned by gl.lrectread can be passed directly to
compress. Currently only 1-byte and 4-byte pixels are allowed; the
former are treated as greyscale and the latter as RGB color. compress
returns a string that contains the compressed picture, in JFIF format.
decompress(data)~
{data} is a string containing a picture in JFIF format. decompress returns a
tuple ``(data, width, height, bytesperpixel)``. Again, the data is suitable to
pass to gl.lrectwrite.
setoption(name, value)~
Set various options. Subsequent compress and decompress calls
will use these options. The following options are available:
+-----------------+---------------------------------------------+
| Option          | Effect                                      |
+=================+=============================================+
| ``'forcegray'`` | Force output to be grayscale, even if input |
|                 | is RGB.                                     |
+-----------------+---------------------------------------------+
| ``'quality'``   | Set the quality of the compressed image to  |
|                 | a value between ``0`` and ``100`` (default  |
|                 | is ``75``). This only affects compression.  |
+-----------------+---------------------------------------------+
| ``'optimize'``  | Perform Huffman table optimization. Takes   |
|                 | longer, but results in smaller compressed   |
|                 | image. This only affects compression.       |
+-----------------+---------------------------------------------+
| ``'smooth'``    | Perform inter-block smoothing on            |
|                 | uncompressed image. Only useful for low-    |
|                 | quality images. This only affects           |
|                 | decompression.                              |
+-----------------+---------------------------------------------+
.. seealso::
JPEG Still Image Data Compression Standard
The canonical reference for the JPEG image format, by Pennebaker and Mitchell.
`Information Technology - Digital Compression and Coding of Continuous-tone Still Images - Requirements and Guidelines <http://www.w3.org/Graphics/JPEG/itu-t81.pdf>`_
The ISO standard for JPEG is also published as ITU T.81. This is available
online in PDF form.
==============================================================================
*py2stdlib-json*
json~
:synopsis: Encode and decode the JSON format.
.. versionadded:: 2.6
JSON (JavaScript Object Notation) <http://json.org> is a subset of JavaScript
syntax (ECMA-262 3rd edition) used as a lightweight data interchange format.
json (|py2stdlib-json|) exposes an API familiar to users of the standard library
marshal (|py2stdlib-marshal|) and pickle (|py2stdlib-pickle|) modules.
Encoding basic Python object hierarchies:: >
>>> import json
>>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
'["foo", {"bar": ["baz", null, 1.0, 2]}]'
>>> print json.dumps("\"foo\bar")
"\"foo\bar"
>>> print json.dumps(u'\u1234')
"\u1234"
>>> print json.dumps('\\')
"\\"
>>> print json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True)
{"a": 0, "b": 0, "c": 0}
>>> from StringIO import StringIO
>>> io = StringIO()
>>> json.dump(['streaming API'], io)
>>> io.getvalue()
'["streaming API"]'
<
Compact encoding:: >
    >>> import json
    >>> json.dumps([1,2,3,{'4': 5, '6': 7}], separators=(',',':'))
    '[1,2,3,{"4":5,"6":7}]'
<
Pretty printing:: >
    >>> import json
    >>> print json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4)
    {
        "4": 5,
        "6": 7
    }
<
Decoding JSON:: >
    >>> import json
    >>> json.loads('["foo", {"bar":["baz", null, 1.0, 2]}]')
    [u'foo', {u'bar': [u'baz', None, 1.0, 2]}]
    >>> json.loads('"\\"foo\\bar"')
    u'"foo\x08ar'
    >>> from StringIO import StringIO
    >>> io = StringIO('["streaming API"]')
    >>> json.load(io)
    [u'streaming API']
<
Specializing JSON object decoding:: >
    >>> import json
    >>> def as_complex(dct):
    ...     if '__complex__' in dct:
    ...         return complex(dct['real'], dct['imag'])
    ...     return dct
    ...
    >>> json.loads('{"__complex__": true, "real": 1, "imag": 2}',
    ...     object_hook=as_complex)
    (1+2j)
    >>> import decimal
    >>> json.loads('1.1', parse_float=decimal.Decimal)
    Decimal('1.1')
<
Extending JSONEncoder:: >
    >>> import json
    >>> class ComplexEncoder(json.JSONEncoder):
    ...     def default(self, obj):
    ...         if isinstance(obj, complex):
    ...             return [obj.real, obj.imag]
    ...         return json.JSONEncoder.default(self, obj)
    ...
    >>> json.dumps(2 + 1j, cls=ComplexEncoder)
    '[2.0, 1.0]'
    >>> ComplexEncoder().encode(2 + 1j)
    '[2.0, 1.0]'
    >>> list(ComplexEncoder().iterencode(2 + 1j))
    ['[', '2.0', ', ', '1.0', ']']
<
Using json.tool from the shell to validate and pretty-print:: >
    $ echo '{"json":"obj"}' | python -mjson.tool
    {
        "json": "obj"
    }
    $ echo '{ 1.2:3.4}' | python -mjson.tool
    Expecting property name: line 1 column 2 (char 2)
<
.. note::
The JSON produced by this module's default settings is a subset of
YAML, so it may be used as a serializer for that as well.
Basic Usage
-----------
dump(obj, fp[, skipkeys[, ensure_ascii[, check_circular[, allow_nan[, cls[, indent[, separators[, encoding[, default[, **kw]]]]]]]]]])~
Serialize {obj} as a JSON formatted stream to {fp} (a ``.write()``-supporting
file-like object).
If {skipkeys} is ``True`` (default: ``False``), then dict keys that are not
of a basic type (str, unicode, int, long,
float, bool, ``None``) will be skipped instead of raising a
TypeError.
If {ensure_ascii} is ``False`` (default: ``True``), then some chunks written
to {fp} may be unicode instances, subject to normal Python
str to unicode coercion rules. Unless ``fp.write()``
explicitly understands unicode (as in codecs.getwriter) this
is likely to cause an error.
If {check_circular} is ``False`` (default: ``True``), then the circular
reference check for container types will be skipped and a circular reference
will result in an OverflowError (or worse).
If {allow_nan} is ``False`` (default: ``True``), then it will be a
ValueError to serialize out of range float values (``nan``,
``inf``, ``-inf``) in strict compliance of the JSON specification, instead of
using the JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``).
If {indent} is a non-negative integer, then JSON array elements and object
members will be pretty-printed with that indent level. An indent level of 0
will only insert newlines. ``None`` (the default) selects the most compact
representation.
If {separators} is an ``(item_separator, dict_separator)`` tuple, then it
will be used instead of the default ``(', ', ': ')`` separators. ``(',',
':')`` is the most compact JSON representation.
{encoding} is the character encoding for str instances; the default is UTF-8.
{default(obj)} is a function that should return a serializable version of
{obj} or raise TypeError. The default simply raises TypeError.
To use a custom JSONEncoder subclass (e.g. one that overrides the
default method to serialize additional types), specify it with the
{cls} kwarg.
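A brief sketch of three of the keyword arguments described above. It uses
dumps, which according to this document takes the same arguments as dump. The
encode_extra helper and the sample record are made up for the illustration.

```python
import datetime
import json

def encode_extra(obj):
    # Hypothetical helper passed as ``default``: dates become ISO strings.
    if isinstance(obj, datetime.date):
        return obj.isoformat()
    raise TypeError(repr(obj) + " is not JSON serializable")

record = {"name": "release", "day": datetime.date(2010, 9, 1)}
with_default = json.dumps(record, default=encode_extra, sort_keys=True)

# skipkeys=True silently drops the tuple key instead of raising TypeError.
skipped = json.dumps({"a": 1, (1, 2): "dropped"}, skipkeys=True)

# allow_nan=False rejects values that strict JSON cannot represent.
try:
    json.dumps(float("nan"), allow_nan=False)
    nan_rejected = False
except ValueError:
    nan_rejected = True

print(with_default)   # {"day": "2010-09-01", "name": "release"}
print(skipped)        # {"a": 1}
print(nan_rejected)   # True
```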
dumps(obj[, skipkeys[, ensure_ascii[, check_circular[, allow_nan[, cls[, indent[, separators[, encoding[, default[, **kw]]]]]]]]]])~
Serialize {obj} to a JSON formatted str.
If {ensure_ascii} is ``False``, then the return value will be a
unicode instance. The other arguments have the same meaning as in
dump.
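To make the effect of {ensure_ascii} concrete, here is a small sketch with a
made-up sample string. With the default setting the non-ASCII character is
written as an escape sequence; with ``ensure_ascii=False`` it is kept as-is.

```python
import json

text = u"caf\u00e9"

escaped = json.dumps(text)                   # non-ASCII escaped as \uXXXX
raw = json.dumps(text, ensure_ascii=False)   # character kept as-is

print(repr(escaped))
print(len(raw))   # 6: four characters plus the two quote characters

# Both forms decode back to the same text.
print(json.loads(escaped) == json.loads(raw))   # True
```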
load(fp[, encoding[, cls[, object_hook[, parse_float[, parse_int[, parse_constant[, object_pairs_hook[, **kw]]]]]]]])~
Deserialize {fp} (a ``.read()``-supporting file-like object containing a JSON
document) to a Python object.
If the contents of {fp} are encoded with an ASCII based encoding other than
UTF-8 (e.g. latin-1), then an appropriate {encoding} name must be specified.
Encodings that are not ASCII based (such as UCS-2) are not allowed, and
should be wrapped with ``codecs.getreader(encoding)(fp)``, or simply decoded
to a unicode object and passed to loads.
{object_hook} is an optional function that will be called with the result of
any object literal decoded (a dict). The return value of
{object_hook} will be used instead of the dict. This feature can be used
to implement custom decoders (e.g. JSON-RPC class hinting).
{object_pairs_hook} is an optional function that will be called with the
result of any object literal decoded with an ordered list of pairs. The
return value of {object_pairs_hook} will be used instead of the
dict. This feature can be used to implement custom decoders that
rely on the order that the key and value pairs are decoded (for example,
collections.OrderedDict will remember the order of insertion). If
{object_hook} is also defined, the {object_pairs_hook} takes priority.
.. versionchanged:: 2.7
Added support for {object_pairs_hook}.
{parse_float}, if specified, will be called with the string of every JSON
float to be decoded. By default, this is equivalent to ``float(num_str)``.
This can be used to use another datatype or parser for JSON floats
(e.g. decimal.Decimal).
{parse_int}, if specified, will be called with the string of every JSON int
to be decoded. By default, this is equivalent to ``int(num_str)``. This can
be used to use another datatype or parser for JSON integers
(e.g. float).
{parse_constant}, if specified, will be called with one of the following
strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``. This can be used to
raise an exception if invalid JSON numbers are encountered. As of Python 2.7
it is no longer called for ``'null'``, ``'true'`` or ``'false'``.
To use a custom JSONDecoder subclass, specify it with the ``cls``
kwarg. Additional keyword arguments will be passed to the constructor of the
class.
loads(s[, encoding[, cls[, object_hook[, parse_float[, parse_int[, parse_constant[, object_pairs_hook[, **kw]]]]]]]])~
Deserialize {s} (a str or unicode instance containing a JSON
document) to a Python object.
If {s} is a str instance and is encoded with an ASCII based encoding
other than UTF-8 (e.g. latin-1), then an appropriate {encoding} name must be
specified. Encodings that are not ASCII based (such as UCS-2) are not
allowed and should be decoded to unicode first.
The other arguments have the same meaning as in load.
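Two of the decoding hooks described for load can be sketched as follows. The
input strings and the reject_constant helper are made up for the
illustration; the first call keeps the key order of the document, the second
refuses the non-standard ``NaN`` constant.

```python
import json
from collections import OrderedDict

text = '{"zebra": 1, "apple": 2, "mango": 3}'

# object_pairs_hook receives the key/value pairs in document order.
ordered = json.loads(text, object_pairs_hook=OrderedDict)
key_order = list(ordered.keys())

def reject_constant(name):
    # Hypothetical helper: refuse NaN, Infinity and -Infinity.
    raise ValueError("non-standard JSON constant: " + name)

try:
    json.loads("[1.0, NaN]", parse_constant=reject_constant)
    rejected = False
except ValueError:
    rejected = True

print(key_order)
print(rejected)   # True
```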
Encoders and decoders
---------------------
JSONDecoder([encoding[, object_hook[, parse_float[, parse_int[, parse_constant[, strict[, object_pairs_hook]]]]]]])~
Simple JSON decoder.
Performs the following translations in decoding by default:
+---------------+-------------------+
| JSON          | Python            |
+===============+===================+
| object        | dict              |
+---------------+-------------------+
| array         | list              |
+---------------+-------------------+
| string        | unicode           |
+---------------+-------------------+
| number (int)  | int, long         |
+---------------+-------------------+
| number (real) | float             |
+---------------+-------------------+
| true          | True              |
+---------------+-------------------+
| false         | False             |
+---------------+-------------------+
| null          | None              |
+---------------+-------------------+
It also understands ``NaN``, ``Infinity``, and ``-Infinity`` as their
corresponding ``float`` values, which is outside the JSON spec.
{encoding} determines the encoding used to interpret any str objects
decoded by this instance (UTF-8 by default). It has no effect when decoding
unicode objects.
Note that currently only encodings that are a superset of ASCII work; strings
in other encodings should be passed in as unicode.
{object_hook}, if specified, will be called with the result of every JSON
object decoded and its return value will be used in place of the given
dict. This can be used to provide custom deserializations (e.g. to
support JSON-RPC class hinting).
{object_pairs_hook}, if specified, will be called with the result of every
JSON object decoded with an ordered list of pairs. The return value of
{object_pairs_hook} will be used instead of the dict. This
feature can be used to implement custom decoders that rely on the order
that the key and value pairs are decoded (for example,
collections.OrderedDict will remember the order of insertion). If
{object_hook} is also defined, the {object_pairs_hook} takes priority.
.. versionchanged:: 2.7
Added support for {object_pairs_hook}.
{parse_float}, if specified, will be called with the string of every JSON
float to be decoded. By default, this is equivalent to ``float(num_str)``.
This can be used to use another datatype or parser for JSON floats
(e.g. decimal.Decimal).
{parse_int}, if specified, will be called with the string of every JSON int
to be decoded. By default, this is equivalent to ``int(num_str)``. This can
be used to use another datatype or parser for JSON integers
(e.g. float).
{parse_constant}, if specified, will be called with one of the following
strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``. This can be used to
raise an exception if invalid JSON numbers are encountered. As of Python 2.7
it is no longer called for ``'null'``, ``'true'`` or ``'false'``.
decode(s)~
Return the Python representation of {s} (a str or
unicode instance containing a JSON document).
raw_decode(s)~
Decode a JSON document from {s} (a str or unicode
beginning with a JSON document) and return a 2-tuple of the Python
representation and the index in {s} where the document ended.
This can be used to decode a JSON document from a string that may have
extraneous data at the end.
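The difference between decode and raw_decode can be sketched as follows,
using a made-up input string that has extra text after the JSON document.

```python
import json

decoder = json.JSONDecoder()
text = '{"id": 7} trailing text'

# raw_decode stops at the end of the first document and reports where.
obj, end = decoder.raw_decode(text)
remainder = text[end:]

# decode insists that the whole string is one document.
try:
    decoder.decode(text)
    strict_ok = True
except ValueError:
    strict_ok = False

print(obj)
print(end)         # 9
print(remainder)
print(strict_ok)   # False
```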
JSONEncoder([skipkeys[, ensure_ascii[, check_circular[, allow_nan[, sort_keys[, indent[, separators[, encoding[, default]]]]]]]]])~
Extensible JSON encoder for Python data structures.
Supports the following objects and types by default:
+-------------------+---------------+
| Python            | JSON          |
+===================+===============+
| dict              | object        |
+-------------------+---------------+
| list, tuple       | array         |
+-------------------+---------------+
| str, unicode      | string        |
+-------------------+---------------+
| int, long, float  | number        |
+-------------------+---------------+
| True              | true          |
+-------------------+---------------+
| False             | false         |
+-------------------+---------------+
| None              | null          |
+-------------------+---------------+
To extend this to recognize other objects, subclass and implement a
default method that returns a serializable object for ``o`` if
possible; otherwise it should call the superclass implementation
(to raise TypeError).
If {skipkeys} is ``False`` (the default), then it is a TypeError to
attempt encoding of keys that are not str, int, long, float or None. If
{skipkeys} is ``True``, such items are simply skipped.
If {ensure_ascii} is ``True`` (the default), the output is guaranteed to be
str objects with all incoming unicode characters escaped. If
{ensure_ascii} is ``False``, the output will be a unicode object.
If {check_circular} is ``True`` (the default), then lists, dicts, and custom
encoded objects will be checked for circular references during encoding to
prevent an infinite recursion (which would cause an OverflowError).
Otherwise, no such check takes place.
If {allow_nan} is ``True`` (the default), then ``NaN``, ``Infinity``, and
``-Infinity`` will be encoded as such. This behavior is not JSON
specification compliant, but is consistent with most JavaScript based
encoders and decoders. Otherwise, it will be a ValueError to encode
such floats.
If {sort_keys} is ``True`` (default: ``False``), then the output of
dictionaries will be sorted by key; this is useful for regression tests to
ensure that JSON serializations can be compared on a day-to-day basis.
If {indent} is a non-negative integer (it is ``None`` by default), then JSON
array elements and object members will be pretty-printed with that indent
level. An indent level of 0 will only insert newlines. ``None`` is the most
compact representation.
If specified, {separators} should be an ``(item_separator, key_separator)``
tuple. The default is ``(', ', ': ')``. To get the most compact JSON
representation, you should specify ``(',', ':')`` to eliminate whitespace.
If specified, {default} is a function that gets called for objects that can't
otherwise be serialized. It should return a JSON encodable version of the
object or raise a TypeError.
If {encoding} is not ``None``, then all input strings will be transformed
into unicode using that encoding prior to JSON-encoding. The default is
UTF-8.
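As a short illustration of the {sort_keys} and {separators} options described
above, the sketch below encodes the same made-up dictionary with the default
separators and with the compact ones.

```python
import json

data = {"b": 1, "a": [1, 2]}

# Sorted keys with the default (', ', ': ') separators.
default_style = json.JSONEncoder(sort_keys=True).encode(data)

# Sorted keys with the most compact separators.
compact = json.JSONEncoder(sort_keys=True, separators=(",", ":")).encode(data)

print(default_style)  # {"a": [1, 2], "b": 1}
print(compact)        # {"a":[1,2],"b":1}
```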
default(o)~
Implement this method in a subclass such that it returns a serializable
object for {o}, or calls the base implementation (to raise a
TypeError).
For example, to support arbitrary iterators, you could implement default
like this:: >
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        return JSONEncoder.default(self, o)
<
encode(o)~
Return a JSON string representation of a Python data structure, {o}. For
example:: >
>>> JSONEncoder().encode({"foo": ["bar", "baz"]})
'{"foo": ["bar", "baz"]}'
<
iterencode(o)~
Encode the given object, {o}, and yield each string representation as
available. For example:: >
    for chunk in JSONEncoder().iterencode(bigobject):
        mysocket.write(chunk)
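A self-contained variant of the loop above, as a sketch: instead of a socket,
the chunks are collected in a list (a stand-in chosen for this illustration),
which shows that joining them gives the same text as encode.

```python
import json

encoder = json.JSONEncoder()
document = {"items": [1, 2, 3]}

# Collect the chunks instead of writing them to a socket.
chunks = []
for chunk in encoder.iterencode(document):
    chunks.append(chunk)

joined = "".join(chunks)
print(joined)                               # {"items": [1, 2, 3]}
print(joined == encoder.encode(document))   # True
```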
==============================================================================
*py2stdlib-keyword*
keyword~
:synopsis: Test whether a string is a keyword in Python.
This module allows a Python program to determine if a string is a keyword.
iskeyword(s)~
Return true if {s} is a Python keyword.
kwlist~
Sequence containing all the keywords defined for the interpreter. If any
keywords are defined to only be active when particular __future__ (|py2stdlib-__future__|)
statements are in effect, these will be included as well.
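A minimal usage sketch; the candidate names are made up for the illustration.
It filters out the names that cannot be used as identifiers because they are
keywords.

```python
import keyword

candidates = ["class", "klass", "lambda", "value"]

# Keep only the names that are not Python keywords.
usable = [name for name in candidates if not keyword.iskeyword(name)]

print(usable)                      # ['klass', 'value']
print("while" in keyword.kwlist)   # True
```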
==============================================================================
*py2stdlib-lib2to3*
lib2to3~
:synopsis: the 2to3 library
.. note::
The lib2to3 (|py2stdlib-lib2to3|) API should be considered unstable and may change
drastically in the future.
.. XXX What is the public interface anyway?
==============================================================================
*py2stdlib-linecache*
linecache~
:synopsis: This module provides random access to individual lines from text files.
The linecache (|py2stdlib-linecache|) module allows one to get any line from any file, while
attempting to optimize internally, using a cache, the common case where many
lines are read from a single file. This is used by the traceback (|py2stdlib-traceback|) module
to retrieve source lines for inclusion in the formatted traceback.
The linecache (|py2stdlib-linecache|) module defines the following functions:
getline(filename, lineno[, module_globals])~
Get line {lineno} from file named {filename}. This function never raises an
exception; it returns ``''`` on errors. The terminating newline character
is included for lines that are found.
If a file named {filename} is not found, the function will look for it in the
module search path, ``sys.path``, after first checking for a PEP 302
``__loader__`` in {module_globals}, in case the module was imported from a
zipfile or other non-filesystem import source.
.. versionadded:: 2.5
The {module_globals} parameter was added.
clearcache()~
Clear the cache. Use this function if you no longer need lines from files
previously read using getline.
checkcache([filename])~
Check the cache for validity. Use this function if files in the cache may have
changed on disk, and you require the updated version. If {filename} is omitted,
it will check all the entries in the cache.
Example:: >
>>> import linecache
>>> linecache.getline('/etc/passwd', 4)
'sys:x:3:3:sys:/dev:/bin/sh\n'
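The following sketch exercises all three functions on a throwaway file, so it
does not depend on ``/etc/passwd`` being present. The file contents are made
up, and the path is generated by tempfile. The rewritten file deliberately has
a different size so that checkcache is certain to notice the change.

```python
import linecache
import os
import tempfile

# Create a throwaway two-line file.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    handle.write("first line\nsecond line\n")

try:
    second = linecache.getline(path, 2)     # the newline is included
    missing = linecache.getline(path, 99)   # '' instead of an exception

    # Rewrite the file with content of a different size.
    with open(path, "w") as handle:
        handle.write("replacement line\n")

    stale = linecache.getline(path, 1)      # still served from the cache
    linecache.checkcache(path)              # notices the change on disk
    fresh = linecache.getline(path, 1)
finally:
    linecache.clearcache()
    os.remove(path)

print(repr(second))
print(repr(missing))
print(repr(stale))
print(repr(fresh))
```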
==============================================================================
*py2stdlib-locale*
locale~
:synopsis: Internationalization services.
The locale (|py2stdlib-locale|) module opens access to the POSIX locale database and
functionality. The POSIX locale mechanism allows programmers to deal with
certain cultural issues in an application, without requiring the programmer to
know all the specifics of each country where the software is executed.
The locale (|py2stdlib-locale|) module is implemented on top of the _locale module,
which in turn uses an ANSI C locale implementation if available.
The locale (|py2stdlib-locale|) module defines the following exception and functions:
Error~
Exception raised when setlocale fails.
setlocale(category[, locale])~
If {locale} is specified, it may be a string, a tuple of the form ``(language
code, encoding)``, or ``None``. If it is a tuple, it is converted to a string
using the locale aliasing engine. If {locale} is given and not ``None``,
setlocale modifies the locale setting for the {category}. The available
categories are listed in the data description below. The value is the name of a
locale. An empty string specifies the user's default settings. If the
modification of the locale fails, the exception Error is raised. If
successful, the new locale setting is returned.
If {locale} is omitted or ``None``, the current setting for {category} is
returned.
setlocale is not thread safe on most systems. Applications typically
start with a call of :: >
import locale
locale.setlocale(locale.LC_ALL, '')
<
This sets the locale for all categories to the user's default setting (typically
specified in the LANG environment variable). If the locale is not
changed thereafter, using multithreading should not cause problems.
.. versionchanged:: 2.0
Added support for tuple values of the {locale} parameter.
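The setting, querying and error cases described above can be sketched as
follows. Falling back to the ``"C"`` locale when the user's default cannot be
set is a choice made for this illustration, not something the module does on
its own.

```python
import locale

# Passing a locale name changes the setting and returns the new value.
current = locale.setlocale(locale.LC_ALL, "C")

# Omitting the locale argument only queries the current setting.
queried = locale.setlocale(locale.LC_NUMERIC)

# Try the user's default settings; fall back to "C" if that fails.
try:
    chosen = locale.setlocale(locale.LC_ALL, "")
except locale.Error:
    chosen = locale.setlocale(locale.LC_ALL, "C")

print(current)
print(queried)
print(chosen)
```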
localeconv()~
Returns the database of the local conventions as a dictionary. This dictionary
has the following strings as keys:
+----------------------+-------------------------------------+--------------------------------+
| Category             | Key                                 | Meaning                        |
+======================+=====================================+================================+
| LC_NUMERIC           | ``'decimal_point'``                 | Decimal point character.       |
+----------------------+-------------------------------------+--------------------------------+
|                      | ``'grouping'``                      | Sequence of numbers specifying |
|                      |                                     | which relative positions the   |
|                      |                                     | ``'thousands_sep'`` is         |
|                      |                                     | expected. If the sequence is   |
|                      |                                     | terminated with                |
|                      |                                     | CHAR_MAX, no further           |
|                      |                                     | grouping is performed. If the  |
|                      |                                     | sequence terminates with a     |
|                      |                                     | ``0``, the last group size is  |
|                      |                                     | repeatedly used.               |
+----------------------+-------------------------------------+--------------------------------+
|                      | ``'thousands_sep'``                 | Character used between groups. |
+----------------------+-------------------------------------+--------------------------------+
| LC_MONETARY          | ``'int_curr_symbol'``               | International currency symbol. |
+----------------------+-------------------------------------+--------------------------------+
|                      | ``'currency_symbol'``               | Local currency symbol.         |
+----------------------+-------------------------------------+--------------------------------+
|                      | ``'p_cs_precedes/n_cs_precedes'``   | Whether the currency symbol    |
|                      |                                     | precedes the value (for        |
|                      |                                     | positive resp. negative        |
|                      |                                     | values).                       |
+----------------------+-------------------------------------+--------------------------------+
|                      | ``'p_sep_by_space/n_sep_by_space'`` | Whether the currency symbol is |
|                      |                                     | separated from the value by a  |
|                      |                                     | space (for positive resp.      |
|                      |                                     | negative values).              |
+----------------------+-------------------------------------+--------------------------------+
|                      | ``'mon_decimal_point'``             | Decimal point used for         |
|                      |                                     | monetary values.               |
+----------------------+-------------------------------------+--------------------------------+
|                      | ``'frac_digits'``                   | Number of fractional digits    |
|                      |                                     | used in local formatting of    |
|                      |                                     | monetary values.               |
+----------------------+-------------------------------------+--------------------------------+
|                      | ``'int_frac_digits'``               | Number of fractional digits    |
|                      |                                     | used in international          |
|                      |                                     | formatting of monetary values. |
+----------------------+-------------------------------------+--------------------------------+
|                      | ``'mon_thousands_sep'``             | Group separator used for       |
|                      |                                     | monetary values.               |
+----------------------+-------------------------------------+--------------------------------+
|                      | ``'mon_grouping'``                  | Equivalent to ``'grouping'``,  |
|                      |                                     | used for monetary values.      |
+----------------------+-------------------------------------+--------------------------------+
|                      | ``'positive_sign'``                 | Symbol used to annotate a      |
|                      |                                     | positive monetary value.       |
+----------------------+-------------------------------------+--------------------------------+
|                      | ``'negative_sign'``                 | Symbol used to annotate a      |
|                      |                                     | negative monetary value.       |
+----------------------+-------------------------------------+--------------------------------+
|                      | ``'p_sign_posn/n_sign_posn'``       | The position of the sign (for  |
|                      |                                     | positive resp. negative        |
|                      |                                     | values), see below.            |
+----------------------+-------------------------------------+--------------------------------+
All numeric values can be set to CHAR_MAX to indicate that there is no
value specified in this locale.
The possible values for ``'p_sign_posn'`` and ``'n_sign_posn'`` are given below.
+--------------+-----------------------------------------+
| Value | Explanation |
+==============+=========================================+
| ``0`` | Currency and value are surrounded by |
| | parentheses. |
+--------------+-----------------------------------------+
| ``1`` | The sign should precede the value and |
| | currency symbol. |
+--------------+-----------------------------------------+
| ``2`` | The sign should follow the value and |
| | currency symbol. |
+--------------+-----------------------------------------+
| ``3`` | The sign should immediately precede the |
| | value. |
+--------------+-----------------------------------------+
| ``4`` | The sign should immediately follow the |
| | value. |
+--------------+-----------------------------------------+
| ``CHAR_MAX`` | Nothing is specified in this locale. |
+--------------+-----------------------------------------+
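As a rough illustration of the table above, a small lookup can turn a sign
position code into a readable description. The helper name
``describe_sign_posn`` and the wording of the descriptions are assumptions made
for this sketch; they are not part of the locale module.

```python
import locale

# Hypothetical lookup table mirroring the documented values of
# 'p_sign_posn' and 'n_sign_posn'.
_SIGN_POSN = {
    0: "currency and value are surrounded by parentheses",
    1: "sign precedes the value and currency symbol",
    2: "sign follows the value and currency symbol",
    3: "sign immediately precedes the value",
    4: "sign immediately follows the value",
    locale.CHAR_MAX: "nothing is specified in this locale",
}


def describe_sign_posn(value):
    """Return a description for a sign position code (hypothetical helper)."""
    return _SIGN_POSN.get(value, "unknown value %r" % (value,))


# Look up the conventions of whatever locale is currently active.
conv = locale.localeconv()
positive = describe_sign_posn(conv['p_sign_posn'])
negative = describe_sign_posn(conv['n_sign_posn'])
```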
nl_langinfo(option)~
Return some locale-specific information as a string. This function is not
available on all systems, and the set of possible options might also vary
across platforms. The possible argument values are numbers, for which
symbolic constants are available in the locale module.
The nl_langinfo function accepts one of the following keys. Most
descriptions are taken from the corresponding description in the GNU C
library.
CODESET~
Get a string with the name of the character encoding used in the
selected locale.
D_T_FMT~
Get a string that can be used as a format string for strftime to
represent time and date in a locale-specific way.
D_FMT~
Get a string that can be used as a format string for strftime to
represent a date in a locale-specific way.
T_FMT~
Get a string that can be used as a format string for strftime to
represent a time in a locale-specific way.
T_FMT_AMPM~
Get a format string for strftime to represent time in the am/pm
format.
DAY_1 ... DAY_7~
Get the name of the n-th day of the week.
.. note:: >
This follows the US convention of DAY_1 being Sunday, not the
international convention (ISO 8601) that Monday is the first day of the
week.
<
ABDAY_1 ... ABDAY_7~
Get the abbreviated name of the n-th day of the week.
MON_1 ... MON_12~
Get the name of the n-th month.
ABMON_1 ... ABMON_12~
Get the abbreviated name of the n-th month.
RADIXCHAR~
Get the radix character (decimal dot, decimal comma, etc.)
THOUSEP~
Get the separator character for thousands (groups of three digits).
YESEXPR~
Get a regular expression that can be used with the regex function to
recognize a positive response to a yes/no question.
.. note:: >
The expression is in the syntax suitable for the regex function
from the C library, which might differ from the syntax used in re (|py2stdlib-re|).
<
NOEXPR~
Get a regular expression that can be used with the regex(3) function to
recognize a negative response to a yes/no question.
CRNCYSTR~
Get the currency symbol, preceded by "-" if the symbol should appear before
the value, "+" if the symbol should appear after the value, or "." if the
symbol should replace the radix character.
ERA~
Get a string that represents the era used in the current locale.
Most locales do not define this value. An example of a locale which does
define this value is the Japanese one. In Japan, the traditional
representation of dates includes the name of the era corresponding to the
then-emperor's reign.
Normally it should not be necessary to use this value directly. Specifying
the ``E`` modifier in a format string causes the strftime
function to use this information. The format of the returned string is not
specified, and therefore you should not assume knowledge of it on different
systems.
ERA_YEAR~
Get the year in the relevant era of the locale.
ERA_D_T_FMT~
Get a format string for strftime to represent dates and times in a
locale-specific era-based way.
ERA_D_FMT~
Get a format string for strftime to represent a date in a
locale-specific era-based way.
ALT_DIGITS~
Get a representation of up to 100 values used to represent the values
0 to 99.
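Because nl_langinfo and some of its keys may be missing on a given platform, a
guarded lookup is one way to query them. This is a minimal sketch; the helper
name ``langinfo_or_none`` is hypothetical.

```python
import locale


def langinfo_or_none(name):
    """Return nl_langinfo() for the constant called name, or None.

    Hypothetical helper: both the function and the individual constants may
    be missing on some platforms, so each lookup is guarded.
    """
    nl_langinfo = getattr(locale, 'nl_langinfo', None)
    option = getattr(locale, name, None)
    if nl_langinfo is None or option is None:
        return None
    return nl_langinfo(option)


codeset = langinfo_or_none('CODESET')    # name of the character encoding
date_format = langinfo_or_none('D_FMT')  # strftime format for a date
first_day = langinfo_or_none('DAY_1')    # name of Sunday in the current locale
```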
getdefaultlocale([envvars])~
Tries to determine the default locale settings and returns them as a tuple of
the form ``(language code, encoding)``.
According to POSIX, a program which has not called ``setlocale(LC_ALL, '')``
runs using the portable ``'C'`` locale. Calling ``setlocale(LC_ALL, '')`` lets
it use the default locale as defined by the LANG variable. Since we
do not want to interfere with the current locale setting we thus emulate the
behavior in the way described above.
To maintain compatibility with other platforms, not only the LANG
variable is tested, but a list of variables given as the {envvars} parameter. The
first found to be defined will be used. {envvars} defaults to the search path
used in GNU gettext; it must always contain the variable name ``LANG``. The GNU
gettext search path contains ``'LANGUAGE'``, ``'LC_ALL'``, ``'LC_CTYPE'``, and
``'LANG'``, in that order.
Except for the code ``'C'``, the language code corresponds to RFC 1766.
{language code} and {encoding} may be ``None`` if their values cannot be
determined.
.. versionadded:: 2.0
getlocale([category])~
Returns the current setting for the given locale category as a sequence containing
{language code}, {encoding}. {category} may be one of the LC_* values
except LC_ALL. It defaults to LC_CTYPE.
Except for the code ``'C'``, the language code corresponds to RFC 1766.
{language code} and {encoding} may be ``None`` if their values cannot be
determined.
.. versionadded:: 2.0
getpreferredencoding([do_setlocale])~
Return the encoding used for text data, according to user preferences. User
preferences are expressed differently on different systems, and might not be
available programmatically on some systems, so this function only returns a
guess.
On some systems, it is necessary to invoke setlocale to obtain the user
preferences, so this function is not thread-safe. If invoking setlocale is not
necessary or desired, {do_setlocale} should be set to ``False``.
.. versionadded:: 2.3
normalize(localename)~
Returns a normalized locale code for the given locale name. The returned locale
code is formatted for use with setlocale. If normalization fails, the
original name is returned unchanged.
If the given encoding is not known, the function defaults to the default
encoding for the locale code just like setlocale.
.. versionadded:: 2.0
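A short sketch of both outcomes described for normalize. The name ``'zz_ZZ'`` is
made up for the illustration and is assumed not to appear in the module's alias
table.

```python
import locale

# A name the alias table knows is expanded to a full locale code
# (language, territory and a default encoding).
expanded = locale.normalize('de_DE')

# 'zz_ZZ' is a made-up name; since it cannot be normalized, it is
# expected to come back unchanged.
unchanged = locale.normalize('zz_ZZ')
```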
resetlocale([category])~
Sets the locale for {category} to the default setting.
The default setting is determined by calling getdefaultlocale.
{category} defaults to LC_ALL.
.. versionadded:: 2.0
strcoll(string1, string2)~
Compares two strings according to the current LC_COLLATE setting. Like
any other comparison function, it returns a negative value, a positive value,
or ``0``, depending on whether {string1} collates before {string2}, after it,
or is equal to it.
strxfrm(string)~
Transforms a string to one that can be compared with the built-in function
cmp while still giving locale-aware results. This function can be used
when the same string is compared repeatedly, e.g. when collating a sequence of
strings.
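The two collation functions can be compared side by side when sorting. This
sketch assumes the portable ``'C'`` locale so that it does not depend on which
locales are installed; the word list is arbitrary.

```python
import functools
import locale

# Assumption: use the portable 'C' locale for collation.
locale.setlocale(locale.LC_COLLATE, 'C')

words = ['pear', 'Apple', 'fig', 'apple']

# Compare pairwise with strcoll ...
by_strcoll = sorted(words, key=functools.cmp_to_key(locale.strcoll))

# ... or transform each string once with strxfrm and compare the results.
by_strxfrm = sorted(words, key=locale.strxfrm)
```

With many strings, the strxfrm form transforms each string only once, which is
the case the description above has in mind.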
format(format, val[, grouping[, monetary]])~
Formats a number {val} according to the current LC_NUMERIC setting.
The format follows the conventions of the ``%`` operator. For floating point
values, the decimal point is modified if appropriate. If {grouping} is true,
also takes the grouping into account.
If {monetary} is true, the conversion uses monetary thousands separator and
grouping strings.
Please note that this function will only work for exactly one %char specifier.
For whole format strings, use format_string.
.. versionchanged:: 2.5
Added the {monetary} parameter.
format_string(format, val[, grouping])~
Processes formatting specifiers as in ``format % val``, but takes the current
locale settings into account.
.. versionadded:: 2.5
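A minimal sketch of format_string with more than one specifier. It assumes the
portable ``'C'`` locale, which defines no thousands separator, so requesting
grouping leaves the digits untouched here; other locales would insert their own
separators.

```python
import locale

# Assumption: the portable 'C' locale (no thousands separator defined).
locale.setlocale(locale.LC_ALL, 'C')

# Several specifiers are allowed; the values are passed as a tuple.
line = locale.format_string('%d items, %.2f total', (3, 1234.5), grouping=True)
```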
currency(val[, symbol[, grouping[, international]]])~
Formats a number {val} according to the current LC_MONETARY settings.
The returned string includes the currency symbol if {symbol} is true, which is
the default. If {grouping} is true (which is not the default), grouping is done
with the value. If {international} is true (which is not the default), the
international currency symbol is used.
Note that this function will not work with the 'C' locale, so you have to set a
locale via setlocale first.
.. versionadded:: 2.5
str(float)~
Formats a floating point number using the same format as the built-in function
``str(float)``, but takes the decimal point into account.
atof(string)~
Converts a string to a floating point number, following the LC_NUMERIC
settings.
atoi(string)~
Converts a string to an integer, following the LC_NUMERIC conventions.
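The conversion functions can be exercised together as follows. The sketch
assumes the portable ``'C'`` locale, where the decimal point is ``.``; under
another LC_NUMERIC setting the accepted and produced strings would differ.

```python
import locale

# Assumption: portable 'C' locale for numeric conventions.
locale.setlocale(locale.LC_NUMERIC, 'C')

number = locale.atof('1234.5')   # parsed using the locale's decimal point
count = locale.atoi('42')
text = locale.str(number)        # formatted using the locale's decimal point
```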
LC_CTYPE~
Locale category for the character type functions. Depending on the settings of
this category, the functions of module string (|py2stdlib-string|) dealing with case change
their behaviour.
LC_COLLATE~
Locale category for sorting strings. The functions strcoll and
strxfrm of the locale (|py2stdlib-locale|) module are affected.
LC_TIME~
Locale category for the formatting of time. The function time.strftime
follows these conventions.
LC_MONETARY~
Locale category for formatting of monetary values. The available options can
be obtained from the localeconv function.
LC_MESSAGES~
Locale category for message display. Python currently does not support
application specific locale-aware messages. Messages displayed by the operating
system, like those returned by os.strerror might be affected by this
category.
LC_NUMERIC~
Locale category for formatting numbers. The functions format,
atoi, atof and str of the locale (|py2stdlib-locale|) module are
affected by that category. All other numeric formatting operations are not
affected.
LC_ALL~
Combination of all locale settings. If this flag is used when the locale is
changed, setting the locale for all categories is attempted. If that fails for
any category, no category is changed at all. When the locale is retrieved using
this flag, a string indicating the setting for all categories is returned. This
string can be later used to restore the settings.
CHAR_MAX~
This is a symbolic constant used for different values returned by
localeconv.
Example:: >
>>> import locale
>>> loc = locale.getlocale() # get current locale
# use German locale; name might vary with platform
>>> locale.setlocale(locale.LC_ALL, 'de_DE')
>>> locale.strcoll('f\xe4n', 'foo') # compare a string containing an umlaut
>>> locale.setlocale(locale.LC_ALL, '') # use user's preferred locale
>>> locale.setlocale(locale.LC_ALL, 'C') # use default (C) locale
>>> locale.setlocale(locale.LC_ALL, loc) # restore saved locale
<
Background, details, hints, tips and caveats
--------------------------------------------
The C standard defines the locale as a program-wide property that may be
relatively expensive to change. On top of that, some implementations are broken
in such a way that frequent locale changes may cause core dumps. This makes the
locale somewhat painful to use correctly.
Initially, when a program is started, the locale is the ``C`` locale, no matter
what the user's preferred locale is. The program must explicitly say that it
wants the user's preferred locale settings by calling ``setlocale(LC_ALL, '')``.
It is generally a bad idea to call setlocale in some library routine,
since as a side effect it affects the entire program. Saving and restoring it
is almost as bad: it is expensive and affects other threads that happen to run
before the settings have been restored.
If, when coding a module for general use, you need a locale-independent version
of an operation that is affected by the locale (such as string.lower, or
certain formats used with time.strftime), you will have to find a way to
do it without using the standard library routine. Even better is convincing
yourself that using locale settings is okay. Only as a last resort should you
document that your module is not compatible with non-``C`` locale settings.
The case conversion functions in the string (|py2stdlib-string|) module are affected by the
locale settings. When a call to the setlocale function changes the
LC_CTYPE settings, the variables ``string.lowercase``,
``string.uppercase`` and ``string.letters`` are recalculated. Note that code
that uses these variables through 'from ... import ...',
e.g. ``from string import letters``, is not affected by subsequent
setlocale calls.
The only way to perform numeric operations according to the locale is to use the
special functions defined by this module: atof, atoi,
format, str.
For extension writers and programs that embed Python
----------------------------------------------------
Extension modules should never call setlocale, except to find out what
the current locale is. But since the return value can only be used portably to
restore it, that is not very useful (except perhaps to find out whether or not
the locale is ``C``).
When Python code uses the locale (|py2stdlib-locale|) module to change the locale, this also
affects the embedding application. If the embedding application doesn't want
this to happen, it should remove the _locale extension module (which does
all the work) from the table of built-in modules in the config.c file,
and make sure that the _locale module is not accessible as a shared
library.
Access to message catalogs
--------------------------
The locale module exposes the C library's gettext interface on systems that
provide this interface. It consists of the functions gettext (|py2stdlib-gettext|),
dgettext, dcgettext, textdomain, bindtextdomain,
and bind_textdomain_codeset. These are similar to the same functions in
the gettext (|py2stdlib-gettext|) module, but use the C library's binary format for message
catalogs, and the C library's search algorithms for locating message catalogs.
Python applications should normally find no need to invoke these functions, and
should use gettext (|py2stdlib-gettext|) instead. Known exceptions to this rule are
applications that link with additional C libraries which internally invoke
gettext (|py2stdlib-gettext|) or dcgettext. For these applications, it may be
necessary to bind the text domain, so that the libraries can properly locate
their message catalogs.
==============================================================================
*py2stdlib-logging*
logging~
:synopsis: Flexible error logging system for applications.
.. versionadded:: 2.3
This module defines functions and classes which implement a flexible error
logging system for applications.
Logging is performed by calling methods on instances of the Logger
class (hereafter called loggers). Each instance has a name, and they are
conceptually arranged in a namespace hierarchy using dots (periods) as
separators. For example, a logger named "scan" is the parent of loggers
"scan.text", "scan.html" and "scan.pdf". Logger names can be anything you want,
and indicate the area of an application in which a logged message originates.
Logged messages also have levels of importance associated with them. The default
levels provided are DEBUG, INFO, WARNING,
ERROR and CRITICAL. As a convenience, you indicate the
importance of a logged message by calling an appropriate method of
Logger. The methods are debug, info, warning,
error and critical, which mirror the default levels. You are not
constrained to use these levels: you can specify your own and use a more general
Logger method, log, which takes an explicit level argument.
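The naming hierarchy and a user-defined level can be sketched as follows. The
level name ``NOTICE`` and its value 25 are assumptions chosen for the
illustration; they are not defined by the module.

```python
import logging

parent = logging.getLogger('scan')
child = logging.getLogger('scan.text')

# The dotted name places 'scan.text' directly below 'scan'.
is_child = child.parent is parent

# Hypothetical custom level sitting between INFO (20) and WARNING (30).
NOTICE = 25
logging.addLevelName(NOTICE, 'NOTICE')
level_name = logging.getLevelName(NOTICE)
```

A message at such a level would then be logged with the general log method,
passing ``NOTICE`` as the explicit level argument.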
Logging tutorial
----------------
The key benefit of having the logging API provided by a standard library module
is that all Python modules can participate in logging, so your application log
can include messages from third-party modules.
It is, of course, possible to log messages with different verbosity levels or to
different destinations. Writing log messages to files, HTTP GET/POST locations,
email via SMTP, generic sockets, or OS-specific logging mechanisms is supported
by the standard module. You can also create your
own log destination class if you have special requirements not met by any of the
built-in classes.
Simple examples
^^^^^^^^^^^^^^^
.. (see <http://blog.doughellmann.com/2007/05/pymotw-logging.html>)
Most applications are probably going to want to log to a file, so let's start
with that case. Using the basicConfig function, we can set up the
default handler so that debug messages are written to a file (in the example,
we assume that you have the appropriate permissions to create a file called
{example.log} in the current directory):: >
import logging
LOG_FILENAME = 'example.log'
logging.basicConfig(filename=LOG_FILENAME,level=logging.DEBUG)
logging.debug('This message should go to the log file')
<
And now if we open the file and look at what we have, we should find the log
message:: >
DEBUG:root:This message should go to the log file
<
If you run the script repeatedly, the additional log messages are appended to
the file. To create a new file each time, you can pass a {filemode} argument to
basicConfig with a value of ``'w'``. Rather than managing the file size
yourself, though, it is simpler to use a RotatingFileHandler:: >
import glob
import logging
import logging.handlers
LOG_FILENAME = 'logging_rotatingfile_example.out'
# Set up a specific logger with our desired output level
my_logger = logging.getLogger('MyLogger')
my_logger.setLevel(logging.DEBUG)
# Add the log message handler to the logger
handler = logging.handlers.RotatingFileHandler(
    LOG_FILENAME, maxBytes=20, backupCount=5)
my_logger.addHandler(handler)
# Log some messages
for i in range(20):
    my_logger.debug('i = %d' % i)
# See what files are created
logfiles = glob.glob('%s*' % LOG_FILENAME)
for filename in logfiles:
    print filename
<
The result should be 6 separate files, each with part of the log history for the
application:: >
logging_rotatingfile_example.out
logging_rotatingfile_example.out.1
logging_rotatingfile_example.out.2
logging_rotatingfile_example.out.3
logging_rotatingfile_example.out.4
logging_rotatingfile_example.out.5
<
The most current file is always logging_rotatingfile_example.out,
and each time it reaches the size limit it is renamed with the suffix
``.1``. Each of the existing backup files is renamed to increment the suffix
(``.1`` becomes ``.2``, etc.), and the oldest backup, which would otherwise
become ``.6``, is erased.
This example deliberately sets the log length far too small. In practice you
would set {maxBytes} to an appropriate value.
Another useful feature of the logging API is the ability to produce different
messages at different log levels. This allows you to instrument your code with
debug messages, for example, while setting the log level so that those debug
messages are not written on your production system. The default levels are
``NOTSET``, ``DEBUG``, ``INFO``, ``WARNING``, ``ERROR`` and ``CRITICAL``.
The logger, the handler, and the logging call each specify a level. A message
is emitted only if its level is at or above the levels configured for the
logger and the handler. For example, if a message is ``CRITICAL``, and the logger
is set to ``ERROR``, the message is emitted. If a message is a ``WARNING``, and
the logger is set to produce only ``ERROR`` messages, the message is not emitted:: >
import logging
import sys
LEVELS = {'debug': logging.DEBUG,
          'info': logging.INFO,
          'warning': logging.WARNING,
          'error': logging.ERROR,
          'critical': logging.CRITICAL}
if len(sys.argv) > 1:
    level_name = sys.argv[1]
    level = LEVELS.get(level_name, logging.NOTSET)
    logging.basicConfig(level=level)
logging.debug('This is a debug message')
logging.info('This is an info message')
logging.warning('This is a warning message')
logging.error('This is an error message')
logging.critical('This is a critical error message')
<
Run the script with an argument like 'debug' or 'warning' to see which messages
show up at different levels:: >
$ python logging_level_example.py debug
DEBUG:root:This is a debug message
INFO:root:This is an info message
WARNING:root:This is a warning message
ERROR:root:This is an error message
CRITICAL:root:This is a critical error message
$ python logging_level_example.py info
INFO:root:This is an info message
WARNING:root:This is a warning message
ERROR:root:This is an error message
CRITICAL:root:This is a critical error message
<
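The threshold rule can also be checked without emitting anything, using
Logger.isEnabledFor. This is a small sketch; the logger name ``level_demo`` is
arbitrary.

```python
import logging

logger = logging.getLogger('level_demo')   # arbitrary name for the sketch
logger.setLevel(logging.ERROR)

# A message passes the logger's check when its level is at least the
# logger's threshold.
critical_passes = logger.isEnabledFor(logging.CRITICAL)
warning_passes = logger.isEnabledFor(logging.WARNING)
```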
You will notice that these log messages all have ``root`` embedded in them. The
logging module supports a hierarchy of loggers with different names. An easy
way to tell where a specific log message comes from is to use a separate logger
object for each of your modules. Each new logger "inherits" the configuration
of its parent, and log messages sent to a logger include the name of that
logger. Optionally, each logger can be configured differently, so that messages
from different modules are handled in different ways. Let's look at a simple
example of how to log from different modules so it is easy to trace the source
of the message:: >
import logging
logging.basicConfig(level=logging.WARNING)
logger1 = logging.getLogger('package1.module1')
logger2 = logging.getLogger('package2.module2')
logger1.warning('This message comes from one module')
logger2.warning('And this message comes from another module')
<
And the output:: >
$ python logging_modules_example.py
WARNING:package1.module1:This message comes from one module
WARNING:package2.module2:And this message comes from another module
<
There are many more options for configuring logging, including different log
message formatting options, having messages delivered to multiple destinations,
and changing the configuration of a long-running application on the fly using a
socket interface. All of these options are covered in depth in the library
module documentation.
Loggers
^^^^^^^
The logging library takes a modular approach and offers several categories
of components: loggers, handlers, filters, and formatters. Loggers expose the
interface that application code directly uses. Handlers send the log records to
the appropriate destination. Filters provide a finer grained facility for
determining which log records to send on to a handler. Formatters specify the
layout of the resultant log record.
Logger objects have a threefold job. First, they expose several
methods to application code so that applications can log messages at runtime.
Second, logger objects determine which log messages to act upon based upon
severity (the default filtering facility) or filter objects. Third, logger
objects pass along relevant log messages to all interested log handlers.
The most widely used methods on logger objects fall into two categories:
configuration and message sending.
* Logger.setLevel specifies the lowest-severity log message a logger
will handle, where debug is the lowest built-in severity level and critical is
the highest built-in severity. For example, if the severity level is info,
the logger will handle only info, warning, error, and critical messages and
will ignore debug messages.
* Logger.addFilter and Logger.removeFilter add and remove filter
objects from the logger object. This tutorial does not address filters.
With the logger object configured, the following methods create log messages:
* Logger.debug, Logger.info, Logger.warning,
Logger.error, and Logger.critical all create log records with
a message and a level that corresponds to their respective method names. The
message is actually a format string, which may contain the standard string
substitution syntax of %s, %d, %f, and so on. The
rest of their arguments is a list of objects that correspond with the
substitution fields in the message. With regard to ``**kwargs``, the
logging methods care only about a keyword of exc_info and use it to
determine whether to log exception information.
* Logger.exception creates a log message similar to
Logger.error. The difference is that Logger.exception dumps a
stack trace along with it. Call this method only from an exception handler.
* Logger.log takes a log level as an explicit argument. This is a
little more verbose for logging messages than using the log level convenience
methods listed above, but this is how to log at custom log levels.
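The message-sending methods above might be exercised as in the following
sketch. ``ListHandler`` is a hypothetical helper that keeps formatted messages
in a list so the result can be inspected; the logger name is arbitrary.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that keeps formatted messages in a list."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))


logger = logging.getLogger('methods_demo')   # arbitrary name
logger.setLevel(logging.DEBUG)
logger.propagate = False   # keep the demo away from the root logger
handler = ListHandler()
handler.setFormatter(logging.Formatter('%(levelname)s:%(message)s'))
logger.addHandler(handler)

# Extra positional arguments fill the substitution fields of the message.
logger.info('%s has %d items', 'queue', 3)

# Logger.log takes the level explicitly.
logger.log(logging.WARNING, 'disk usage at %d%%', 91)

# Logger.exception logs at ERROR level and appends the current traceback.
try:
    1 / 0
except ZeroDivisionError:
    logger.exception('division failed')
```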
getLogger returns a reference to a logger instance with the specified
name if it is provided, or ``root`` if not. The names are period-separated
hierarchical structures. Multiple calls to getLogger with the same name
will return a reference to the same logger object. Loggers that are further
down in the hierarchical list are children of loggers higher up in the list.
For example, given a logger with a name of ``foo``, loggers with names of
``foo.bar``, ``foo.bar.baz``, and ``foo.bam`` are all descendants of ``foo``.
Child loggers propagate messages up to the handlers associated with their
ancestor loggers. Because of this, it is unnecessary to define and configure
handlers for all the loggers an application uses. It is sufficient to
configure handlers for a top-level logger and create child loggers as needed.
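Propagation to an ancestor's handler can be sketched like this. ``ListHandler``
is a hypothetical helper used only to capture what arrives at the top-level
logger.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that collects record messages in a list."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append('%s:%s' % (record.name, record.getMessage()))


top = logging.getLogger('foo')
top.setLevel(logging.INFO)
handler = ListHandler()
top.addHandler(handler)                    # the only handler in this hierarchy

same = logging.getLogger('foo')            # same name, same object
child = logging.getLogger('foo.bar.baz')   # no handler of its own

child.info('handled by an ancestor')
```

The record keeps the name of the logger it was sent to, even though the handler
that processes it belongs to ``foo``.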
Handlers
^^^^^^^^
Handler objects are responsible for dispatching the appropriate log
messages (based on the log messages' severity) to the handler's specified
destination. Logger objects can add zero or more handler objects to themselves
with an addHandler method. As an example scenario, an application may
want to send all log messages to a log file, all log messages of error or higher
to stdout, and all messages of critical to an email address. This scenario
requires three individual handlers where each handler is responsible for sending
messages of a specific severity to a specific location.
The standard library includes quite a few handler types; this tutorial uses only
StreamHandler and FileHandler in its examples.
There are very few methods in a handler for application developers to concern
themselves with. The only handler methods that seem relevant for application
developers who are using the built-in handler objects (that is, not creating
custom handlers) are the following configuration methods:
* The Handler.setLevel method, just as in logger objects, specifies the
lowest severity that will be dispatched to the appropriate destination. Why
are there two setLevel methods? The level set in the logger
determines which severity of messages it will pass to its handlers. The level
set in each handler determines which messages that handler will send on.
* setFormatter selects a Formatter object for this handler to use.
* addFilter and removeFilter respectively configure and
deconfigure filter objects on handlers.
Application code should not directly instantiate and use instances of
Handler. Instead, the Handler class is a base class that
defines the interface that all handlers should have and establishes some
default behavior that child classes can use (or override).
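The interplay between the logger's level and each handler's level might look
like the following sketch. ``ListHandler`` is a hypothetical Handler subclass
standing in for real destinations such as a file or an email address.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical Handler subclass that stores messages in a list."""

    def __init__(self, level=logging.NOTSET):
        logging.Handler.__init__(self, level)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger('handler_demo')   # arbitrary name
logger.setLevel(logging.INFO)                # the logger drops DEBUG messages
logger.propagate = False

everything = ListHandler()                   # accepts whatever the logger passes on
errors_only = ListHandler(logging.ERROR)     # accepts ERROR and above
logger.addHandler(everything)
logger.addHandler(errors_only)

logger.debug('not recorded anywhere')
logger.info('recorded by one handler')
logger.error('recorded by both handlers')
```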
Formatters
^^^^^^^^^^
Formatter objects configure the final order, structure, and contents of the log
message. Unlike the base logging.Handler class, application code may
instantiate formatter classes, although you could likely subclass the formatter
if your application needs special behavior. The constructor takes two optional
arguments: a message format string and a date format string. If there is no
message format string, the default is to use the raw message. If there is no
date format string, the default date format is:: >
%Y-%m-%d %H:%M:%S
<
with the milliseconds tacked on at the end.
The message format string uses ``%(<dictionary key>)s`` styled string
substitution; the possible keys are documented in the Formatter objects section
of this module's documentation.
The following message format string will log the time in a human-readable
format, the severity of the message, and the contents of the message, in that
order:: >
"%(asctime)s - %(levelname)s - %(message)s"
<
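A formatter can be tried out on its own by formatting a hand-built record. This
is a minimal sketch; the attribute values passed to makeLogRecord are made up
for the illustration.

```python
import logging

formatter = logging.Formatter('%(levelname)s - %(message)s')

# makeLogRecord builds a LogRecord from a dictionary of attributes.
record = logging.makeLogRecord({
    'name': 'demo',
    'levelname': 'WARNING',
    'msg': 'disk %s is full',
    'args': ('sda1',),
})

line = formatter.format(record)
```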
Configuring Logging
^^^^^^^^^^^^^^^^^^^
Programmers can configure logging in three ways:
1. Creating loggers, handlers, and formatters explicitly using Python
code that calls the configuration methods listed above.
2. Creating a logging config file and reading it using the fileConfig
function.
3. Creating a dictionary of configuration information and passing it
to the dictConfig function.
The following example configures a very simple logger, a console
handler, and a simple formatter using Python code:: >
import logging
# create logger
logger = logging.getLogger("simple_example")
logger.setLevel(logging.DEBUG)
# create console handler and set level to debug
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)
# create formatter
formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
# add formatter to ch
ch.setFormatter(formatter)
# add ch to logger
logger.addHandler(ch)
# "application" code
logger.debug("debug message")
logger.info("info message")
logger.warn("warn message")
logger.error("error message")
logger.critical("critical message")
<
Running this module from the command line produces the following output:: >
$ python simple_logging_module.py
2005-03-19 15:10:26,618 - simple_example - DEBUG - debug message
2005-03-19 15:10:26,620 - simple_example - INFO - info message
2005-03-19 15:10:26,695 - simple_example - WARNING - warn message
2005-03-19 15:10:26,697 - simple_example - ERROR - error message
2005-03-19 15:10:26,773 - simple_example - CRITICAL - critical message
<
The following Python module creates a logger, handler, and formatter nearly
identical to those in the example listed above, with the only difference being
the names of the objects:: >
import logging
import logging.config
logging.config.fileConfig("logging.conf")
# create logger
logger = logging.getLogger("simpleExample")
# "application" code
logger.debug("debug message")
logger.info("info message")
logger.warn("warn message")
logger.error("error message")
logger.critical("critical message")
<
Here is the logging.conf file:: >
[loggers]
keys=root,simpleExample
[handlers]
keys=consoleHandler
[formatters]
keys=simpleFormatter
[logger_root]
level=DEBUG
handlers=consoleHandler
[logger_simpleExample]
level=DEBUG
handlers=consoleHandler
qualname=simpleExample
propagate=0
[handler_consoleHandler]
class=StreamHandler
level=DEBUG
formatter=simpleFormatter
args=(sys.stdout,)
[formatter_simpleFormatter]
format=%(asctime)s - %(name)s - %(levelname)s - %(message)s
datefmt=
<
The output is nearly identical to that of the non-config-file-based example:: >
$ python simple_logging_config.py
2005-03-19 15:38:55,977 - simpleExample - DEBUG - debug message
2005-03-19 15:38:55,979 - simpleExample - INFO - info message
2005-03-19 15:38:56,054 - simpleExample - WARNING - warn message
2005-03-19 15:38:56,055 - simpleExample - ERROR - error message
2005-03-19 15:38:56,130 - simpleExample - CRITICAL - critical message
<
You can see that the config file approach has a few advantages over the Python
code approach, mainly separation of configuration and code and the ability of
noncoders to easily modify the logging properties.
Note that the class names referenced in config files need to be either relative
to the logging module, or absolute values which can be resolved using normal
import mechanisms. Thus, you could use either handlers.WatchedFileHandler
(relative to the logging module) or mypackage.mymodule.MyHandler (for a
class defined in package mypackage and module mymodule, where
mypackage is available on the Python import path).
.. versionchanged:: 2.7
In Python 2.7, a new means of configuring logging has been introduced, using
dictionaries to hold configuration information. This provides a superset of the
functionality of the config-file-based approach outlined above, and is the
recommended configuration method for new applications and deployments. Because
a Python dictionary is used to hold configuration information, and since you
can populate that dictionary using different means, you have more options for
configuration. For example, you can use a configuration file in JSON format,
or, if you have access to YAML processing functionality, a file in YAML
format, to populate the configuration dictionary. Or, of course, you can
construct the dictionary in Python code, receive it in pickled form over a
socket, or use whatever approach makes sense for your application.
Here's an example of the same configuration as above, in YAML format for
the new dictionary-based approach:: >
version: 1
formatters:
  simple:
    format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
handlers:
  console:
    class: logging.StreamHandler
    level: DEBUG
    formatter: simple
    stream: ext://sys.stdout
loggers:
  simpleExample:
    level: DEBUG
    handlers: [console]
    propagate: no
root:
  level: DEBUG
  handlers: [console]
<
For more information about logging using a dictionary, see the documentation
of the logging.config.dictConfig function.
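The same configuration can also be written directly as a Python dictionary and
passed to dictConfig. This is a sketch under one stated assumption: the
``disable_existing_loggers`` entry is an addition made here so that loggers
created elsewhere stay enabled; it is not part of the example above.

```python
import logging
import logging.config

# Dictionary form of the configuration shown above.
LOGGING = {
    'version': 1,
    # Addition for this sketch: leave loggers created elsewhere enabled.
    'disable_existing_loggers': False,
    'formatters': {
        'simple': {
            'format': '%(asctime)s - %(name)s - %(levelname)s - %(message)s',
        },
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'level': 'DEBUG',
            'formatter': 'simple',
            'stream': 'ext://sys.stdout',
        },
    },
    'loggers': {
        'simpleExample': {
            'level': 'DEBUG',
            'handlers': ['console'],
            'propagate': False,
        },
    },
    'root': {
        'level': 'DEBUG',
        'handlers': ['console'],
    },
}

logging.config.dictConfig(LOGGING)
logger = logging.getLogger('simpleExample')
logger.debug('debug message')
```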
Configuring Logging for a Library
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When developing a library which uses logging, some consideration needs to be
given to its configuration. If the application using the library does not
configure logging, and library code makes logging calls, then a one-off
message "No handlers could be found for logger X.Y.Z" is printed to the
console. This message is intended to catch mistakes in logging configuration,
but will confuse an application developer who is not aware of logging by the
library.
In addition to documenting how a library uses logging, a good way to configure
library logging so that it does not cause a spurious message is to add a
handler which does nothing. This avoids the message being printed, since a
handler will be found: it just doesn't produce any output. If the library user
configures logging for application use, presumably that configuration will add
some handlers, and if levels are suitably configured then logging calls made
in library code will send output to those handlers, as normal.
A do-nothing handler can be simply defined as follows:: >
import logging
class NullHandler(logging.Handler):
    def emit(self, record):
        pass
<
An instance of this handler should be added to the top-level logger of the
logging namespace used by the library. If all logging by a library {foo} is
done using loggers with names matching "foo.x.y", then the code:: >
import logging
h = NullHandler()
logging.getLogger("foo").addHandler(h)
<
should have the desired effect. If an organisation produces a number of
libraries, then the logger name specified can be "orgname.foo" rather than
just "foo".
.. versionadded:: 2.7
The NullHandler class was not present in previous versions, but is now
included, so that it need not be defined in library code.
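With Python 2.7 or later, a library can rely on the built-in class instead of
defining its own. The following is a minimal sketch; the library name "foo"
is the same placeholder used above.

```python
import logging

# "foo" is a placeholder for the library's top-level logger name.
library_logger = logging.getLogger('foo')
library_logger.addHandler(logging.NullHandler())

# Loggers below "foo" reach the NullHandler through propagation, so this
# call produces no output and no "No handlers could be found" message.
logging.getLogger('foo.x.y').warning('emitted by library code')
```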
Logging Levels
--------------
The numeric values of logging levels are given in the following table. These are
primarily of interest if you want to define your own levels, and need them to
have specific values relative to the predefined levels. If you define a level
with the same numeric value, it overwrites the predefined value; the predefined
name is lost.
+--------------+---------------+
| Level | Numeric value |
+==============+===============+
| ``CRITICAL`` | 50 |
+--------------+---------------+
| ``ERROR`` | 40 |
+--------------+---------------+
| ``WARNING`` | 30 |
+--------------+---------------+
| ``INFO`` | 20 |
+--------------+---------------+
| ``DEBUG`` | 10 |
+--------------+---------------+
| ``NOTSET`` | 0 |
+--------------+---------------+
Levels can also be associated with loggers, being set either by the developer or
through loading a saved logging configuration. When a logging method is called
on a logger, the logger compares its own level with the level associated with
the method call. If the logger's level is higher than the method call's, no
logging message is actually generated. This is the basic mechanism controlling
the verbosity of logging output.
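The comparison described above can be observed with isEnabledFor. This is a
small sketch; the logger name is hypothetical.

```python
import logging

verbosity_logger = logging.getLogger('example.verbosity')  # hypothetical name
verbosity_logger.setLevel(logging.WARNING)

# INFO (20) is below the logger's threshold of WARNING (30), so it is dropped.
info_enabled = verbosity_logger.isEnabledFor(logging.INFO)
# ERROR (40) is above the threshold, so it would be processed.
error_enabled = verbosity_logger.isEnabledFor(logging.ERROR)
```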
Logging messages are encoded as instances of the LogRecord class. When
a logger decides to actually log an event, a LogRecord instance is
created from the logging message.
Logging messages are subjected to a dispatch mechanism through the use of
handlers, which are instances of subclasses of the Handler
class. Handlers are responsible for ensuring that a logged message (in the form
of a LogRecord) ends up in a particular location (or set of locations)
which is useful for the target audience for that message (such as end users,
support desk staff, system administrators, developers). Handlers are passed
LogRecord instances intended for particular destinations. Each logger
can have zero, one or more handlers associated with it (via the
addHandler method of Logger). In addition to any handlers
directly associated with a logger, all handlers associated with all ancestors
of the logger are called to dispatch the message (unless the {propagate} flag
for a logger is set to a false value, at which point the passing to ancestor
handlers stops).
Just as for loggers, handlers can have levels associated with them. A handler's
level acts as a filter in the same way as a logger's level does. If a handler
decides to actually dispatch an event, the emit method is used to send
the message to its destination. Most user-defined subclasses of Handler
will need to override emit.
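As an illustration of overriding emit and of handler-level filtering, here is
a minimal sketch. ListHandler and the logger name are hypothetical and not
part of the logging package.

```python
import logging

class ListHandler(logging.Handler):
    """Hypothetical handler that keeps formatted messages in a list."""

    def __init__(self, level=logging.NOTSET):
        logging.Handler.__init__(self, level)
        self.messages = []

    def emit(self, record):
        # emit() is only reached by records that passed the handler's level.
        self.messages.append(self.format(record))

list_handler = ListHandler(level=logging.ERROR)
emit_logger = logging.getLogger('example.emit')  # hypothetical name
emit_logger.setLevel(logging.DEBUG)
emit_logger.propagate = False  # keep the demo away from ancestor handlers
emit_logger.addHandler(list_handler)

emit_logger.info('passes the logger, rejected by the handler')
emit_logger.error('passes both the logger and the handler')
```

After these calls, list_handler.messages holds only the ERROR message, because
the INFO record is below the handler's own level.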
Useful Handlers
---------------
In addition to the base Handler class, many useful subclasses are
provided:
#. StreamHandler instances send messages to streams (file-like objects).
#. FileHandler instances send messages to disk files.
#. BaseRotatingHandler is the base class for handlers that
rotate log files at a certain point. It is not meant to be instantiated
directly. Instead, use RotatingFileHandler or TimedRotatingFileHandler.
#. RotatingFileHandler instances send messages to disk
files, with support for maximum log file sizes and log file rotation.
#. TimedRotatingFileHandler instances send messages to
disk files, rotating the log file at certain timed intervals.
#. SocketHandler instances send messages to TCP/IP sockets.
#. DatagramHandler instances send messages to UDP sockets.
#. SMTPHandler instances send messages to a designated email address.
#. SysLogHandler instances send messages to a Unix
syslog daemon, possibly on a remote machine.
#. NTEventLogHandler instances send messages to a
Windows NT/2000/XP event log.
#. MemoryHandler instances send messages to a buffer
in memory, which is flushed whenever specific criteria are met.
#. HTTPHandler instances send messages to an HTTP
server using either ``GET`` or ``POST`` semantics.
#. WatchedFileHandler instances watch the file they are
logging to. If the file changes, it is closed and reopened using the file
name. This handler is only useful on Unix-like systems; Windows does not
support the underlying mechanism used.
#. NullHandler instances do nothing with messages. They are used
by library developers who want to use logging, but want to avoid the "No
handlers could be found for logger XXX" message which can be displayed if
the library user has not configured logging. See the section on configuring
logging for a library for more information.
.. versionadded:: 2.7
The NullHandler class was not present in previous versions.
The NullHandler, StreamHandler and FileHandler
classes are defined in the core logging package. The other handlers are
defined in a sub-module, logging.handlers. (There is also another
sub-module, logging.config, for configuration functionality.)
Logged messages are formatted for presentation through instances of the
Formatter class. They are initialized with a format string suitable for
use with the % operator and a dictionary.
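The following sketch shows a Formatter applied to a single record. The record
is built by hand with makeLogRecord purely for illustration; in normal use
the logging system creates it.

```python
import logging

formatter = logging.Formatter('%(name)s %(levelname)s %(message)s')

# A hand-built record; the attribute values are illustrative.
record = logging.makeLogRecord({
    'name': 'example.fmt',
    'levelno': logging.INFO,
    'levelname': 'INFO',
    'msg': 'disk usage at %d%%',
    'args': (93,),
})

# The format string is applied to the record's attribute dictionary.
formatted = formatter.format(record)
```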
For formatting multiple messages in a batch, instances of
BufferingFormatter can be used. In addition to the format string (which
is applied to each message in the batch), there is provision for header and
trailer format strings.
When filtering based on logger level and/or handler level is not enough,
instances of Filter can be added to both Logger and
Handler instances (through their addFilter method). Before
deciding to process a message further, both loggers and handlers consult all
their filters for permission. If any filter returns a false value, the message
is not processed further.
The basic Filter functionality allows filtering by specific logger
name. If this feature is used, messages sent to the named logger and its
children are allowed through the filter, and all others dropped.
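A short sketch of name-based filtering follows. The helper make_record is
hypothetical and only builds bare records to feed to the filter.

```python
import logging

name_filter = logging.Filter('a.b')

def make_record(logger_name):
    # Hypothetical helper: build a bare LogRecord for the given logger name.
    return logging.makeLogRecord({'name': logger_name, 'msg': 'demo'})

allowed_same = name_filter.filter(make_record('a.b'))       # the named logger
allowed_child = name_filter.filter(make_record('a.b.c'))    # a child of it
rejected_sibling = name_filter.filter(make_record('a.bb'))  # prefix match only
rejected_parent = name_filter.filter(make_record('a'))      # an ancestor
```

Note that "a.bb" is rejected: matching follows the dot-separated hierarchy and
is not a plain string-prefix test.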
Module-Level Functions
----------------------
In addition to the classes described above, there are a number of module-level
functions.
getLogger([name])~
Return a logger with the specified name or, if no name is specified, return a
logger which is the root logger of the hierarchy. If specified, the name is
typically a dot-separated hierarchical name like {"a"}, {"a.b"} or {"a.b.c.d"}.
Choice of these names is entirely up to the developer who is using logging.
All calls to this function with a given name return the same logger instance.
This means that logger instances never need to be passed between different parts
of an application.
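A minimal sketch of this behaviour, using an arbitrary logger name:

```python
import logging

first = logging.getLogger('a.b')
second = logging.getLogger('a.b')

# Both names refer to one and the same Logger object.
same_instance = first is second

# With no name, the root logger of the hierarchy is returned.
root_logger = logging.getLogger()
```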
getLoggerClass()~
Return either the standard Logger class, or the last class passed to
setLoggerClass. This function may be called from within a new class
definition, to ensure that installing a customised Logger class will
not undo customisations already applied by other code. For example:: >
class MyLogger(logging.getLoggerClass()):
    # ... override behaviour here
    pass
<
debug(msg[, *args[, **kwargs]])~
Logs a message with level DEBUG on the root logger. The {msg} is the
message format string, and the {args} are the arguments which are merged into
{msg} using the string formatting operator. (Note that this means that you can
use keywords in the format string, together with a single dictionary argument.)
There are two keyword arguments in {kwargs} which are inspected: {exc_info}
which, if it does not evaluate as false, causes exception information to be
added to the logging message. If an exception tuple (in the format returned by
sys.exc_info) is provided, it is used; otherwise, sys.exc_info
is called to get the exception information.
The other optional keyword argument is {extra} which can be used to pass a
dictionary which is used to populate the __dict__ of the LogRecord created for
the logging event with user-defined attributes. These custom attributes can then
be used as you like. For example, they could be incorporated into logged
messages. For example:: >
FORMAT = "%(asctime)-15s %(clientip)s %(user)-8s %(message)s"
logging.basicConfig(format=FORMAT)
d = {'clientip': '192.168.0.1', 'user': 'fbloggs'}
logging.warning("Protocol problem: %s", "connection reset", extra=d)
<
would print something like ::
2006-02-08 22:20:02,165 192.168.0.1 fbloggs  Protocol problem: connection reset
The keys in the dictionary passed in {extra} should not clash with the keys used
by the logging system. (See the Formatter documentation for more
information on which keys are used by the logging system.)
If you choose to use these attributes in logged messages, you need to exercise
some care. In the above example, for instance, the Formatter has been
set up with a format string which expects 'clientip' and 'user' in the attribute
dictionary of the LogRecord. If these are missing, the message will not be
logged because a string formatting exception will occur. So in this case, you
always need to pass the {extra} dictionary with these keys.
While this might be annoying, this feature is intended for use in specialized
circumstances, such as multi-threaded servers where the same code executes in
many contexts, and interesting conditions which arise are dependent on this
context (such as remote client IP address and authenticated user name, in the
above example). In such circumstances, it is likely that specialized
Formatters would be used with particular Handlers.
.. versionchanged:: 2.5
{extra} was added.
info(msg[, *args[, **kwargs]])~
Logs a message with level INFO on the root logger. The arguments are
interpreted as for debug.
warning(msg[, *args[, **kwargs]])~
Logs a message with level WARNING on the root logger. The arguments are
interpreted as for debug.
error(msg[, *args[, **kwargs]])~
Logs a message with level ERROR on the root logger. The arguments are
interpreted as for debug.
critical(msg[, *args[, **kwargs]])~
Logs a message with level CRITICAL on the root logger. The arguments
are interpreted as for debug.
exception(msg[, *args])~
Logs a message with level ERROR on the root logger. The arguments are
interpreted as for debug. Exception info is added to the logging
message. This function should only be called from an exception handler.
log(level, msg[, *args[, **kwargs]])~
Logs a message with level {level} on the root logger. The other arguments are
interpreted as for debug.
disable(lvl)~
Provides an overriding level {lvl} for all loggers which takes precedence over
the logger's own level. When the need arises to temporarily throttle logging
output down across the whole application, this function can be useful. Its
effect is to disable all logging calls of severity {lvl} and below, so that
if you call it with a value of INFO, then all INFO and DEBUG events would be
discarded, whereas those of severity WARNING and above would be processed
according to the logger's effective level.
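A small sketch of the override and of how to lift it again. The logger name
is hypothetical, and the setting applies to the whole process.

```python
import logging

throttle_logger = logging.getLogger('example.throttle')  # hypothetical name
throttle_logger.setLevel(logging.DEBUG)

logging.disable(logging.INFO)      # suppress INFO and below everywhere
info_while_disabled = throttle_logger.isEnabledFor(logging.INFO)
warning_while_disabled = throttle_logger.isEnabledFor(logging.WARNING)

logging.disable(logging.NOTSET)    # remove the override again
info_after_reset = throttle_logger.isEnabledFor(logging.INFO)
```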
addLevelName(lvl, levelName)~
Associates level {lvl} with text {levelName} in an internal dictionary, which is
used to map numeric levels to a textual representation, for example when a
Formatter formats a message. This function can also be used to define
your own levels. The only constraints are that all levels used must be
registered using this function, levels should be positive integers and they
should increase in increasing order of severity.
getLevelName(lvl)~
Returns the textual representation of logging level {lvl}. If the level is one
of the predefined levels CRITICAL, ERROR, WARNING,
INFO or DEBUG then you get the corresponding string. If you
have associated levels with names using addLevelName then the name you
have associated with {lvl} is returned. If a numeric value corresponding to one
of the defined levels is passed in, the corresponding string representation is
returned. Otherwise, the string "Level %s" % lvl is returned.
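As a sketch of both functions together, the following registers a custom
level. The name NOTICE and the value 25 are illustrative assumptions, chosen
to sit between INFO and WARNING.

```python
import logging

NOTICE = 25  # hypothetical custom level between INFO (20) and WARNING (30)
logging.addLevelName(NOTICE, 'NOTICE')

notice_name = logging.getLevelName(NOTICE)            # the registered name
warning_name = logging.getLevelName(logging.WARNING)  # a predefined level
unknown_name = logging.getLevelName(7)                # nothing registered for 7
```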
makeLogRecord(attrdict)~
Creates and returns a new LogRecord instance whose attributes are
defined by {attrdict}. This function is useful for taking a pickled
LogRecord attribute dictionary, sent over a socket, and reconstituting
it as a LogRecord instance at the receiving end.
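The round trip can be sketched without a real socket by pickling and
unpickling the attribute dictionary in one process. The record contents are
illustrative.

```python
import logging
import pickle

# A record as it might exist on the sending side.
original = logging.LogRecord('net.demo', logging.INFO, 'demo.py', 1,
                             'hello from %s', ('sender',), None)

# The sender transmits the pickled attribute dictionary...
payload = pickle.dumps(original.__dict__)

# ...and the receiver rebuilds a LogRecord from it.
rebuilt = logging.makeLogRecord(pickle.loads(payload))
rebuilt_message = rebuilt.getMessage()
```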
basicConfig([**kwargs])~
Does basic configuration for the logging system by creating a
StreamHandler with a default Formatter and adding it to the
root logger. The functions debug, info, warning,
error and critical will call basicConfig automatically
if no handlers are defined for the root logger.
This function does nothing if the root logger already has handlers
configured for it.
.. versionchanged:: 2.4
Formerly, basicConfig did not take any keyword arguments.
The following keyword arguments are supported.
+--------------+---------------------------------------------+
| Keyword      | Description                                 |
+==============+=============================================+
| ``filename`` | Specifies that a FileHandler be created,    |
|              | using the specified filename, rather than a |
|              | StreamHandler.                              |
+--------------+---------------------------------------------+
| ``filemode`` | Specifies the mode to open the file, if     |
|              | filename is specified (if filemode is       |
|              | unspecified, it defaults to 'a').           |
+--------------+---------------------------------------------+
| ``format``   | Use the specified format string for the     |
|              | handler.                                    |
+--------------+---------------------------------------------+
| ``datefmt``  | Use the specified date/time format.         |
+--------------+---------------------------------------------+
| ``level``    | Set the root logger level to the specified  |
|              | level.                                      |
+--------------+---------------------------------------------+
| ``stream``   | Use the specified stream to initialize the  |
|              | StreamHandler. Note that this argument is   |
|              | incompatible with 'filename' - if both are  |
|              | present, 'stream' is ignored.               |
+--------------+---------------------------------------------+
shutdown()~
Informs the logging system to perform an orderly shutdown by flushing and
closing all handlers. This should be called at application exit and no
further use of the logging system should be made after this call.
setLoggerClass(klass)~
Tells the logging system to use the class {klass} when instantiating a logger.
The class should define __init__ such that only a name argument is
required, and the __init__ should call Logger.__init__. This
function is typically called before any loggers are instantiated by applications
which need to use custom logger behavior.
.. seealso::
PEP 282 - A Logging System
The proposal which described this feature for inclusion in the Python standard
library.
Original Python logging package (http://www.red-dove.com/python_logging.html)
This is the original source for the logging (|py2stdlib-logging|) package. The version of the
package available from this site is suitable for use with Python 1.5.2, 2.1.x
and 2.2.x, which do not include the logging (|py2stdlib-logging|) package in the standard
library.
Logger Objects
--------------
Loggers have the following attributes and methods. Note that Loggers are never
instantiated directly, but always through the module-level function
``logging.getLogger(name)``.
Logger.propagate~
If this evaluates to false, logging messages are not passed by this logger or by
its child loggers to the handlers of higher level (ancestor) loggers. The
constructor sets this attribute to 1.
Logger.setLevel(lvl)~
Sets the threshold for this logger to {lvl}. Logging messages which are less
severe than {lvl} will be ignored. When a logger is created, the level is set to
NOTSET (which causes all messages to be processed when the logger is
the root logger, or delegation to the parent when the logger is a non-root
logger). Note that the root logger is created with level WARNING.
The term "delegation to the parent" means that if a logger has a level of
NOTSET, its chain of ancestor loggers is traversed until either an ancestor with
a level other than NOTSET is found, or the root is reached.
If an ancestor is found with a level other than NOTSET, then that ancestor's
level is treated as the effective level of the logger where the ancestor search
began, and is used to determine how a logging event is handled.
If the root is reached, and it has a level of NOTSET, then all messages will be
processed. Otherwise, the root's level will be used as the effective level.
Logger.isEnabledFor(lvl)~
Indicates if a message of severity {lvl} would be processed by this logger.
This method checks first the module-level level set by
``logging.disable(lvl)`` and then the logger's effective level as determined
by getEffectiveLevel.
Logger.getEffectiveLevel()~
Indicates the effective level for this logger. If a value other than
NOTSET has been set using setLevel, it is returned. Otherwise,
the hierarchy is traversed towards the root until a value other than
NOTSET is found, and that value is returned.
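The delegation to ancestors can be seen in a short sketch. The logger names
are hypothetical.

```python
import logging

parent = logging.getLogger('example.tree')        # hypothetical names
child = logging.getLogger('example.tree.leaf')

parent.setLevel(logging.ERROR)
child_own_level = child.level                 # NOTSET: nothing set explicitly
child_effective = child.getEffectiveLevel()   # inherited from the parent

child.setLevel(logging.DEBUG)
child_effective_after = child.getEffectiveLevel()  # now the child's own level
```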
Logger.getChild(suffix)~
Returns a logger which is a descendant of this logger, as determined by the suffix.
Thus, ``logging.getLogger('abc').getChild('def.ghi')`` would return the same
logger as would be returned by ``logging.getLogger('abc.def.ghi')``. This is a
convenience method, useful when the parent logger is named using e.g. ``__name__``
rather than a literal string.
.. versionadded:: 2.7
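The equivalence described for getChild can be checked with a minimal sketch:

```python
import logging

package_logger = logging.getLogger('abc')

# Both expressions resolve to the same logger object.
via_child = package_logger.getChild('def.ghi')
via_name = logging.getLogger('abc.def.ghi')
```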
Logger.debug(msg[, *args[, **kwargs]])~
Logs a message with level DEBUG on this logger. The {msg} is the
message format string, and the {args} are the arguments which are merged into
{msg} using the string formatting operator. (Note that this means that you can
use keywords in the format string, together with a single dictionary argument.)
There are two keyword arguments in {kwargs} which are inspected: {exc_info}
which, if it does not evaluate as false, causes exception information to be
added to the logging message. If an exception tuple (in the format returned by
sys.exc_info) is provided, it is used; otherwise, sys.exc_info
is called to get the exception information.
The other optional keyword argument is {extra} which can be used to pass a
dictionary which is used to populate the __dict__ of the LogRecord created for
the logging event with user-defined attributes. These custom attributes can then
be used as you like. For example, they could be incorporated into logged
messages. For example:: >
FORMAT = "%(asctime)-15s %(clientip)s %(user)-8s %(message)s"
logging.basicConfig(format=FORMAT)
d = { 'clientip' : '192.168.0.1', 'user' : 'fbloggs' }
logger = logging.getLogger("tcpserver")
logger.warning("Protocol problem: %s", "connection reset", extra=d)
<
would print something like ::
2006-02-08 22:20:02,165 192.168.0.1 fbloggs  Protocol problem: connection reset
The keys in the dictionary passed in {extra} should not clash with the keys used
by the logging system. (See the Formatter documentation for more
information on which keys are used by the logging system.)
If you choose to use these attributes in logged messages, you need to exercise
some care. In the above example, for instance, the Formatter has been
set up with a format string which expects 'clientip' and 'user' in the attribute
dictionary of the LogRecord. If these are missing, the message will not be
logged because a string formatting exception will occur. So in this case, you
always need to pass the {extra} dictionary with these keys.
While this might be annoying, this feature is intended for use in specialized
circumstances, such as multi-threaded servers where the same code executes in
many contexts, and interesting conditions which arise are dependent on this
context (such as remote client IP address and authenticated user name, in the
above example). In such circumstances, it is likely that specialized
Formatters would be used with particular Handlers.
.. versionchanged:: 2.5
{extra} was added.
Logger.info(msg[, *args[, **kwargs]])~
Logs a message with level INFO on this logger. The arguments are
interpreted as for debug.
Logger.warning(msg[, *args[, **kwargs]])~
Logs a message with level WARNING on this logger. The arguments are
interpreted as for debug.
Logger.error(msg[, *args[, **kwargs]])~
Logs a message with level ERROR on this logger. The arguments are
interpreted as for debug.
Logger.critical(msg[, *args[, **kwargs]])~
Logs a message with level CRITICAL on this logger. The arguments are
interpreted as for debug.
Logger.log(lvl, msg[, *args[, **kwargs]])~
Logs a message with integer level {lvl} on this logger. The other arguments are
interpreted as for debug.
Logger.exception(msg[, *args])~
Logs a message with level ERROR on this logger. The arguments are
interpreted as for debug. Exception info is added to the logging
message. This method should only be called from an exception handler.
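A minimal sketch of calling exception from an exception handler follows.
CaptureHandler and the logger name are hypothetical and exist only so the
resulting record can be inspected.

```python
import logging

captured = []

class CaptureHandler(logging.Handler):
    """Hypothetical handler that stores records for inspection."""

    def emit(self, record):
        captured.append(record)

exc_logger = logging.getLogger('example.exc')  # hypothetical name
exc_logger.propagate = False
exc_logger.addHandler(CaptureHandler())

try:
    1 / 0
except ZeroDivisionError:
    # Logs at ERROR level and attaches the current exception information.
    exc_logger.exception('division failed')
```

The captured record carries the exception tuple in its exc_info attribute,
which a Formatter would render as a traceback.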
Logger.addFilter(filt)~
Adds the specified filter {filt} to this logger.
Logger.removeFilter(filt)~
Removes the specified filter {filt} from this logger.
Logger.filter(record)~
Applies this logger's filters to the record and returns a true value if the
record is to be processed.
Logger.addHandler(hdlr)~
Adds the specified handler {hdlr} to this logger.
Logger.removeHandler(hdlr)~
Removes the specified handler {hdlr} from this logger.
Logger.findCaller()~
Finds the caller's source filename and line number. Returns the filename, line
number and function name as a 3-element tuple.
.. versionchanged:: 2.4
The function name was added. In earlier versions, the filename and line number
were returned as a 2-element tuple.
Logger.handle(record)~
Handles a record by passing it to all handlers associated with this logger and
its ancestors (until a false value of {propagate} is found). This method is used
for unpickled records received from a socket, as well as those created locally.
Logger-level filtering is applied using Logger.filter.
Logger.makeRecord(name, lvl, fn, lno, msg, args, exc_info [, func, extra])~
This is a factory method which can be overridden in subclasses to create
specialized LogRecord instances.
.. versionchanged:: 2.5
{func} and {extra} were added.
Basic example
-------------
.. versionchanged:: 2.4
Formerly, basicConfig did not take any keyword arguments.
The logging (|py2stdlib-logging|) package provides a lot of flexibility, and its configuration
can appear daunting. This section demonstrates that simple use of the logging
package is possible.
The simplest example shows logging to the console:: >
import logging
logging.debug('A debug message')
logging.info('Some information')
logging.warning('A shot across the bows')
<
If you run the above script, you'll see this::
WARNING:root:A shot across the bows
Because no particular logger was specified, the system used the root logger. The
debug and info messages didn't appear because by default, the root logger is
configured to only handle messages with a severity of WARNING or above. The
message format is also a configuration default, as is the output destination of
the messages - ``sys.stderr``. The severity level, the message format and
destination can be easily changed, as shown in the example below:: >
import logging
logging.basicConfig(level=logging.DEBUG,
format='%(asctime)s %(levelname)s %(message)s',
filename='myapp.log',
filemode='w')
logging.debug('A debug message')
logging.info('Some information')
logging.warning('A shot across the bows')
<
The basicConfig function is used to change the configuration defaults,
which results in output (written to ``myapp.log``) which should look
something like the following:: >
2004-07-02 13:00:08,743 DEBUG A debug message
2004-07-02 13:00:08,743 INFO Some information
2004-07-02 13:00:08,743 WARNING A shot across the bows
<
This time, all messages with a severity of DEBUG or above were handled, the
format of the messages was changed, and output went to the specified file
rather than the console.
Formatting uses standard Python string formatting - see section
string-formatting. The format string takes the following common
specifiers. For a complete list of specifiers, consult the Formatter
documentation.
+-------------------+-----------------------------------------------+
| Format | Description |
+===================+===============================================+
| ``%(name)s`` | Name of the logger (logging channel). |
+-------------------+-----------------------------------------------+
| ``%(levelname)s`` | Text logging level for the message |
| | (``'DEBUG'``, ``'INFO'``, ``'WARNING'``, |
| | ``'ERROR'``, ``'CRITICAL'``). |
+-------------------+-----------------------------------------------+
| ``%(asctime)s`` | Human-readable time when the |
| | LogRecord was created. By default |
| | this is of the form "2003-07-08 16:49:45,896" |
| | (the numbers after the comma are millisecond |
| | portion of the time). |
+-------------------+-----------------------------------------------+
| ``%(message)s`` | The logged message. |
+-------------------+-----------------------------------------------+
To change the date/time format, you can pass an additional keyword parameter,
{datefmt}, as in the following:: >
import logging
logging.basicConfig(level=logging.DEBUG,
format='%(asctime)s %(levelname)-8s %(message)s',
datefmt='%a, %d %b %Y %H:%M:%S',
filename='/temp/myapp.log',
filemode='w')
logging.debug('A debug message')
logging.info('Some information')
logging.warning('A shot across the bows')
<
which would result in output like ::
Fri, 02 Jul 2004 13:06:18 DEBUG    A debug message
Fri, 02 Jul 2004 13:06:18 INFO     Some information
Fri, 02 Jul 2004 13:06:18 WARNING  A shot across the bows
The date format string follows the requirements of strftime - see the
documentation for the time (|py2stdlib-time|) module.
If, instead of sending logging output to the console or a file, you'd rather use
a file-like object which you have created separately, you can pass it to
basicConfig using the {stream} keyword argument. Note that if both
{stream} and {filename} keyword arguments are passed, the {stream} argument is
ignored.
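A minimal sketch of the {stream} argument, using an in-memory buffer as the
file-like object. Because basicConfig does nothing when the root logger
already has handlers, the sketch removes any existing ones first; that step
is an assumption made for the demonstration and is usually unnecessary in a
fresh script.

```python
import logging
try:
    from StringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO         # Python 3

root = logging.getLogger()
# Demo-only step: clear existing root handlers so basicConfig takes effect.
for existing in root.handlers[:]:
    root.removeHandler(existing)

stream_buffer = StringIO()
logging.basicConfig(stream=stream_buffer, level=logging.INFO,
                    format='%(levelname)s:%(message)s')
logging.info('written to the in-memory stream')

logged_text = stream_buffer.getvalue()
```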
Of course, you can put variable information in your output. To do this, simply
have the message be a format string and pass in additional arguments containing
the variable information, as in the following example:: >
import logging
logging.basicConfig(level=logging.DEBUG,
format='%(asctime)s %(levelname)-8s %(message)s',
datefmt='%a, %d %b %Y %H:%M:%S',
filename='/temp/myapp.log',
filemode='w')
logging.error('Pack my box with %d dozen %s', 5, 'liquor jugs')
<
which would result in ::
Wed, 21 Jul 2004 15:35:16 ERROR    Pack my box with 5 dozen liquor jugs
Logging to multiple destinations
--------------------------------
Let's say you want to log to console and file with different message formats and
in differing circumstances. Say you want to log messages with levels of DEBUG
and higher to file, and those messages at level INFO and higher to the console.
Let's also assume that the file should contain timestamps, but the console
messages should not. Here's how you can achieve this:: >
import logging
# set up logging to file - see previous section for more details
logging.basicConfig(level=logging.DEBUG,
format='%(asctime)s %(name)-12s %(levelname)-8s %(message)s',
datefmt='%m-%d %H:%M',
filename='/temp/myapp.log',
filemode='w')
# define a Handler which writes INFO messages or higher to the sys.stderr
console = logging.StreamHandler()
console.setLevel(logging.INFO)
# set a format which is simpler for console use
formatter = logging.Formatter('%(name)-12s: %(levelname)-8s %(message)s')
# tell the handler to use this format
console.setFormatter(formatter)
# add the handler to the root logger
logging.getLogger('').addHandler(console)
# Now, we can log to the root logger, or any other logger. First the root...
logging.info('Jackdaws love my big sphinx of quartz.')
# Now, define a couple of other loggers which might represent areas in your
# application:
logger1 = logging.getLogger('myapp.area1')
logger2 = logging.getLogger('myapp.area2')
logger1.debug('Quick zephyrs blow, vexing daft Jim.')
logger1.info('How quickly daft jumping zebras vex.')
logger2.warning('Jail zesty vixen who grabbed pay from quack.')
logger2.error('The five boxing wizards jump quickly.')
<
When you run this, on the console you will see ::
root        : INFO     Jackdaws love my big sphinx of quartz.
myapp.area1 : INFO     How quickly daft jumping zebras vex.
myapp.area2 : WARNING  Jail zesty vixen who grabbed pay from quack.
myapp.area2 : ERROR    The five boxing wizards jump quickly.
and in the file you will see something like :: >
10-22 22:19 root         INFO     Jackdaws love my big sphinx of quartz.
10-22 22:19 myapp.area1  DEBUG    Quick zephyrs blow, vexing daft Jim.
10-22 22:19 myapp.area1  INFO     How quickly daft jumping zebras vex.
10-22 22:19 myapp.area2  WARNING  Jail zesty vixen who grabbed pay from quack.
10-22 22:19 myapp.area2  ERROR    The five boxing wizards jump quickly.
<
As you can see, the DEBUG message only shows up in the file. The other messages
are sent to both destinations.
This example uses console and file handlers, but you can use any number and
combination of handlers you choose.
Exceptions raised during logging
--------------------------------
The logging package is designed to swallow exceptions which occur while logging
in production. This is so that errors which occur while handling logging events
- such as logging misconfiguration, network or other similar errors - do not
cause the application using logging to terminate prematurely.
SystemExit and KeyboardInterrupt exceptions are never
swallowed. Other exceptions which occur during the emit method of a
Handler subclass are passed to its handleError method.
The default implementation of handleError in Handler checks
to see if a module-level variable, raiseExceptions, is set. If set, a
traceback is printed to sys.stderr. If not set, the exception is swallowed.
Note: The default value of raiseExceptions is ``True``. This is because
during development, you typically want to be notified of any exceptions that
occur. It's advised that you set raiseExceptions to ``False`` for production
usage.
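The following sketch shows the pattern a handler's emit can follow to route
failures to handleError. FragileHandler is hypothetical; it overrides
handleError only to count failures, whereas the default implementation
consults raiseExceptions as described above.

```python
import logging

class FragileHandler(logging.Handler):
    """Hypothetical handler whose destination is always unavailable."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.errors = 0

    def emit(self, record):
        try:
            raise IOError('destination unavailable')
        except (KeyboardInterrupt, SystemExit):
            raise                      # never swallowed
        except Exception:
            self.handleError(record)   # everything else goes to handleError

    def handleError(self, record):
        # Overridden for the demo: count the failure instead of reporting it.
        self.errors += 1

fragile = FragileHandler()
fragile_logger = logging.getLogger('example.fragile')  # hypothetical name
fragile_logger.propagate = False
fragile_logger.addHandler(fragile)

fragile_logger.error('this call does not raise')
```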
Adding contextual information to your logging output
----------------------------------------------------
Sometimes you want logging output to contain contextual information in
addition to the parameters passed to the logging call. For example, in a
networked application, it may be desirable to log client-specific information
in the log (e.g. remote client's username, or IP address). Although you could
use the {extra} parameter to achieve this, it's not always convenient to pass
the information in this way. While it might be tempting to create
Logger instances on a per-connection basis, this is not a good idea
because these instances are not garbage collected. This is not a problem
in practice when the number of Logger instances depends only on the
level of granularity you want to use in logging an application, but it
could be hard to manage if the number of Logger instances becomes
effectively unbounded.
An easy way in which you can pass contextual information to be output along
with logging event information is to use the LoggerAdapter class.
This class is designed to look like a Logger, so that you can call
debug, info, warning, error,
exception, critical and log. These methods have the
same signatures as their counterparts in Logger, so you can use the
two types of instances interchangeably.
When you create an instance of LoggerAdapter, you pass it a
Logger instance and a dict-like object which contains your contextual
information. When you call one of the logging methods on an instance of
LoggerAdapter, it delegates the call to the underlying instance of
Logger passed to its constructor, and arranges to pass the contextual
information in the delegated call. Here's a snippet from the code of
LoggerAdapter:: >
    def debug(self, msg, *args, **kwargs):
        """
        Delegate a debug call to the underlying logger, after adding
        contextual information from this adapter instance.
        """
        msg, kwargs = self.process(msg, kwargs)
        self.logger.debug(msg, *args, **kwargs)
<
The process method of LoggerAdapter is where the contextual
information is added to the logging output. It's passed the message and
keyword arguments of the logging call, and it passes back (potentially)
modified versions of these to use in the call to the underlying logger. The
default implementation of this method leaves the message alone, but inserts
an "extra" key in the keyword argument whose value is the dict-like object
passed to the constructor. Of course, if you pass an "extra" keyword
argument in the call to the adapter, it will be silently overwritten.
The advantage of using "extra" is that the values in the dict-like object are
merged into the LogRecord instance's __dict__, allowing you to use
customized strings with your Formatter instances which know about
the keys of the dict-like object. If you need a different method, e.g. if you
want to prepend or append the contextual information to the message string,
you just need to subclass LoggerAdapter and override process
to do what you need. Here's an example script which uses this class, which
also illustrates what dict-like behaviour is needed from an arbitrary
"dict-like" object for use in the constructor:: >
    import logging

    class ConnInfo:
        """
        An example class which shows how an arbitrary class can be used as
        the 'extra' context information repository passed to a LoggerAdapter.
        """

        def __getitem__(self, name):
            """
            To allow this instance to look like a dict.
            """
            from random import choice
            if name == "ip":
                result = choice(["127.0.0.1", "192.168.0.1"])
            elif name == "user":
                result = choice(["jim", "fred", "sheila"])
            else:
                result = self.__dict__.get(name, "?")
            return result

        def __iter__(self):
            """
            To allow iteration over keys, which will be merged into
            the LogRecord dict before formatting and output.
            """
            keys = ["ip", "user"]
            keys.extend(self.__dict__.keys())
            return keys.__iter__()

    if __name__ == "__main__":
        from random import choice
        levels = (logging.DEBUG, logging.INFO, logging.WARNING, logging.ERROR, logging.CRITICAL)
        a1 = logging.LoggerAdapter(logging.getLogger("a.b.c"),
                                   { "ip" : "123.231.231.123", "user" : "sheila" })
        logging.basicConfig(level=logging.DEBUG,
                            format="%(asctime)-15s %(name)-5s %(levelname)-8s IP: %(ip)-15s User: %(user)-8s %(message)s")
        a1.debug("A debug message")
        a1.info("An info message with %s", "some parameters")
        a2 = logging.LoggerAdapter(logging.getLogger("d.e.f"), ConnInfo())
        for x in range(10):
            lvl = choice(levels)
            lvlname = logging.getLevelName(lvl)
            a2.log(lvl, "A message at %s level with %d %s", lvlname, 2, "parameters")
<
When this script is run, the output should look something like this::
2008-01-18 14:49:54,023 a.b.c DEBUG IP: 123.231.231.123 User: sheila A debug message
2008-01-18 14:49:54,023 a.b.c INFO IP: 123.231.231.123 User: sheila An info message with some parameters
2008-01-18 14:49:54,023 d.e.f CRITICAL IP: 192.168.0.1 User: jim A message at CRITICAL level with 2 parameters
2008-01-18 14:49:54,033 d.e.f INFO IP: 192.168.0.1 User: jim A message at INFO level with 2 parameters
2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: sheila A message at WARNING level with 2 parameters
2008-01-18 14:49:54,033 d.e.f ERROR IP: 127.0.0.1 User: fred A message at ERROR level with 2 parameters
2008-01-18 14:49:54,033 d.e.f ERROR IP: 127.0.0.1 User: sheila A message at ERROR level with 2 parameters
2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: sheila A message at WARNING level with 2 parameters
2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: jim A message at WARNING level with 2 parameters
2008-01-18 14:49:54,033 d.e.f INFO IP: 192.168.0.1 User: fred A message at INFO level with 2 parameters
2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: sheila A message at WARNING level with 2 parameters
2008-01-18 14:49:54,033 d.e.f WARNING IP: 127.0.0.1 User: jim A message at WARNING level with 2 parameters
New in version 2.6: The LoggerAdapter class was not present in previous
versions.
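Overriding process to prepend the contextual information to the message, as
mentioned above, might look like the following minimal sketch. PrefixAdapter,
the logger name and the sample values are hypothetical; self.extra is the
dict-like object passed to the LoggerAdapter constructor.

```python
import logging


class PrefixAdapter(logging.LoggerAdapter):
    """Hypothetical adapter that prepends context to the message text."""

    def process(self, msg, kwargs):
        # Leave kwargs untouched and put the context into the message itself.
        prefix = "[%s@%s]" % (self.extra["user"], self.extra["ip"])
        return "%s %s" % (prefix, msg), kwargs


adapter = PrefixAdapter(logging.getLogger("demo.adapter"),
                        {"ip": "192.0.2.10", "user": "sheila"})

# process is what the logging methods of the adapter call internally.
msg, kwargs = adapter.process("connection opened", {})
```

Because the context ends up in the message text, the Formatter does not need
to know about the ``ip`` and ``user`` keys with this approach.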
Logging to a single file from multiple processes
------------------------------------------------
Although logging is thread-safe, and logging to a single file from multiple
threads in a single process {is} supported, logging to a single file from
{multiple processes} is {not} supported, because there is no standard way to
serialize access to a single file across multiple processes in Python. If you
need to log to a single file from multiple processes, the best way of doing
this is to have all the processes log to a SocketHandler, and have a
separate process which implements a socket server which reads from the socket
and logs to file. (If you prefer, you can dedicate one thread in one of the
existing processes to perform this function.) The following section documents
this approach in more detail and includes a working socket receiver which can
be used as a starting point for you to adapt in your own applications.
If you are using a recent version of Python which includes the
multiprocessing (|py2stdlib-multiprocessing|) module, you can write your own handler which uses the
Lock class from this module to serialize access to the file from
your processes. The existing FileHandler and subclasses do not make
use of multiprocessing (|py2stdlib-multiprocessing|) at present, though they may do so in the future.
Note that at present, the multiprocessing (|py2stdlib-multiprocessing|) module does not provide
working lock functionality on all platforms (see
http://bugs.python.org/issue3770).
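A handler of the kind described above could be sketched as follows. This is
only an illustration under stated assumptions: LockedFileHandler is a
hypothetical name, the lock must be created once and handed to every process
(for example as an argument to multiprocessing.Process), and each process is
assumed to create its own handler instance opened in append mode.

```python
import logging
import multiprocessing
import os
import tempfile


class LockedFileHandler(logging.FileHandler):
    """Hypothetical FileHandler that serializes writes across processes."""

    def __init__(self, filename, lock, mode="a"):
        logging.FileHandler.__init__(self, filename, mode)
        self.mp_lock = lock   # a multiprocessing.Lock shared by all writers

    def emit(self, record):
        # FileHandler.emit writes and flushes the record; holding the lock
        # keeps records from different processes from interleaving.
        with self.mp_lock:
            logging.FileHandler.emit(self, record)


shared_lock = multiprocessing.Lock()
log_path = os.path.join(tempfile.mkdtemp(), "shared.log")

handler = LockedFileHandler(log_path, shared_lock)
logger = logging.getLogger("demo.locked")
logger.propagate = False
logger.addHandler(handler)
logger.warning("written while holding the cross-process lock")
logger.removeHandler(handler)
handler.close()
```

The platform caveat noted above applies to this sketch as well, since it
depends entirely on the lock provided by multiprocessing.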
Sending and receiving logging events across a network
-----------------------------------------------------
Let's say you want to send logging events across a network, and handle them at
the receiving end. A simple way of doing this is attaching a
SocketHandler instance to the root logger at the sending end:: >
import logging, logging.handlers
rootLogger = logging.getLogger('')
rootLogger.setLevel(logging.DEBUG)
socketHandler = logging.handlers.SocketHandler('localhost',
logging.handlers.DEFAULT_TCP_LOGGING_PORT)
# don't bother with a formatter, since a socket handler sends the event as
# an unformatted pickle
rootLogger.addHandler(socketHandler)
# Now, we can log to the root logger, or any other logger. First the root...
logging.info('Jackdaws love my big sphinx of quartz.')
# Now, define a couple of other loggers which might represent areas in your
# application:
logger1 = logging.getLogger('myapp.area1')
logger2 = logging.getLogger('myapp.area2')
logger1.debug('Quick zephyrs blow, vexing daft Jim.')
logger1.info('How quickly daft jumping zebras vex.')
logger2.warning('Jail zesty vixen who grabbed pay from quack.')
logger2.error('The five boxing wizards jump quickly.')
<
At the receiving end, you can set up a receiver using the SocketServer (|py2stdlib-socketserver|)
module. Here is a basic working example:: >
    import cPickle
    import logging
    import logging.handlers
    import SocketServer
    import struct

    class LogRecordStreamHandler(SocketServer.StreamRequestHandler):
        """Handler for a streaming logging request.
        This basically logs the record using whatever logging policy is
        configured locally.
        """

        def handle(self):
            """
            Handle multiple requests - each expected to be a 4-byte length,
            followed by the LogRecord in pickle format. Logs the record
            according to whatever policy is configured locally.
            """
            while 1:
                chunk = self.connection.recv(4)
                if len(chunk) < 4:
                    break
                slen = struct.unpack(">L", chunk)[0]
                chunk = self.connection.recv(slen)
                while len(chunk) < slen:
                    chunk = chunk + self.connection.recv(slen - len(chunk))
                obj = self.unPickle(chunk)
                record = logging.makeLogRecord(obj)
                self.handleLogRecord(record)

        def unPickle(self, data):
            return cPickle.loads(data)

        def handleLogRecord(self, record):
            # if a name is specified, we use the named logger rather than the one
            # implied by the record.
            if self.server.logname is not None:
                name = self.server.logname
            else:
                name = record.name
            logger = logging.getLogger(name)
            # N.B. EVERY record gets logged. This is because Logger.handle
            # is normally called AFTER logger-level filtering. If you want
            # to do filtering, do it at the client end to save wasting
            # cycles and network bandwidth!
            logger.handle(record)

    class LogRecordSocketReceiver(SocketServer.ThreadingTCPServer):
        """simple TCP socket-based logging receiver suitable for testing.
        """
        allow_reuse_address = 1

        def __init__(self, host='localhost',
                     port=logging.handlers.DEFAULT_TCP_LOGGING_PORT,
                     handler=LogRecordStreamHandler):
            SocketServer.ThreadingTCPServer.__init__(self, (host, port), handler)
            self.abort = 0
            self.timeout = 1
            self.logname = None

        def serve_until_stopped(self):
            import select
            abort = 0
            while not abort:
                rd, wr, ex = select.select([self.socket.fileno()],
                                           [], [],
                                           self.timeout)
                if rd:
                    self.handle_request()
                abort = self.abort

    def main():
        logging.basicConfig(
            format="%(relativeCreated)5d %(name)-15s %(levelname)-8s %(message)s")
        tcpserver = LogRecordSocketReceiver()
        print "About to start TCP server..."
        tcpserver.serve_until_stopped()

    if __name__ == "__main__":
        main()
<
First run the server, and then the client. On the client side, nothing is
printed on the console; on the server side, you should see something like:: >
About to start TCP server...
59 root INFO Jackdaws love my big sphinx of quartz.
59 myapp.area1 DEBUG Quick zephyrs blow, vexing daft Jim.
69 myapp.area1 INFO How quickly daft jumping zebras vex.
69 myapp.area2 WARNING Jail zesty vixen who grabbed pay from quack.
69 myapp.area2 ERROR The five boxing wizards jump quickly.
<
Using arbitrary objects as messages
-----------------------------------
In the preceding sections and examples, it has been assumed that the message
passed when logging the event is a string. However, this is not the only
possibility. You can pass an arbitrary object as a message, and its
__str__ method will be called when the logging system needs to convert
it to a string representation. In fact, if you want to, you can avoid
computing a string representation altogether - for example, the
SocketHandler emits an event by pickling it and sending it over the
wire.
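A short sketch of passing an object as the message follows. LazyReport and
ListHandler are hypothetical names; the call counter exists only to show that
__str__ runs when a record is actually formatted and not before.

```python
import logging


class LazyReport(object):
    """Hypothetical message object; its text is built only on demand."""
    calls = 0

    def __init__(self, items):
        self.items = items

    def __str__(self):
        LazyReport.calls += 1
        return "report: " + ", ".join(sorted(self.items))


class ListHandler(logging.Handler):
    """Hypothetical handler that collects formatted output in a list."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


logger = logging.getLogger("demo.objects")
logger.propagate = False
logger.setLevel(logging.INFO)
handler = ListHandler()
logger.addHandler(handler)

logger.debug(LazyReport(["b", "a"]))   # below the threshold: never converted
calls_after_debug = LazyReport.calls
logger.info(LazyReport(["b", "a"]))    # converted when the record is formatted
```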
Optimization
------------
Formatting of message arguments is deferred until it cannot be avoided.
However, computing the arguments passed to the logging method can also be
expensive, and you may want to avoid doing it if the logger will just throw
away your event. To decide what to do, you can call the isEnabledFor
method which takes a level argument and returns true if the event would be
created by the Logger for that level of call. You can write code like this:: >
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("Message with %s, %s", expensive_func1(),
                                            expensive_func2())
<
so that if the logger's threshold is set above ``DEBUG``, the calls to
expensive_func1 and expensive_func2 are never made.
There are other optimizations which can be made for specific applications which
need more precise control over what logging information is collected. Here's a
list of things you can do to avoid processing during logging which you don't
need:
+-----------------------------------------------+----------------------------------------+
| What you don't want to collect | How to avoid collecting it |
+===============================================+========================================+
| Information about where calls were made from. | Set ``logging._srcfile`` to ``None``. |
+-----------------------------------------------+----------------------------------------+
| Threading information. | Set ``logging.logThreads`` to ``0``. |
+-----------------------------------------------+----------------------------------------+
| Process information. | Set ``logging.logProcesses`` to ``0``. |
+-----------------------------------------------+----------------------------------------+
Also note that the core logging module only includes the basic handlers. If
you don't import logging.handlers and logging.config, they won't
take up any memory.
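As a small illustration of the thread and process settings in the table above,
the sketch below switches both off, builds a record directly, and then
restores the previous values. The record name and the "example.py" path are
placeholders chosen for the example.

```python
import logging

saved = (logging.logThreads, logging.logProcesses)
logging.logThreads = 0      # do not collect thread information
logging.logProcesses = 0    # do not collect process information
try:
    record = logging.LogRecord("demo.opt", logging.INFO, "example.py", 1,
                               "cheap record", (), None)
finally:
    # These are module-level switches, so restore them for other code.
    logging.logThreads, logging.logProcesses = saved

thread_info = (record.thread, record.threadName)
process_info = record.process
```

With both switches off, the corresponding record attributes are left as
``None``, so format strings that refer to them will show ``None``.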
Handler Objects
---------------
Handlers have the following attributes and methods. Note that Handler
is never instantiated directly; this class acts as a base for more useful
subclasses. However, the __init__ method in subclasses needs to call
Handler.__init__.
Handler.__init__(level=NOTSET)~
Initializes the Handler instance by setting its level, setting the list
of filters to the empty list and creating a lock (using createLock) for
serializing access to an I/O mechanism.
Handler.createLock()~
Initializes a thread lock which can be used to serialize access to underlying
I/O functionality which may not be threadsafe.
Handler.acquire()~
Acquires the thread lock created with createLock.
Handler.release()~
Releases the thread lock acquired with acquire.
Handler.setLevel(lvl)~
Sets the threshold for this handler to {lvl}. Logging messages which are less
severe than {lvl} will be ignored. When a handler is created, the level is set
to NOTSET (which causes all messages to be processed).
Handler.setFormatter(form)~
Sets the Formatter for this handler to {form}.
Handler.addFilter(filt)~
Adds the specified filter {filt} to this handler.
Handler.removeFilter(filt)~
Removes the specified filter {filt} from this handler.
Handler.filter(record)~
Applies this handler's filters to the record and returns a true value if the
record is to be processed.
Handler.flush()~
Ensure all logging output has been flushed. This version does nothing and is
intended to be implemented by subclasses.
Handler.close()~
Tidy up any resources used by the handler. This version does no output but
removes the handler from an internal list of handlers which is closed when
shutdown is called. Subclasses should ensure that this gets called
from overridden close methods.
Handler.handle(record)~
Conditionally emits the specified logging record, depending on filters which may
have been added to the handler. Wraps the actual emission of the record with
acquisition/release of the I/O thread lock.
Handler.handleError(record)~
This method should be called from handlers when an exception is encountered
during an emit call. If the module-level variable raiseExceptions is false,
exceptions get silently ignored; otherwise a traceback is printed to
sys.stderr. Silently ignoring errors is what is mostly wanted for a logging
system in production - most users will not care about errors in the logging
system, they are more interested in application errors. You could, however,
replace this with a custom handler if you wish. The specified record is the
one which was being processed when the exception occurred.
Handler.format(record)~
Do formatting for a record - if a formatter is set, use it. Otherwise, use the
default formatter for the module.
Handler.emit(record)~
Do whatever it takes to actually log the specified logging record. This version
is intended to be implemented by subclasses and so raises a
NotImplementedError.
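Putting the methods above together, a minimal Handler subclass might look like
this sketch. ListHandler is a hypothetical name; the points being illustrated
are the call to Handler.__init__, the use of format inside emit, and the
routing of failures to handleError.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that keeps formatted records in a list."""

    def __init__(self, level=logging.NOTSET):
        # Sets the level, the empty filter list and the I/O lock.
        logging.Handler.__init__(self, level)
        self.lines = []

    def emit(self, record):
        try:
            self.lines.append(self.format(record))
        except Exception:
            self.handleError(record)


handler = ListHandler(level=logging.WARNING)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

logger = logging.getLogger("demo.handler")
logger.propagate = False
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)

logger.info("below the handler threshold, so it is ignored")
logger.error("kept")
```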
StreamHandler
^^^^^^^^^^^^^
The StreamHandler class, located in the core logging (|py2stdlib-logging|) package,
sends logging output to streams such as {sys.stdout}, {sys.stderr} or any
file-like object (or, more precisely, any object which supports write
and flush methods).
StreamHandler([stream])~
Returns a new instance of the StreamHandler class. If {stream} is
specified, the instance will use it for logging output; otherwise, {sys.stderr}
will be used.
emit(record)~
If a formatter is specified, it is used to format the record. The record
is then written to the stream with a trailing newline. If exception
information is present, it is formatted using
traceback.print_exception and appended to the stream.
flush()~
Flushes the stream by calling its flush method. Note that the
close method is inherited from Handler and so does
no output, so an explicit flush call may be needed at times.
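Since any object with write and flush methods will do, an in-memory stream is
a convenient way to try StreamHandler out. The sketch below assumes nothing
beyond the standard library; the import fallback lets it run on both
Python 2.7 and Python 3.

```python
import logging

try:
    from StringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO         # Python 3

stream = StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("demo.stream")
logger.propagate = False
logger.setLevel(logging.WARNING)
logger.addHandler(handler)

logger.warning("disk usage at %d%%", 91)
handler.flush()

# The record was written with a trailing newline.
captured = stream.getvalue()
```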
FileHandler
^^^^^^^^^^^
The FileHandler class, located in the core logging (|py2stdlib-logging|) package,
sends logging output to a disk file. It inherits the output functionality from
StreamHandler.
FileHandler(filename[, mode[, encoding[, delay]]])~
Returns a new instance of the FileHandler class. The specified file is
opened and used as the stream for logging. If {mode} is not specified,
'a' is used. If {encoding} is not {None}, it is used to open the file
with that encoding. If {delay} is true, then file opening is deferred until the
first call to emit. By default, the file grows indefinitely.
Changed in version 2.6: {delay} was added.
close()~
Closes the file.
emit(record)~
Outputs the record to the file.
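The effect of {delay} can be observed with a temporary file, as in this
sketch. The file name and logger name are placeholders for the example.

```python
import logging
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "app.log")

# With delay set, the file is not opened when the handler is created...
handler = logging.FileHandler(log_path, mode="a", delay=True)
exists_before = os.path.exists(log_path)

logger = logging.getLogger("demo.file")
logger.propagate = False
logger.setLevel(logging.INFO)
logger.addHandler(handler)

# ...but only when the first record is emitted.
logger.error("first record")
exists_after = os.path.exists(log_path)

logger.removeHandler(handler)
handler.close()
```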
NullHandler
^^^^^^^^^^^
New in version 2.7.
The NullHandler class, located in the core logging (|py2stdlib-logging|) package,
does not do any formatting or output. It is essentially a "no-op" handler
for use by library developers.
NullHandler()~
Returns a new instance of the NullHandler class.
emit(record)~
This method does nothing.
See the section on configuring logging for a library for more information
on how to use NullHandler.
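A typical use in library code might look like the sketch below, where "mylib"
and do_work are hypothetical names. The library attaches a NullHandler to its
own top-level logger and leaves all real configuration to the application.

```python
import logging

# In the library's top-level module:
lib_logger = logging.getLogger("mylib")
lib_logger.addHandler(logging.NullHandler())


def do_work():
    """Hypothetical library function that logs as a side effect."""
    lib_logger.info("doing work")
    return 42


# An application that has not configured logging can still call the library.
result = do_work()
```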
WatchedFileHandler
^^^^^^^^^^^^^^^^^^
New in version 2.6.
The WatchedFileHandler class, located in the logging.handlers
module, is a FileHandler which watches the file it is logging to. If
the file changes, it is closed and reopened using the file name.
A file change can happen because of usage of programs such as {newsyslog} and
{logrotate} which perform log file rotation. This handler, intended for use
under Unix/Linux, watches the file to see if it has changed since the last emit.
(A file is deemed to have changed if its device or inode have changed.) If the
file has changed, the old file stream is closed, and the file opened to get a
new stream.
This handler is not appropriate for use under Windows, because under Windows
open log files cannot be moved or renamed - logging opens the files with
exclusive locks - and so there is no need for such a handler. Furthermore,
{ST_INO} is not supported under Windows; stat (|py2stdlib-stat|) always returns zero for
this value.
WatchedFileHandler(filename[, mode[, encoding[, delay]]])~
Returns a new instance of the WatchedFileHandler class. The specified
file is opened and used as the stream for logging. If {mode} is not specified,
'a' is used. If {encoding} is not {None}, it is used to open the file
with that encoding. If {delay} is true, then file opening is deferred until the
first call to emit. By default, the file grows indefinitely.
Changed in version 2.6: {delay} was added.
emit(record)~
Outputs the record to the file, but first checks to see if the file has
changed. If it has, the existing stream is flushed and closed and the
file opened again, before outputting the record to the file.
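The reopening behaviour can be tried out on Unix/Linux with the following
sketch, in which os.rename stands in for an external tool such as {logrotate}.
The file and logger names are placeholders.

```python
import logging
import logging.handlers
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "watched.log")
handler = logging.handlers.WatchedFileHandler(log_path)

logger = logging.getLogger("demo.watched")
logger.propagate = False
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("before rotation")

# Stand-in for an external rotation tool moving the file away.
os.rename(log_path, log_path + ".old")

# The handler notices the change and reopens the file under its old name.
logger.info("after rotation")

logger.removeHandler(handler)
handler.close()
```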
RotatingFileHandler
^^^^^^^^^^^^^^^^^^^
The RotatingFileHandler class, located in the logging.handlers
module, supports rotation of disk log files.
RotatingFileHandler(filename[, mode[, maxBytes[, backupCount[, encoding[, delay]]]]])~
Returns a new instance of the RotatingFileHandler class. The specified
file is opened and used as the stream for logging. If {mode} is not specified,
``'a'`` is used. If {encoding} is not {None}, it is used to open the file
with that encoding. If {delay} is true, then file opening is deferred until the
first call to emit. By default, the file grows indefinitely.
You can use the {maxBytes} and {backupCount} values to allow the file to
roll over at a predetermined size. When the size is about to be exceeded,
the file is closed and a new file is silently opened for output. Rollover occurs
whenever the current log file is nearly {maxBytes} in length; if {maxBytes} is
zero, rollover never occurs. If {backupCount} is non-zero, the system will save
old log files by appending the extensions ".1", ".2" etc., to the filename. For
example, with a {backupCount} of 5 and a base file name of app.log, you
would get app.log, app.log.1, app.log.2, up to
app.log.5. The file being written to is always app.log. When
this file is filled, it is closed and renamed to app.log.1, and if files
app.log.1, app.log.2, etc. exist, then they are renamed to
app.log.2, app.log.3 etc. respectively.
Changed in version 2.6: {delay} was added.
doRollover()~
Does a rollover, as described above.
emit(record)~
Outputs the record to the file, catering for rollover as described
previously.
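The following sketch shows size-based rollover with a deliberately tiny
{maxBytes} so that several rollovers happen within a few records. The sizes,
file name and logger name are arbitrary values chosen for the example.

```python
import logging
import logging.handlers
import os
import tempfile

log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")

# Keep the current file plus at most two backups.
handler = logging.handlers.RotatingFileHandler(
    log_path, maxBytes=100, backupCount=2)

logger = logging.getLogger("demo.rotating")
logger.propagate = False
logger.setLevel(logging.INFO)
logger.addHandler(handler)

for i in range(20):
    logger.info("message number %02d", i)

logger.removeHandler(handler)
handler.close()

# Older backups beyond backupCount have been removed.
files = sorted(os.listdir(log_dir))
```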
TimedRotatingFileHandler
^^^^^^^^^^^^^^^^^^^^^^^^
The TimedRotatingFileHandler class, located in the
logging.handlers module, supports rotation of disk log files at certain
timed intervals.
TimedRotatingFileHandler(filename[, when[, interval[, backupCount[, encoding[, delay[, utc]]]]]])~
Returns a new instance of the TimedRotatingFileHandler class. The
specified file is opened and used as the stream for logging. On rotating it also
sets the filename suffix. Rotating happens based on the product of {when} and
{interval}.
You can use the {when} argument to specify the type of {interval}. The list of possible
values is below. Note that they are not case sensitive.
+----------------+-----------------------+
| Value | Type of interval |
+================+=======================+
| ``'S'`` | Seconds |
+----------------+-----------------------+
| ``'M'`` | Minutes |
+----------------+-----------------------+
| ``'H'`` | Hours |
+----------------+-----------------------+
| ``'D'`` | Days |
+----------------+-----------------------+
| ``'W'`` | Week day (0=Monday) |
+----------------+-----------------------+
| ``'midnight'`` | Roll over at midnight |
+----------------+-----------------------+
The system will save old log files by appending extensions to the filename.
The extensions are date-and-time based, using the strftime format
``%Y-%m-%d_%H-%M-%S`` or a leading portion thereof, depending on the
rollover interval.
When computing the next rollover time for the first time (when the handler
is created), the last modification time of an existing log file, or else
the current time, is used to compute when the next rotation will occur.
If the {utc} argument is true, times in UTC will be used; otherwise
local time is used.
If {backupCount} is nonzero, at most {backupCount} files
will be kept, and if more would be created when rollover occurs, the oldest
one is deleted. The deletion logic uses the interval to determine which
files to delete, so changing the interval may leave old files lying around.
If {delay} is true, then file opening is deferred until the first call to
emit.
Changed in version 2.6: {delay} was added.
doRollover()~
Does a rollover, as described above.
emit(record)~
Outputs the record to the file, catering for rollover as described above.
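To see how the filename extension depends on {when}, the sketch below creates
one handler per interval type and reads its ``suffix`` attribute. Relying on
that attribute is an assumption about the handler's implementation, so check
it against the version you use. The weekday form is written with the day
number appended, ``'W0'`` to ``'W6'``; {delay} is set so that no file is
opened.

```python
import logging.handlers
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "timed.log")

suffixes = {}
for when in ("S", "M", "H", "D", "midnight", "W0"):
    handler = logging.handlers.TimedRotatingFileHandler(
        log_path, when=when, interval=1, backupCount=7, delay=True)
    # strftime format appended to the file name when a rollover happens.
    suffixes[when] = handler.suffix
    handler.close()
```

Shorter intervals keep more of the ``%Y-%m-%d_%H-%M-%S`` format, while daily,
midnight and weekday rotation keep only the date portion.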
SocketHandler
^^^^^^^^^^^^^
The SocketHandler class, located in the logging.handlers module,
sends logging output to a network socket. The base class uses a TCP socket.
SocketHandler(host, port)~
Returns a new instance of the SocketHandler class intended to
communicate with a remote machine whose address is given by {host} and {port}.
close()~
Closes the socket.
emit()~
Pickles the record's attribute dictionary and writes it to the socket in
binary format. If there is an error with the socket, silently drops the
packet. If the connection was previously lost, re-establishes the
connection. To unpickle the record at the receiving end into a
LogRecord, use the makeLogRecord function.
handleError()~
Handles an error which has occurred during emit. The most likely
cause is a lost connection. Closes the socket so that we can retry on the
next event.
makeSocket()~
This is a factory method which allows subclasses to define the precise
type of socket they want. The default implementation creates a TCP socket
(socket.SOCK_STREAM).
makePickle(record)~
Pickles the record's attribute dictionary in binary format with a length
prefix, and returns it ready for transmission across the socket.
Note that pickles aren't completely secure. If you are concerned about
security, you may want to override this method to implement a more secure
mechanism. For example, you can sign pickles using HMAC and then verify
them on the receiving end, or alternatively you can disable unpickling of
global objects on the receiving end.
send(packet)~
Send a pickled string {packet} to the socket. This function allows for
partial sends which can happen when the network is busy.
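The HMAC suggestion above could be sketched as follows. Everything here is an
assumption made for illustration: SignedSocketHandler, verify_and_load and
SECRET_KEY are hypothetical names, and the frame layout (length prefix, then
digest, then pickle) is invented for this example, so the receiver shown
earlier would need matching changes. hmac.compare_digest requires
Python 2.7.7 or later.

```python
import hashlib
import hmac
import logging
import logging.handlers
import pickle
import struct

SECRET_KEY = b"hypothetical-shared-secret"   # placeholder; keep real keys private
DIGEST_SIZE = hashlib.sha256().digest_size


class SignedSocketHandler(logging.handlers.SocketHandler):
    """Hypothetical handler that sends length + HMAC digest + pickle."""

    def makePickle(self, record):
        framed = logging.handlers.SocketHandler.makePickle(self, record)
        payload = framed[4:]                 # drop the 4-byte length prefix
        digest = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
        body = digest + payload
        return struct.pack(">L", len(body)) + body


def verify_and_load(body, key=SECRET_KEY):
    """Check the digest before unpickling; raise ValueError on mismatch."""
    digest, payload = body[:DIGEST_SIZE], body[DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(digest, expected):
        raise ValueError("bad signature; refusing to unpickle")
    return logging.makeLogRecord(pickle.loads(payload))


# No connection is made until a record is actually sent.
handler = SignedSocketHandler("localhost",
                              logging.handlers.DEFAULT_TCP_LOGGING_PORT)
record = logging.makeLogRecord(
    {"name": "demo.signed", "msg": "signed %s", "args": ("event",)})

frame = handler.makePickle(record)
(length,) = struct.unpack(">L", frame[:4])
received = verify_and_load(frame[4:4 + length])
handler.close()
```

Verifying the digest before calling pickle.loads is the point of the design:
data that was not produced with the shared key is rejected without ever being
unpickled.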
DatagramHandler
^^^^^^^^^^^^^^^
The DatagramHandler class, located in the logging.handlers
module, inherits from SocketHandler to support sending logging messages
over UDP sockets.
DatagramHandler(host, port)~
Returns a new instance of the DatagramHandler class intended to
communicate with a remote machine whose address is given by {host} and {port}.
emit()~
Pickles the record's attribute dictionary and writes it to the socket in
binary format. If there is an error with the socket, silently drops the
packet. To unpickle the record at the receiving end into a
LogRecord, use the makeLogRecord function.
makeSocket()~
The factory method of SocketHandler is here overridden to create
a UDP socket (socket.SOCK_DGRAM).
send(s)~
Send a pickled string to a socket.
SysLogHandler
^^^^^^^^^^^^^
The SysLogHandler class, located in the logging.handlers module,
supports sending logging messages to a remote or local Unix syslog.
SysLogHandler([address[, facility[, socktype]]])~
Returns a new instance of the SysLogHandler class intended to
communicate with a remote Unix machine whose address is given by {address} in
the form of a ``(host, port)`` tuple. If {address} is not specified,
``('localhost', 514)`` is used. The address is used to open a socket. An
alternative to providing a ``(host, port)`` tuple is providing an address as a
string, for example "/dev/log". In this case, a Unix domain socket is used to
send the message to the syslog. If {facility} is not specified,
LOG_USER is used. The type of socket opened depends on the
{socktype} argument, which defaults to socket.SOCK_DGRAM and thus
opens a UDP socket. To open a TCP socket (for use with the newer syslog
daemons such as rsyslog), specify a value of socket.SOCK_STREAM.
Changed in version 2.7: {socktype} was added.
close()~
Closes the socket to the remote host.
emit(record)~
The record is formatted, and then sent to the syslog server. If exception
information is present, it is {not} sent to the server.
encodePriority(facility, priority)~
Encodes the facility and priority into an integer. You can pass in strings
or integers - if strings are passed, internal mapping dictionaries are
used to convert them to integers.
The symbolic ``LOG_`` values are defined in SysLogHandler and
mirror the values defined in the ``sys/syslog.h`` header file.
{Priorities}
+--------------------------+---------------+
| Name (string) | Symbolic value|
+==========================+===============+
| ``alert`` | LOG_ALERT |
+--------------------------+---------------+
| ``crit`` or ``critical`` | LOG_CRIT |
+--------------------------+---------------+
| ``debug`` | LOG_DEBUG |
+--------------------------+---------------+
| ``emerg`` or ``panic`` | LOG_EMERG |
+--------------------------+---------------+
| ``err`` or ``error`` | LOG_ERR |
+--------------------------+---------------+
| ``info`` | LOG_INFO |
+--------------------------+---------------+
| ``notice`` | LOG_NOTICE |
+--------------------------+---------------+
| ``warn`` or ``warning`` | LOG_WARNING |
+--------------------------+---------------+
{Facilities}
+---------------+---------------+
| Name (string) | Symbolic value|
+===============+===============+
| ``auth`` | LOG_AUTH |
+---------------+---------------+
| ``authpriv`` | LOG_AUTHPRIV |
+---------------+---------------+
| ``cron`` | LOG_CRON |
+---------------+---------------+
| ``daemon`` | LOG_DAEMON |
+---------------+---------------+
| ``ftp`` | LOG_FTP |
+---------------+---------------+
| ``kern`` | LOG_KERN |
+---------------+---------------+
| ``lpr`` | LOG_LPR |
+---------------+---------------+
| ``mail`` | LOG_MAIL |
+---------------+---------------+
| ``news`` | LOG_NEWS |
+---------------+---------------+
| ``syslog`` | LOG_SYSLOG |
+---------------+---------------+
| ``user`` | LOG_USER |
+---------------+---------------+
| ``uucp`` | LOG_UUCP |
+---------------+---------------+
| ``local0`` | LOG_LOCAL0 |
+---------------+---------------+
| ``local1`` | LOG_LOCAL1 |
+---------------+---------------+
| ``local2`` | LOG_LOCAL2 |
+---------------+---------------+
| ``local3`` | LOG_LOCAL3 |
+---------------+---------------+
| ``local4`` | LOG_LOCAL4 |
+---------------+---------------+
| ``local5`` | LOG_LOCAL5 |
+---------------+---------------+
| ``local6`` | LOG_LOCAL6 |
+---------------+---------------+
| ``local7`` | LOG_LOCAL7 |
+---------------+---------------+
mapPriority(levelname)~
Maps a logging level name to a syslog priority name.
You may need to override this if you are using custom levels, or
if the default algorithm is not suitable for your needs. The
default algorithm maps ``DEBUG``, ``INFO``, ``WARNING``, ``ERROR`` and
``CRITICAL`` to the equivalent syslog names, and all other level
names to "warning".
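The two helper methods can be tried without a running syslog daemon, because
the default UDP socket does not need a listener. In this sketch the numeric
address is used to avoid a name lookup, and the level name "TRACE" is a
made-up custom level.

```python
import logging.handlers

handler = logging.handlers.SysLogHandler(address=("127.0.0.1", 514))
try:
    # Strings are looked up in the handler's mapping dictionaries.
    by_name = handler.encodePriority("user", "info")

    # The symbolic LOG_ values can be passed directly as integers.
    by_value = handler.encodePriority(handler.LOG_LOCAL0, handler.LOG_ERR)

    mapped = handler.mapPriority("WARNING")
    fallback = handler.mapPriority("TRACE")   # unknown names map to "warning"
finally:
    handler.close()
```

The encoded value is the facility shifted left by three bits, combined with
the priority, which is why ``user`` and ``info`` give 14.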
NTEventLogHandler
^^^^^^^^^^^^^^^^^
The NTEventLogHandler class, located in the logging.handlers
module, supports sending logging messages to a local Windows NT, Windows 2000 or
Windows XP event log. Before you can use it, you need Mark Hammond's Win32
extensions for Python installed.
NTEventLogHandler(appname[, dllname[, logtype]])~
Returns a new instance of the NTEventLogHandler class. The {appname} is
used to define the application name as it appears in the event log. An
appropriate registry entry is created using this name. The {dllname} should give
the fully qualified pathname of a .dll or .exe which contains message
definitions to hold in the log. If it is not specified, ``'win32service.pyd'``
is used; this is installed with the Win32 extensions and contains some basic
placeholder message definitions. Note that use of these placeholders will make
your event logs big, as the entire message source is held in the log. If you
want slimmer logs, you have to pass in the name of your own .dll or .exe which
contains the message definitions you want to use in the event log. The
{logtype} is one of ``'Application'``, ``'System'`` or ``'Security'``, and
defaults to ``'Application'``.
close()~
At this point, you can remove the application name from the registry as a
source of event log entries. However, if you do this, you will not be able
to see the events as you intended in the Event Log Viewer - it needs to be
able to access the registry to get the .dll name. The current version does
not do this.
emit(record)~
Determines the message ID, event category and event type, and then logs
the message in the NT event log.
getEventCategory(record)~
Returns the event category for the record. Override this if you want to
specify your own categories. This version returns 0.
getEventType(record)~
Returns the event type for the record. Override this if you want to
specify your own types. This version does a mapping using the handler's
typemap attribute, which is set up in __init__ to a dictionary
which contains mappings for DEBUG, INFO,
WARNING, ERROR and CRITICAL. If you are using
your own levels, you will either need to override this method or place a
suitable dictionary in the handler's {typemap} attribute.
getMessageID(record)~
Returns the message ID for the record. If you are using your own messages,
you could do this by having the {msg} passed to the logger being an ID
rather than a format string. Then, in here, you could use a dictionary
lookup to get the message ID. This version returns 1, which is the base
message ID in win32service.pyd.
SMTPHandler
^^^^^^^^^^^
The SMTPHandler class, located in the logging.handlers module,
supports sending logging messages to an email address via SMTP.
SMTPHandler(mailhost, fromaddr, toaddrs, subject[, credentials])~
Returns a new instance of the SMTPHandler class. The instance is
initialized with the from and to addresses and subject line of the email. The
{toaddrs} should be a list of strings. To specify a non-standard SMTP port, use
the (host, port) tuple format for the {mailhost} argument. If you use a string,
the standard SMTP port is used. If your SMTP server requires authentication, you
can specify a (username, password) tuple for the {credentials} argument.
.. versionchanged:: 2.6
{credentials} was added.
emit(record)~
Formats the record and sends it to the specified addressees.
getSubject(record)~
If you want to specify a subject line which is record-dependent, override
this method.
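A record-dependent subject line could be sketched as follows. The subclass
name, host, port and addresses are hypothetical and chosen only for
illustration. Constructing the handler does not contact the mail server; a
connection is only attempted when a record is emitted.

```python
import logging
import logging.handlers


class LevelSubjectSMTPHandler(logging.handlers.SMTPHandler):
    """Hypothetical subclass: prefix the subject with the record's level."""

    def getSubject(self, record):
        # self.subject holds the subject string passed to the constructor.
        return '[%s] %s' % (record.levelname, self.subject)


smtp_handler = LevelSubjectSMTPHandler(
    mailhost=('localhost', 2525),        # (host, port) tuple: non-standard port
    fromaddr='my_app@domain.tld',
    toaddrs=['support_team@domain.tld'],
    subject='Application problem')

# Build a record by hand so that no logger (and no mail server) is involved.
smtp_record = logging.LogRecord('app', logging.ERROR, 'app.py', 1,
                                'disk %s is full', ('sda',), None)
smtp_subject = smtp_handler.getSubject(smtp_record)
```

Only getSubject is exercised here; emit would still need a reachable SMTP
server.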
MemoryHandler
^^^^^^^^^^^^^
The MemoryHandler class, located in the logging.handlers module,
supports buffering of logging records in memory, periodically flushing them to a
target handler. Flushing occurs whenever the buffer is full, or when an
event of a certain severity or greater is seen.
MemoryHandler is a subclass of the more general
BufferingHandler, which is an abstract class. This buffers logging
records in memory. Whenever a record is added to the buffer, a check is made
by calling shouldFlush to see if the buffer should be flushed. If it
should, then flush is expected to do the flushing.
BufferingHandler(capacity)~
Initializes the handler with a buffer of the specified capacity.
emit(record)~
Appends the record to the buffer. If shouldFlush returns true,
calls flush to process the buffer.
flush()~
You can override this to implement custom flushing behavior. This version
just zaps the buffer to empty.
shouldFlush(record)~
Returns true if the buffer is up to capacity. This method can be
overridden to implement custom flushing strategies.
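As a minimal sketch of custom flushing behavior, the hypothetical subclass
below hands each full buffer over as one batch instead of discarding it. The
class name, the {batches} attribute and the logger name are assumptions made
for this example.

```python
import logging
import logging.handlers


class BatchHandler(logging.handlers.BufferingHandler):
    """Hypothetical subclass: keep each flushed buffer as one batch."""

    def __init__(self, capacity):
        logging.handlers.BufferingHandler.__init__(self, capacity)
        self.batches = []

    def flush(self):
        # Hold the handler lock while the buffer is swapped out.
        self.acquire()
        try:
            if self.buffer:
                self.batches.append([r.getMessage() for r in self.buffer])
                self.buffer = []
        finally:
            self.release()


batch_handler = BatchHandler(capacity=3)
batch_log = logging.getLogger('demo.buffering')
batch_log.setLevel(logging.INFO)
batch_log.propagate = False
batch_log.addHandler(batch_handler)

# The third record fills the buffer and triggers flush; the fourth stays
# buffered until the buffer fills again or flush is called explicitly.
for n in range(4):
    batch_log.info('event %d', n)
```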
MemoryHandler(capacity[, flushLevel [, target]])~
Returns a new instance of the MemoryHandler class. The instance is
initialized with a buffer size of {capacity}. If {flushLevel} is not specified,
ERROR is used. If no {target} is specified, the target will need to be
set using setTarget before this handler does anything useful.
close()~
Calls flush, sets the target to None and clears the
buffer.
flush()~
For a MemoryHandler, flushing means just sending the buffered
records to the target, if there is one. Override if you want different
behavior.
setTarget(target)~
Sets the target handler for this handler.
shouldFlush(record)~
Checks for buffer full or a record at the {flushLevel} or higher.
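The buffering behavior described above can be sketched like this. ListHandler
is a hypothetical stand-in for a real target such as a file or SMTP handler,
and the logger name is arbitrary.

```python
import logging
import logging.handlers


class ListHandler(logging.Handler):
    """Hypothetical target handler that stores messages in a list."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


mem_target = ListHandler()
memory_handler = logging.handlers.MemoryHandler(
    capacity=100, flushLevel=logging.ERROR, target=mem_target)

mem_log = logging.getLogger('demo.memory')
mem_log.setLevel(logging.DEBUG)
mem_log.propagate = False
mem_log.addHandler(memory_handler)

mem_log.debug('step 1')
mem_log.debug('step 2')
before_error = list(mem_target.messages)   # nothing forwarded yet

mem_log.error('step 3 failed')             # reaches flushLevel: buffer is flushed
after_error = list(mem_target.messages)
```

With this arrangement the debug records cost little until an error occurs, at
which point the target receives the full lead-up to the failure.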
HTTPHandler
^^^^^^^^^^^
The HTTPHandler class, located in the logging.handlers module,
supports sending logging messages to a Web server, using either ``GET`` or
``POST`` semantics.
HTTPHandler(host, url[, method])~
Returns a new instance of the HTTPHandler class. The instance is
initialized with a host address, url and HTTP method. The {host} can be of the
form ``host:port``, should you need to use a specific port number. If no
{method} is specified, ``GET`` is used.
emit(record)~
Sends the record to the Web server as a URL-encoded dictionary.
Formatter Objects
-----------------
.. currentmodule:: logging
Formatters have the following attributes and methods. They are
responsible for converting a LogRecord to (usually) a string which can
be interpreted by either a human or an external system. The base
Formatter allows a formatting string to be specified. If none is
supplied, the default value of ``'%(message)s'`` is used.
A Formatter can be initialized with a format string which makes use of knowledge
of the LogRecord attributes - such as the default value mentioned above
making use of the fact that the user's message and arguments are pre-formatted
into a LogRecord's {message} attribute. This format string contains
standard Python %-style mapping keys. See section string-formatting
for more information on string formatting.
Currently, the useful mapping keys in a LogRecord are:
+-------------------------+-----------------------------------------------+
| Format | Description |
+=========================+===============================================+
| ``%(name)s`` | Name of the logger (logging channel). |
+-------------------------+-----------------------------------------------+
| ``%(levelno)s`` | Numeric logging level for the message |
| | (DEBUG, INFO, |
| | WARNING, ERROR, |
| | CRITICAL). |
+-------------------------+-----------------------------------------------+
| ``%(levelname)s`` | Text logging level for the message |
| | (``'DEBUG'``, ``'INFO'``, ``'WARNING'``, |
| | ``'ERROR'``, ``'CRITICAL'``). |
+-------------------------+-----------------------------------------------+
| ``%(pathname)s`` | Full pathname of the source file where the |
| | logging call was issued (if available). |
+-------------------------+-----------------------------------------------+
| ``%(filename)s`` | Filename portion of pathname. |
+-------------------------+-----------------------------------------------+
| ``%(module)s`` | Module (name portion of filename). |
+-------------------------+-----------------------------------------------+
| ``%(funcName)s`` | Name of function containing the logging call. |
+-------------------------+-----------------------------------------------+
| ``%(lineno)d`` | Source line number where the logging call was |
| | issued (if available). |
+-------------------------+-----------------------------------------------+
| ``%(created)f`` | Time when the LogRecord was created |
| | (as returned by time.time). |
+-------------------------+-----------------------------------------------+
| ``%(relativeCreated)d`` | Time in milliseconds when the LogRecord was |
| | created, relative to the time the logging |
| | module was loaded. |
+-------------------------+-----------------------------------------------+
| ``%(asctime)s`` | Human-readable time when the |
| | LogRecord was created. By default |
| | this is of the form "2003-07-08 16:49:45,896" |
| | (the numbers after the comma are millisecond |
| | portion of the time). |
+-------------------------+-----------------------------------------------+
| ``%(msecs)d`` | Millisecond portion of the time when the |
| | LogRecord was created. |
+-------------------------+-----------------------------------------------+
| ``%(thread)d`` | Thread ID (if available). |
+-------------------------+-----------------------------------------------+
| ``%(threadName)s`` | Thread name (if available). |
+-------------------------+-----------------------------------------------+
| ``%(process)d`` | Process ID (if available). |
+-------------------------+-----------------------------------------------+
| ``%(message)s`` | The logged message, computed as ``msg % |
| | args``. |
+-------------------------+-----------------------------------------------+
.. versionchanged:: 2.5
{funcName} was added.
Formatter([fmt[, datefmt]])~
Returns a new instance of the Formatter class. The instance is
initialized with a format string for the message as a whole, as well as a format
string for the date/time portion of a message. If no {fmt} is specified,
``'%(message)s'`` is used. If no {datefmt} is specified, the ISO8601 date format
is used.
format(record)~
The record's attribute dictionary is used as the operand to a string
formatting operation. Returns the resulting string. Before formatting the
dictionary, a couple of preparatory steps are carried out. The {message}
attribute of the record is computed using {msg} % {args}. If the
formatting string contains ``'(asctime)'``, formatTime is called
to format the event time. If there is exception information, it is
formatted using formatException and appended to the message. Note
that the formatted exception information is cached in attribute
{exc_text}. This is useful because the exception information can be
pickled and sent across the wire, but you should be careful if you have
more than one Formatter subclass which customizes the formatting
of exception information. In this case, you will have to clear the cached
value after a formatter has done its formatting, so that the next
formatter to handle the event doesn't use the cached value but
recalculates it afresh.
formatTime(record[, datefmt])~
This method should be called from format by a formatter which
wants to make use of a formatted time. This method can be overridden in
formatters to provide for any specific requirement, but the basic behavior
is as follows: if {datefmt} (a string) is specified, it is used with
time.strftime to format the creation time of the
record. Otherwise, the ISO8601 format is used. The resulting string is
returned.
formatException(exc_info)~
Formats the specified exception information (a standard exception tuple as
returned by sys.exc_info) as a string. This default implementation
just uses traceback.print_exception. The resulting string is
returned.
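A short sketch of a Formatter in use, assuming a hand-built LogRecord so that
no logger or handler is needed. The logger name, file name and message are
made up for the example.

```python
import logging

fmt_formatter = logging.Formatter(
    fmt='%(asctime)s %(levelname)-8s %(name)s: %(message)s',
    datefmt='%Y-%m-%d')          # passed to time.strftime by formatTime

fmt_record = logging.LogRecord('app.db', logging.WARNING, 'db.py', 42,
                               'retrying %s (attempt %d)', ('connect', 2),
                               None)

# format() computes record.message from msg % args, then applies the
# format string to the record's attribute dictionary.
fmt_line = fmt_formatter.format(fmt_record)
```

The resulting line starts with the creation date of the record, followed by
the level name padded to eight characters, the logger name and the merged
message.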
Filter Objects
--------------
Filters can be used by Handlers and Loggers for
more sophisticated filtering than is provided by levels. The base filter class
only allows events which are below a certain point in the logger hierarchy. For
example, a filter initialized with "A.B" will allow events logged by loggers
"A.B", "A.B.C", "A.B.C.D", "A.B.D" etc. but not "A.BB", "B.A.B" etc. If
initialized with the empty string, all events are passed.
Filter([name])~
Returns an instance of the Filter class. If {name} is specified, it
names a logger which, together with its children, will have its events allowed
through the filter. If no name is specified, allows every event.
filter(record)~
Is the specified record to be logged? Returns zero for no, nonzero for
yes. If deemed appropriate, the record may be modified in-place by this
method.
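The hierarchy rule of the base filter class can be checked directly, as in
this sketch. The helper make_record and the subclass NoPasswordFilter are
hypothetical names introduced for the example.

```python
import logging


def make_record(name, msg='msg'):
    """Hypothetical helper: build a minimal record for the given logger name."""
    return logging.LogRecord(name, logging.INFO, 'example.py', 1, msg, (), None)


ab_filter = logging.Filter('A.B')
candidate_names = ('A.B', 'A.B.C', 'A.B.C.D', 'A.B.D', 'A.BB', 'B.A.B')
allowed_names = [name for name in candidate_names
                 if ab_filter.filter(make_record(name))]


class NoPasswordFilter(logging.Filter):
    """Hypothetical subclass: reject any record whose message mentions a password."""

    def filter(self, record):
        return 'password' not in record.getMessage()


password_filter = NoPasswordFilter()
kept = password_filter.filter(make_record('app', 'user logged in'))
dropped = password_filter.filter(make_record('app', 'password=hunter2'))
```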
LogRecord Objects
-----------------
LogRecord instances are created every time something is logged. They
contain all the information pertinent to the event being logged. The main
information passed in is in msg and args, which are combined using msg % args to
create the message field of the record. The record also includes information
such as when the record was created, the source line where the logging call was
made, and any exception information to be logged.
LogRecord(name, lvl, pathname, lineno, msg, args, exc_info [, func])~
Returns an instance of LogRecord initialized with interesting
information. The {name} is the logger name; {lvl} is the numeric level;
{pathname} is the absolute pathname of the source file in which the logging
call was made; {lineno} is the line number in that file where the logging
call is found; {msg} is the user-supplied message (a format string); {args}
is the tuple which, together with {msg}, makes up the user message; and
{exc_info} is the exception tuple obtained by calling sys.exc_info
(or None, if no exception information is available). The {func} is
the name of the function from which the logging call was made. If not
specified, it defaults to ``None``.
.. versionchanged:: 2.5
{func} was added.
getMessage()~
Returns the message for this LogRecord instance after merging any
user-supplied arguments with the message.
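Records are normally created by the logging calls themselves, but building
one by hand shows what it carries. This is a sketch with made-up values; the
arguments are passed positionally in the order given in the signature above.

```python
import logging
import sys

try:
    1 / 0
except ZeroDivisionError:
    lr_exc_info = sys.exc_info()     # standard (type, value, traceback) tuple

lr_record = logging.LogRecord('app.math', logging.ERROR, 'calc.py', 7,
                              'cannot divide %d by %d', (1, 0), lr_exc_info)

lr_message = lr_record.getMessage()  # merges msg with args
```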
LoggerAdapter Objects
---------------------
.. versionadded:: 2.6
LoggerAdapter instances are used to conveniently pass contextual
information into logging calls. For a usage example, see the section on
adding contextual information to your logging output.
LoggerAdapter(logger, extra)~
Returns an instance of LoggerAdapter initialized with an
underlying Logger instance and a dict-like object.
process(msg, kwargs)~
Modifies the message and/or keyword arguments passed to a logging call in
order to insert contextual information. This implementation takes the object
passed as {extra} to the constructor and adds it to {kwargs} using key
'extra'. The return value is a ({msg}, {kwargs}) tuple which has the
(possibly modified) versions of the arguments passed in.
In addition to the above, LoggerAdapter supports all the logging
methods of Logger, i.e. debug, info, warning,
error, exception, critical and log. These
methods have the same signatures as their counterparts in Logger, so
you can use the two types of instances interchangeably.
.. versionchanged:: 2.7
The isEnabledFor method was added to LoggerAdapter. This method
delegates to the underlying logger.
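A minimal sketch of the default process behavior: the dict passed as {extra}
ends up as attributes on each record, so a format string can refer to them.
The key conn_id, the logger name and the LineHandler class are assumptions
made for this example.

```python
import logging


class LineHandler(logging.Handler):
    """Hypothetical handler that stores formatted lines in a list."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


adapter_handler = LineHandler()
adapter_handler.setFormatter(logging.Formatter('%(conn_id)s %(message)s'))

adapter_base = logging.getLogger('demo.adapter')
adapter_base.setLevel(logging.INFO)
adapter_base.propagate = False
adapter_base.addHandler(adapter_handler)

adapter = logging.LoggerAdapter(adapter_base, {'conn_id': 'c42'})
adapter.info('opened from %s', '10.0.0.1')

# process() can also be called directly to see what it returns.
processed = adapter.process('hello', {})
```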
Thread Safety
-------------
The logging module is intended to be thread-safe without any special work
needing to be done by its clients. It achieves this through using threading
locks; there is one lock to serialize access to the module's shared data, and
each handler also creates a lock to serialize access to its underlying I/O.
If you are implementing asynchronous signal handlers using the signal (|py2stdlib-signal|)
module, you may not be able to use logging from within such handlers. This is
because lock implementations in the threading (|py2stdlib-threading|) module are not always
re-entrant, and so cannot be invoked from such signal handlers.
Integration with the warnings module
------------------------------------
The captureWarnings function can be used to integrate logging (|py2stdlib-logging|)
with the warnings (|py2stdlib-warnings|) module.
captureWarnings(capture)~
This function is used to turn the capture of warnings by logging on and
off.
If {capture} is ``True``, warnings issued by the warnings (|py2stdlib-warnings|) module
will be redirected to the logging system. Specifically, a warning will be
formatted using warnings.formatwarning and the resulting string
logged to a logger named "py.warnings" with a severity of ``WARNING``.
If {capture} is ``False``, the redirection of warnings to the logging system
will stop, and warnings will be redirected to their original destinations
(i.e. those in effect before ``captureWarnings(True)`` was called).
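The redirection could be exercised as in the following sketch. The function
name, the CollectingHandler class and the warning text are hypothetical; the
"always" filter is set only so that the warning is not suppressed as a
duplicate.

```python
import logging
import warnings


class CollectingHandler(logging.Handler):
    """Hypothetical handler that collects messages in a list."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def capture_demo():
    """Issue one warning while capture is on and return what was logged."""
    handler = CollectingHandler()
    warn_log = logging.getLogger('py.warnings')
    warn_log.addHandler(handler)
    try:
        with warnings.catch_warnings():
            warnings.simplefilter('always')
            logging.captureWarnings(True)
            try:
                warnings.warn('old_api() is deprecated', DeprecationWarning)
            finally:
                logging.captureWarnings(False)
    finally:
        warn_log.removeHandler(handler)
    return handler.messages


captured_warnings = capture_demo()
```

Each captured entry is the text produced by warnings.formatwarning, so it
contains the file name, line number and category as well as the message.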
Configuration
-------------
Configuration functions
^^^^^^^^^^^^^^^^^^^^^^^
The following functions configure the logging module. They are located in the
logging.config module. Their use is optional --- you can configure the
logging module using these functions or by making calls to the main API (defined
in logging (|py2stdlib-logging|) itself) and defining handlers which are declared either in
logging (|py2stdlib-logging|) or logging.handlers.
dictConfig(config)~
Takes the logging configuration from a dictionary. The contents of
this dictionary are described in logging-config-dictschema
below.
If an error is encountered during configuration, this function will
raise a ValueError, TypeError, AttributeError
or ImportError with a suitably descriptive message. The
following is a (possibly incomplete) list of conditions which will
raise an error:
* A ``level`` which is not a string or which is a string not
corresponding to an actual logging level.
* A ``propagate`` value which is not a boolean.
* An id which does not have a corresponding destination.
* A non-existent handler id found during an incremental call.
* An invalid logger name.
* Inability to resolve to an internal or external object.
Parsing is performed by the DictConfigurator class, whose
constructor is passed the dictionary used for configuration, and
has a configure method. The logging.config module
has a callable attribute dictConfigClass
which is initially set to DictConfigurator.
You can replace the value of dictConfigClass with a
suitable implementation of your own.
dictConfig calls dictConfigClass passing
the specified dictionary, and then calls the configure method on
the returned object to put the configuration into effect:: >
    def dictConfig(config):
        dictConfigClass(config).configure()
<
For example, a subclass of DictConfigurator could call
``DictConfigurator.__init__()`` in its own __init__(), then
set up custom prefixes which would be usable in the subsequent
configure call. dictConfigClass would be bound to
this new subclass, and then dictConfig could be called exactly as
in the default, uncustomized state.
fileConfig(fname[, defaults])~
Reads the logging configuration from a ConfigParser (|py2stdlib-configparser|)-format file named
{fname}. This function can be called several times from an application,
allowing an end user to select from various pre-canned
configurations (if the developer provides a mechanism to present the choices
and load the chosen configuration). Defaults to be passed to the ConfigParser
can be specified in the {defaults} argument.
listen([port])~
Starts up a socket server on the specified port, and listens for new
configurations. If no port is specified, the module's default
DEFAULT_LOGGING_CONFIG_PORT is used. Logging configurations will be
sent as a file suitable for processing by fileConfig. Returns a
Thread instance on which you can call start to start the
server, and which you can join when appropriate. To stop the server,
call stopListening.
To send a configuration to the socket, read in the configuration file and
send it to the socket as a string of bytes preceded by a four-byte length
string packed in binary using ``struct.pack('>L', n)``.
stopListening()~
Stops the listening server which was created with a call to listen.
This is typically called before calling join on the return value from
listen.
Configuration dictionary schema
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Describing a logging configuration requires listing the various
objects to create and the connections between them; for example, you
may create a handler named "console" and then say that the logger
named "startup" will send its messages to the "console" handler.
These objects aren't limited to those provided by the logging (|py2stdlib-logging|)
module because you might write your own formatter or handler class.
The parameters to these classes may also need to include external
objects such as ``sys.stderr``. The syntax for describing these
objects and connections is defined in logging-config-dict-connections
below.
Dictionary Schema Details
"""""""""""""""""""""""""
The dictionary passed to dictConfig must contain the following
keys:
* `version` - to be set to an integer value representing the schema
version. The only valid value at present is 1, but having this key
allows the schema to evolve while still preserving backwards
compatibility.
All other keys are optional, but if present they will be interpreted
as described below. In all cases below where a 'configuring dict' is
mentioned, it will be checked for the special ``'()'`` key to see if a
custom instantiation is required. If so, the mechanism described in
logging-config-dict-userdef below is used to create an instance;
otherwise, the context is used to determine what to instantiate.
* `formatters` - the corresponding value will be a dict in which each
key is a formatter id and each value is a dict describing how to
configure the corresponding Formatter instance.
The configuring dict is searched for keys ``format`` and ``datefmt``
(with defaults of ``None``) and these are used to construct a
logging.Formatter instance.
* `filters` - the corresponding value will be a dict in which each key
is a filter id and each value is a dict describing how to configure
the corresponding Filter instance.
The configuring dict is searched for the key ``name`` (defaulting to the
empty string) and this is used to construct a logging.Filter
instance.
* `handlers` - the corresponding value will be a dict in which each
key is a handler id and each value is a dict describing how to
configure the corresponding Handler instance.
The configuring dict is searched for the following keys:
* ``class`` (mandatory). This is the fully qualified name of the
handler class.
* ``level`` (optional). The level of the handler.
* ``formatter`` (optional). The id of the formatter for this
handler.
* ``filters`` (optional). A list of ids of the filters for this
handler.
All {other} keys are passed through as keyword arguments to the
handler's constructor. For example, given the snippet:: >
    handlers:
      console:
        class : logging.StreamHandler
        formatter: brief
        level   : INFO
        filters: [allow_foo]
        stream  : ext://sys.stdout
      file:
        class : logging.handlers.RotatingFileHandler
        formatter: precise
        filename: logconfig.log
        maxBytes: 1024
        backupCount: 3
<
the handler with id ``console`` is instantiated as a
logging.StreamHandler, using ``sys.stdout`` as the underlying
stream. The handler with id ``file`` is instantiated as a
logging.handlers.RotatingFileHandler with the keyword arguments
``filename='logconfig.log', maxBytes=1024, backupCount=3``.
* `loggers` - the corresponding value will be a dict in which each key
is a logger name and each value is a dict describing how to
configure the corresponding Logger instance.
The configuring dict is searched for the following keys:
* ``level`` (optional). The level of the logger.
* ``propagate`` (optional). The propagation setting of the logger.
* ``filters`` (optional). A list of ids of the filters for this
logger.
* ``handlers`` (optional). A list of ids of the handlers for this
logger.
The specified loggers will be configured according to the level,
propagation, filters and handlers specified.
* `root` - this will be the configuration for the root logger.
Processing of the configuration will be as for any logger, except
that the ``propagate`` setting will not be applicable.
* `incremental` - whether the configuration is to be interpreted as
incremental to the existing configuration. This value defaults to
``False``, which means that the specified configuration replaces the
existing configuration with the same semantics as used by the
existing fileConfig API.
If the specified value is ``True``, the configuration is processed
as described in the section on logging-config-dict-incremental.
* `disable_existing_loggers` - whether any existing loggers are to be
disabled. This setting mirrors the parameter of the same name in
fileConfig. If absent, this parameter defaults to ``True``.
This value is ignored if `incremental` is ``True``.
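Putting the keys above together, a configuration in Python source form might
look like the following sketch. The ids brief, allow_foo and console and the
logger name foo are illustrative, and disable_existing_loggers is set to
False here only so that the example leaves other loggers enabled.

```python
import logging
import logging.config

SCHEMA_DEMO = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'brief': {'format': '%(levelname)s %(name)s: %(message)s'},
    },
    'filters': {
        'allow_foo': {'name': 'foo'},
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'formatter': 'brief',
            'level': 'INFO',
            'filters': ['allow_foo'],
            'stream': 'ext://sys.stdout',   # passed to the constructor
        },
    },
    'loggers': {
        'foo': {
            'level': 'DEBUG',
            'propagate': False,
            'handlers': ['console'],
        },
    },
}

logging.config.dictConfig(SCHEMA_DEMO)
schema_logger = logging.getLogger('foo')
schema_handler = schema_logger.handlers[0]
```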
Incremental Configuration
"""""""""""""""""""""""""
It is difficult to provide complete flexibility for incremental
configuration. For example, because objects such as filters
and formatters are anonymous, once a configuration is set up, it is
not possible to refer to such anonymous objects when augmenting a
configuration.
Furthermore, there is not a compelling case for arbitrarily altering
the object graph of loggers, handlers, filters and formatters at
run-time, once a configuration is set up; the verbosity of loggers and
handlers can be controlled just by setting levels (and, in the case of
loggers, propagation flags). Changing the object graph arbitrarily in
a safe way is problematic in a multi-threaded environment; while not
impossible, the benefits are not worth the complexity it adds to the
implementation.
Thus, when the ``incremental`` key of a configuration dict is present
and is ``True``, the system will completely ignore any ``formatters`` and
``filters`` entries, and process only the ``level``
settings in the ``handlers`` entries, and the ``level`` and
``propagate`` settings in the ``loggers`` and ``root`` entries.
Using a value in the configuration dict lets configurations be sent
over the wire as pickled dicts to a socket listener. Thus, the logging
verbosity of a long-running application can be altered over time with
no need to stop and restart the application.
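As a sketch of an incremental update, the first call below sets up a handler
and a logger, and the second call only raises their verbosity. The handler id
incr_console and the logger name demo.incr are hypothetical.

```python
import logging
import logging.config

# Initial, non-incremental configuration.
logging.config.dictConfig({
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'incr_console': {'class': 'logging.StreamHandler', 'level': 'INFO'},
    },
    'loggers': {
        'demo.incr': {'level': 'INFO', 'handlers': ['incr_console']},
    },
})
incr_logger = logging.getLogger('demo.incr')
incr_handler = incr_logger.handlers[0]

# Incremental update: only level (and propagate) settings are processed.
logging.config.dictConfig({
    'version': 1,
    'incremental': True,
    'handlers': {'incr_console': {'level': 'DEBUG'}},
    'loggers': {'demo.incr': {'level': 'DEBUG'}},
})
```

The handler object itself is untouched by the second call; only its level
changes, and the logger keeps the same handler list.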
Object connections
""""""""""""""""""
The schema describes a set of logging objects - loggers,
handlers, formatters, filters - which are connected to each other in
an object graph. Thus, the schema needs to represent connections
between the objects. For example, say that, once configured, a
particular logger has attached to it a particular handler. For the
purposes of this discussion, we can say that the logger represents the
source, and the handler the destination, of a connection between the
two. Of course in the configured objects this is represented by the
logger holding a reference to the handler. In the configuration dict,
this is done by giving each destination object an id which identifies
it unambiguously, and then using the id in the source object's
configuration to indicate that a connection exists between the source
and the destination object with that id.
So, for example, consider the following YAML snippet:: >
    formatters:
      brief:
        # configuration for formatter with id 'brief' goes here
      precise:
        # configuration for formatter with id 'precise' goes here
    handlers:
      h1: #This is an id
        # configuration of handler with id 'h1' goes here
        formatter: brief
      h2: #This is another id
        # configuration of handler with id 'h2' goes here
        formatter: precise
    loggers:
      foo.bar.baz:
        # other configuration for logger 'foo.bar.baz'
        handlers: [h1, h2]
<
(Note: YAML is used here because it's a little more readable than the
equivalent Python source form for the dictionary.)
The ids for loggers are the logger names which would be used
programmatically to obtain a reference to those loggers, e.g.
``foo.bar.baz``. The ids for Formatters and Filters can be any string
value (such as ``brief``, ``precise`` above) and they are transient,
in that they are only meaningful for processing the configuration
dictionary and used to determine connections between objects, and are
not persisted anywhere when the configuration call is complete.
The above snippet indicates that the logger named ``foo.bar.baz`` should
have two handlers attached to it, which are described by the handler
ids ``h1`` and ``h2``. The formatter for ``h1`` is that described by id
``brief``, and the formatter for ``h2`` is that described by id
``precise``.
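For comparison, the same connections in Python source form might look like
the sketch below. The handler class and format strings are filled in as
assumptions, since the snippet above leaves them as placeholders.

```python
import logging
import logging.config

CONNECTIONS_DEMO = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'brief': {'format': '%(message)s'},
        'precise': {
            'format': '%(asctime)s %(levelname)-8s %(name)-15s %(message)s',
        },
    },
    'handlers': {
        # Each handler refers to its formatter by id.
        'h1': {'class': 'logging.StreamHandler', 'formatter': 'brief'},
        'h2': {'class': 'logging.StreamHandler', 'formatter': 'precise'},
    },
    'loggers': {
        # The logger refers to its handlers by id.
        'foo.bar.baz': {'handlers': ['h1', 'h2']},
    },
}

logging.config.dictConfig(CONNECTIONS_DEMO)
baz_logger = logging.getLogger('foo.bar.baz')
baz_formats = [h.formatter._fmt for h in baz_logger.handlers]
```

The private {_fmt} attribute is read here only to show which formatter each
handler received.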
User-defined objects
""""""""""""""""""""
The schema supports user-defined objects for handlers, filters and
formatters. (Loggers do not need to have different types for
different instances, so there is no support in this configuration
schema for user-defined logger classes.)
Objects to be configured are described by dictionaries
which detail their configuration. In some places, the logging system
will be able to infer from the context how an object is to be
instantiated, but when a user-defined object is to be instantiated,
the system will not know how to do this. In order to provide complete
flexibility for user-defined object instantiation, the user needs
to provide a 'factory' - a callable which is called with a
configuration dictionary and which returns the instantiated object.
This is signalled by an absolute import path to the factory being
made available under the special key ``'()'``. Here's a concrete
example:: >
    formatters:
      brief:
        format: '%(message)s'
      default:
        format: '%(asctime)s %(levelname)-8s %(name)-15s %(message)s'
        datefmt: '%Y-%m-%d %H:%M:%S'
      custom:
        (): my.package.customFormatterFactory
        bar: baz
        spam: 99.9
        answer: 42
<
The above YAML snippet defines three formatters. The first, with id
``brief``, is a standard logging.Formatter instance with the
specified format string. The second, with id ``default``, has a
longer format and also defines the time format explicitly, and will
result in a logging.Formatter initialized with those two format
strings. Shown in Python source form, the ``brief`` and ``default``
formatters have configuration sub-dictionaries:: >
    {
      'format' : '%(message)s'
    }
<
and:: >
    {
      'format' : '%(asctime)s %(levelname)-8s %(name)-15s %(message)s',
      'datefmt' : '%Y-%m-%d %H:%M:%S'
    }
<
respectively, and as these dictionaries do not contain the special key
``'()'``, the instantiation is inferred from the context: as a result,
standard logging.Formatter instances are created. The
configuration sub-dictionary for the third formatter, with id
``custom``, is:: >
    {
      '()' : 'my.package.customFormatterFactory',
      'bar' : 'baz',
      'spam' : 99.9,
      'answer' : 42
    }
<
and this contains the special key ``'()'``, which means that
user-defined instantiation is wanted. In this case, the specified
factory callable will be used. If it is an actual callable it will be
used directly - otherwise, if you specify a string (as in the example)
the actual callable will be located using normal import mechanisms.
The callable will be called with the {remaining} items in the
configuration sub-dictionary as keyword arguments. In the above
example, the formatter with id ``custom`` will be assumed to be
returned by the call:: >
    my.package.customFormatterFactory(bar='baz', spam=99.9, answer=42)
<
The key ``'()'`` has been used as the special key because it is not a
valid keyword parameter name, and so will not clash with the names of
the keyword arguments used in the call. The ``'()'`` also serves as a
mnemonic that the corresponding value is a callable.
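A factory passed as an actual callable could be sketched as follows.
TaggedFormatter, tagged_formatter_factory, the keyword names tag and upper,
and the ids used in the dictionary are all hypothetical.

```python
import logging
import logging.config


class TaggedFormatter(logging.Formatter):
    """Hypothetical formatter that prefixes every message with a tag."""

    def __init__(self, tag):
        logging.Formatter.__init__(self, '%(message)s')
        self.tag = tag

    def format(self, record):
        return '%s %s' % (self.tag, logging.Formatter.format(self, record))


def tagged_formatter_factory(tag='app', upper=False):
    """Hypothetical factory: receives the remaining items as keyword arguments."""
    return TaggedFormatter(tag.upper() if upper else tag)


logging.config.dictConfig({
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'custom': {'()': tagged_formatter_factory,
                   'tag': 'billing', 'upper': True},
    },
    'handlers': {
        'custom_null': {'class': 'logging.NullHandler', 'formatter': 'custom'},
    },
    'loggers': {
        'demo.factory': {'handlers': ['custom_null']},
    },
})

factory_formatter = logging.getLogger('demo.factory').handlers[0].formatter
factory_line = factory_formatter.format(
    logging.LogRecord('demo.factory', logging.INFO, 'x.py', 1, 'hello', (), None))
```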
Access to external objects
""""""""""""""""""""""""""
There are times where a configuration needs to refer to objects
external to the configuration, for example ``sys.stderr``. If the
configuration dict is constructed using Python code, this is
straightforward, but a problem arises when the configuration is
provided via a text file (e.g. JSON, YAML). In a text file, there is
no standard way to distinguish ``sys.stderr`` from the literal string
``'sys.stderr'``. To facilitate this distinction, the configuration
system looks for certain special prefixes in string values and
treats them specially. For example, if the literal string
``'ext://sys.stderr'`` is provided as a value in the configuration,
then the ``ext://`` will be stripped off and the remainder of the
value processed using normal import mechanisms.
The handling of such prefixes is done in a way analogous to protocol
handling: there is a generic mechanism to look for prefixes which
match the regular expression ``^(?P<prefix>[a-z]+)://(?P<suffix>.*)$``
whereby, if the ``prefix`` is recognised, the ``suffix`` is processed
in a prefix-dependent manner and the result of the processing replaces
the string value. If the prefix is not recognised, then the string
value will be left as-is.
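The prefix matching step can be tried in isolation with the regular
expression quoted above. The function split_prefix is a hypothetical helper
for this sketch, not part of the logging package.

```python
import re

# The pattern quoted in the text above.
PREFIX_PATTERN = re.compile(r'^(?P<prefix>[a-z]+)://(?P<suffix>.*)$')


def split_prefix(value):
    """Return (prefix, suffix) if the string has a prefix, else None."""
    m = PREFIX_PATTERN.match(value)
    if m is None:
        return None
    return m.group('prefix'), m.group('suffix')


with_ext = split_prefix('ext://sys.stderr')
with_cfg = split_prefix('cfg://handlers.file')
plain = split_prefix('sys.stderr')
```

A string without a prefix yields no match, which corresponds to the value
being left as-is.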
Access to internal objects
""""""""""""""""""""""""""
As well as external objects, there is sometimes also a need to refer
to objects in the configuration. This will be done implicitly by the
configuration system for things that it knows about. For example, the
string value ``'DEBUG'`` for a ``level`` in a logger or handler will
automatically be converted to the value ``logging.DEBUG``, and the
``handlers``, ``filters`` and ``formatter`` entries will take an
object id and resolve to the appropriate destination object.
However, a more generic mechanism is needed for user-defined
objects which are not known to the logging (|py2stdlib-logging|) module. For
example, consider logging.handlers.MemoryHandler, which takes
a ``target`` argument which is another handler to delegate to. Since
the system already knows about this class, then in the configuration,
the given ``target`` just needs to be the object id of the relevant
target handler, and the system will resolve to the handler from the
id. If, however, a user defines a ``my.package.MyHandler`` which has
an ``alternate`` handler, the configuration system would not know that
the ``alternate`` referred to a handler. To cater for this, a generic
resolution system allows the user to specify:: >
    handlers:
      file:
        # configuration of file handler goes here
      custom:
        (): my.package.MyHandler
        alternate: cfg://handlers.file
<
The literal string ``'cfg://handlers.file'`` will be resolved in an
analogous way to strings with the ``ext://`` prefix, but looking
in the configuration itself rather than the import namespace. The
mechanism allows access by dot or by index, in a similar way to
that provided by ``str.format``. Thus, given the following snippet:: >
handlers:
  email:
    class: logging.handlers.SMTPHandler
    mailhost: localhost
    fromaddr: my_app@domain.tld
    toaddrs:
      - support_team@domain.tld
      - dev_team@domain.tld
    subject: Houston, we have a problem.
<
in the configuration, the string ``'cfg://handlers'`` would resolve to
the dict with key ``handlers``, the string ``'cfg://handlers.email'``
would resolve to the dict with key ``email`` in the ``handlers`` dict,
and so on. The string ``'cfg://handlers.email.toaddrs[1]'`` would
resolve to ``'dev_team@domain.tld'`` and the string
``'cfg://handlers.email.toaddrs[0]'`` would resolve to the value
``'support_team@domain.tld'``. The ``subject`` value could be accessed
using either ``'cfg://handlers.email.subject'`` or, equivalently,
``'cfg://handlers.email[subject]'``. The latter form only needs to be
used if the key contains spaces or non-alphanumeric characters. If an
index value consists only of decimal digits, access will be attempted
using the corresponding integer value, falling back to the string
value if needed.
Given a string ``cfg://handlers.myhandler.mykey.123``, this will
resolve to ``config_dict['handlers']['myhandler']['mykey']['123']``.
If the string is specified as ``cfg://handlers.myhandler.mykey[123]``,
the system will attempt to retrieve the value from
``config_dict['handlers']['myhandler']['mykey'][123]``, and fall back
to ``config_dict['handlers']['myhandler']['mykey']['123']`` if that
fails.
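The lookup rules above can be sketched in a few lines. This is a simplified
illustration, not the implementation used by the logging package: the function
name ``resolve_cfg`` is hypothetical, it takes the part of the string after
``cfg://``, and it assumes that keys consist only of word characters.

```python
import re

WORD = re.compile(r'^\s*(\w+)\s*')
DOT = re.compile(r'^\.\s*(\w+)\s*')
INDEX = re.compile(r'^\[\s*(\w+)\s*\]\s*')
DIGITS = re.compile(r'^\d+$')


def resolve_cfg(config, path):
    """Resolve a path such as 'handlers.email.toaddrs[1]' inside config.

    Hypothetical helper; 'path' is the suffix that follows 'cfg://'.
    """
    match = WORD.match(path)
    if match is None:
        raise ValueError('Unable to convert %r' % path)
    current = config[match.group(1)]
    rest = path[match.end():]
    while rest:
        match = DOT.match(rest)
        if match:
            # Dotted access always uses the string key.
            current = current[match.group(1)]
        else:
            match = INDEX.match(rest)
            if match is None:
                raise ValueError('Unable to convert %r at %r' % (path, rest))
            index = match.group(1)
            if DIGITS.match(index):
                # Try the integer first, then fall back to the string.
                try:
                    current = current[int(index)]
                except (TypeError, KeyError):
                    current = current[index]
            else:
                current = current[index]
        rest = rest[match.end():]
    return current
```

With the email handler configuration shown earlier loaded into a dict,
``resolve_cfg(config, 'handlers.email.toaddrs[0]')`` would return the first
address in the list.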
Configuration file format
^^^^^^^^^^^^^^^^^^^^^^^^^
The configuration file format understood by fileConfig is based on
ConfigParser (|py2stdlib-configparser|) functionality. The file must contain sections called
``[loggers]``, ``[handlers]`` and ``[formatters]`` which identify by name the
entities of each type which are defined in the file. For each such entity,
there is a separate section which identifies how that entity is configured.
Thus, for a logger named ``log01`` in the ``[loggers]`` section, the relevant
configuration details are held in a section ``[logger_log01]``. Similarly, a
handler called ``hand01`` in the ``[handlers]`` section will have its
configuration held in a section called ``[handler_hand01]``, while a formatter
called ``form01`` in the ``[formatters]`` section will have its configuration
specified in a section called ``[formatter_form01]``. The root logger
configuration must be specified in a section called ``[logger_root]``.
Examples of these sections in the file are given below. :: >
[loggers]
keys=root,log02,log03,log04,log05,log06,log07
[handlers]
keys=hand01,hand02,hand03,hand04,hand05,hand06,hand07,hand08,hand09
[formatters]
keys=form01,form02,form03,form04,form05,form06,form07,form08,form09
<
The root logger must specify a level and a list of handlers. An example of a
root logger section is given below. :: >
[logger_root]
level=NOTSET
handlers=hand01
<
The ``level`` entry can be one of ``DEBUG, INFO, WARNING, ERROR, CRITICAL`` or
``NOTSET``. For the root logger only, ``NOTSET`` means that all messages will be
logged. Level values are evaluated in the context of the ``logging``
package's namespace.
The ``handlers`` entry is a comma-separated list of handler names. Each name
must appear in the ``[handlers]`` section and have a corresponding section in
the configuration file.
For loggers other than the root logger, some additional information is required.
This is illustrated by the following example. :: >
[logger_parser]
level=DEBUG
handlers=hand01
propagate=1
qualname=compiler.parser
<
The ``level`` and ``handlers`` entries are interpreted as for the root logger,
except that if a non-root logger's level is specified as ``NOTSET``, the system
consults loggers higher up the hierarchy to determine the effective level of the
logger. The ``propagate`` entry is set to 1 to indicate that messages must
propagate to handlers higher up the logger hierarchy from this logger, or 0 to
indicate that messages are {not} propagated to handlers up the hierarchy. The
``qualname`` entry is the hierarchical channel name of the logger, that is to
say the name used by the application to get the logger.
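The ``NOTSET`` rule for non-root loggers can be observed directly from code.
The sketch below uses made-up logger names (``eff_demo`` and
``eff_demo.child``) and sets the levels programmatically rather than through a
configuration file.

```python
import logging


def demo_effective_level():
    """Return (own level, effective level) of a NOTSET child logger."""
    parent = logging.getLogger("eff_demo")
    parent.setLevel(logging.WARNING)

    child = logging.getLogger("eff_demo.child")
    child.setLevel(logging.NOTSET)

    # The child's own level stays NOTSET; the effective level is taken
    # from the nearest ancestor that has a level set.
    return child.level, child.getEffectiveLevel()
```

Here the child reports ``logging.NOTSET`` as its own level while its effective
level is the ``logging.WARNING`` set on the parent.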
Sections which specify handler configuration are exemplified by the following.
:: >
[handler_hand01]
class=StreamHandler
level=NOTSET
formatter=form01
args=(sys.stdout,)
<
The ``class`` entry indicates the handler's class (as determined by eval
in the ``logging`` package's namespace). The ``level`` is interpreted as for
loggers, and ``NOTSET`` is taken to mean "log everything".
.. versionchanged:: 2.6
Added support for resolving the handler's class as a dotted module and class
name.
The ``formatter`` entry indicates the key name of the formatter for this
handler. If blank, a default formatter (``logging._defaultFormatter``) is used.
If a name is specified, it must appear in the ``[formatters]`` section and have
a corresponding section in the configuration file.
The ``args`` entry, when evaluated in the context of the ``logging``
package's namespace, is the list of arguments to the constructor for the handler
class. Refer to the constructors for the relevant handlers, or to the examples
below, to see how typical entries are constructed. :: >
[handler_hand02]
class=FileHandler
level=DEBUG
formatter=form02
args=('python.log', 'w')
[handler_hand03]
class=handlers.SocketHandler
level=INFO
formatter=form03
args=('localhost', handlers.DEFAULT_TCP_LOGGING_PORT)
[handler_hand04]
class=handlers.DatagramHandler
level=WARN
formatter=form04
args=('localhost', handlers.DEFAULT_UDP_LOGGING_PORT)
[handler_hand05]
class=handlers.SysLogHandler
level=ERROR
formatter=form05
args=(('localhost', handlers.SYSLOG_UDP_PORT), handlers.SysLogHandler.LOG_USER)
[handler_hand06]
class=handlers.NTEventLogHandler
level=CRITICAL
formatter=form06
args=('Python Application', '', 'Application')
[handler_hand07]
class=handlers.SMTPHandler
level=WARN
formatter=form07
args=('localhost', 'from@abc', ['user1@abc', 'user2@xyz'], 'Logger Subject')
[handler_hand08]
class=handlers.MemoryHandler
level=NOTSET
formatter=form08
target=
args=(10, ERROR)
[handler_hand09]
class=handlers.HTTPHandler
level=NOTSET
formatter=form09
args=('localhost:9022', '/log', 'GET')
<
Sections which specify formatter configuration are typified by the following. :: >
[formatter_form01]
format=F1 %(asctime)s %(levelname)s %(message)s
datefmt=
class=logging.Formatter
<
The ``format`` entry is the overall format string, and the ``datefmt`` entry is
the strftime-compatible date/time format string. If empty, the
package substitutes ISO8601 format date/times, which is almost equivalent to
specifying the date format string ``"%Y-%m-%d %H:%M:%S"``. The ISO8601 format
also specifies milliseconds, which are appended to the result of using the above
format string, with a comma separator. An example time in ISO8601 format is
``2003-01-23 00:29:50,411``.
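To see the default timestamp layout, a formatter created without a ``datefmt``
can be asked to format the creation time of a record. The record below is
constructed by hand purely for illustration; the logger name, path and message
are placeholders.

```python
import logging
import re

# Shape of the default timestamp: date, time, comma, milliseconds.
ISO_LIKE = re.compile(r'^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}$')


def default_timestamp():
    """Format the creation time of a throwaway record with no datefmt."""
    record = logging.LogRecord(
        "demo", logging.INFO, "demo.py", 1, "message", (), None)
    # With no date format given, the formatter falls back to the
    # ISO8601-style layout with milliseconds after a comma.
    return logging.Formatter().formatTime(record)
```

The returned string matches ``ISO_LIKE``; the actual value depends on when the
record was created.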
The ``class`` entry is optional. It indicates the name of the formatter's class
(as a dotted module and class name). This option is useful for instantiating a
Formatter subclass. Subclasses of Formatter can present
exception tracebacks in an expanded or condensed format.
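Putting the pieces together, the following sketch writes a small but complete
configuration to a temporary file and loads it with fileConfig. The section
contents are adapted from the fragments above; the function name
``load_example_config`` and the temporary file are assumptions made for the
example. Note that fileConfig replaces the handlers of the root logger, and
that ``propagate=0`` is used here only to avoid printing each message twice.

```python
import logging
import logging.config
import os
import tempfile

CONFIG_TEXT = """\
[loggers]
keys=root,parser

[handlers]
keys=hand01

[formatters]
keys=form01

[logger_root]
level=NOTSET
handlers=hand01

[logger_parser]
level=DEBUG
handlers=hand01
propagate=0
qualname=compiler.parser

[handler_hand01]
class=StreamHandler
level=NOTSET
formatter=form01
args=(sys.stdout,)

[formatter_form01]
format=F1 %(asctime)s %(levelname)s %(message)s
datefmt=
class=logging.Formatter
"""


def load_example_config():
    """Write CONFIG_TEXT to a temporary file and apply it."""
    fd, path = tempfile.mkstemp(suffix=".conf")
    try:
        with os.fdopen(fd, "w") as config_file:
            config_file.write(CONFIG_TEXT)
        # Keep loggers created elsewhere enabled while experimenting.
        logging.config.fileConfig(path, disable_existing_loggers=False)
    finally:
        os.remove(path)
    # qualname is the name the application uses to get the logger.
    return logging.getLogger("compiler.parser")
```

Calling ``load_example_config().debug("parsing")`` would then write one line
to standard output using the ``form01`` format.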
Configuration server example
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here is an example of a module using the logging configuration server:: >
import logging
import logging.config
import time
import os
# read initial config file
logging.config.fileConfig("logging.conf")
# create and start listener on port 9999
t = logging.config.listen(9999)
t.start()
logger = logging.getLogger("simpleExample")
try:
    # loop through logging calls to see the difference
    # new configurations make, until Ctrl+C is pressed
    while True:
        logger.debug("debug message")
        logger.info("info message")
        logger.warn("warn message")
        logger.error("error message")
        logger.critical("critical message")
        time.sleep(5)
except KeyboardInterrupt:
    # cleanup
    logging.config.stopListening()
    t.join()
<
And here is a script that takes a filename and sends that file to the server,
properly preceded with the binary-encoded length, as the new logging
configuration:: >
#!/usr/bin/env python
import socket, sys, struct
data_to_send = open(sys.argv[1], "r").read()
HOST = 'localhost'
PORT = 9999
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print "connecting..."
s.connect((HOST, PORT))
print "sending config..."
# sendall() keeps writing until every byte is sent; send() may stop early
s.sendall(struct.pack(">L", len(data_to_send)))
s.sendall(data_to_send)
s.close()
print "complete"
<
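The framing used by the script above is a 4-byte big-endian length followed by
the configuration itself. A minimal sketch of just that step, with a
hypothetical helper name ``frame_config`` and the payload given as bytes:

```python
import struct


def frame_config(payload):
    """Prefix payload (bytes) with its length as a 4-byte big-endian integer."""
    return struct.pack(">L", len(payload)) + payload
```

The result can be passed to a single ``sendall`` call on a connected socket.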
More examples
-------------
Multiple handlers and formatters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Loggers are plain Python objects. The addHandler method has no minimum
or maximum quota for the number of handlers you may add. Sometimes it will be
beneficial for an application to log all messages of all severities to a text
file while simultaneously logging errors or above to the console. To set this
up, simply configure the appropriate handlers. The logging calls in the
application code will remain unchanged. Here is a slight modification to the
previous simple module-based configuration example:: >
import logging
logger = logging.getLogger("simple_example")
logger.setLevel(logging.DEBUG)
# create file handler which logs even debug messages
fh = logging.FileHandler("spam.log")
fh.setLevel(logging.DEBUG)
# create console handler with a higher log level
ch = logging.StreamHandler()
ch.setLevel(logging.ERROR)
# create formatter and add it to the handlers
formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
ch.setFormatter(formatter)
fh.setFormatter(formatter)
# add the handlers to logger
logger.addHandler(ch)
logger.addHandler(fh)
# "application" code
logger.debug("debug message")
logger.info("info message")
logger.warn("warn message")
logger.error("error message")
logger.critical("critical message")
<
Notice that the "application" code does not care about multiple handlers. All
that changed was the addition and configuration of a new handler named {fh}.
The ability to create new handlers with higher- or lower-severity filters can be
very helpful when writing and testing an application. Instead of using many
``print`` statements for debugging, use ``logger.debug``: unlike the print
statements, which you will have to delete or comment out later, the logger.debug
statements can remain intact in the source code and remain dormant until you
need them again. At that time, the only change that needs to happen is to
modify the severity level of the logger and/or handler to debug.
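The effect of per-handler levels is easier to check when the handlers write to
in-memory streams instead of a file and the console. The sketch below is a
variation on the example above under that assumption; the logger name
``two_handler_demo`` and the shortened format string are made up for the
illustration.

```python
import logging

try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3


def demo_two_handlers():
    """Log through two handlers and return what each one received."""
    everything = StringIO()
    errors_only = StringIO()

    logger = logging.getLogger("two_handler_demo")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep the demo away from root handlers

    verbose = logging.StreamHandler(everything)
    verbose.setLevel(logging.DEBUG)
    terse = logging.StreamHandler(errors_only)
    terse.setLevel(logging.ERROR)

    formatter = logging.Formatter("%(levelname)s:%(message)s")
    for handler in (verbose, terse):
        handler.setFormatter(formatter)
        logger.addHandler(handler)

    # "application" code: identical regardless of how many handlers exist
    logger.debug("debug message")
    logger.error("error message")

    for handler in (verbose, terse):
        logger.removeHandler(handler)
    return everything.getvalue(), errors_only.getvalue()
```

The first stream receives both messages, while the second receives only the
error.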
Using logging in multiple modules
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
It was mentioned above that multiple calls to
``logging.getLogger('someLogger')`` return a reference to the same logger
object. This is true not only within the same module, but also across modules
as long as they run in the same Python interpreter process. Beyond sharing
references to one object, application code can define and
configure a parent logger in one module and create (but not configure) a child
logger in a separate module, and all logger calls to the child will pass up to
the parent. Here is a main module:: >
import logging
import auxiliary_module
# create logger with "spam_application"
logger = logging.getLogger("spam_application")
logger.setLevel(logging.DEBUG)
# create file handler which logs even debug messages
fh = logging.FileHandler("spam.log")
fh.setLevel(logging.DEBUG)
# create console handler with a higher log level
ch = logging.StreamHandler()
ch.setLevel(logging.ERROR)
# create formatter and add it to the handlers
formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
fh.setFormatter(formatter)
ch.setFormatter(formatter)
# add the handlers to the logger
logger.addHandler(fh)
logger.addHandler(ch)
logger.info("creating an instance of auxiliary_module.Auxiliary")
a = auxiliary_module.Auxiliary()
logger.info("created an instance of auxiliary_module.Auxiliary")
logger.info("calling auxiliary_module.Auxiliary.do_something")
a.do_something()
logger.info("finished auxiliary_module.Auxiliary.do_something")
logger.info("calling auxiliary_module.some_function()")
auxiliary_module.some_function()
logger.info("done with auxiliary_module.some_function()")
<
Here is the auxiliary module:: >
import logging
# create logger
module_logger = logging.getLogger("spam_application.auxiliary")
class Auxiliary:
    def __init__(self):
        self.logger = logging.getLogger("spam_application.auxiliary.Auxiliary")
        self.logger.info("creating an instance of Auxiliary")
    def do_something(self):
        self.logger.info("doing something")
        a = 1 + 1
        self.logger.info("done doing something")
def some_function():
    module_logger.info("received a call to \"some_function\"")
<
The output looks like this:: >
2005-03-23 23:47:11,663 - spam_application - INFO -
creating an instance of auxiliary_module.Auxiliary
2005-03-23 23:47:11,665 - spam_application.auxiliary.Auxiliary - INFO -
creating an instance of Auxiliary
2005-03-23 23:47:11,665 - spam_application - INFO -
created an instance of auxiliary_module.Auxiliary
2005-03-23 23:47:11,668 - spam_application - INFO -
calling auxiliary_module.Auxiliary.do_something
2005-03-23 23:47:11,668 - spam_application.auxiliary.Auxiliary - INFO -
doing something
2005-03-23 23:47:11,669 - spam_application.auxiliary.Auxiliary - INFO -
done doing something
2005-03-23 23:47:11,670 - spam_application - INFO -
finished auxiliary_module.Auxiliary.do_something
2005-03-23 23:47:11,671 - spam_application - INFO -
calling auxiliary_module.some_function()
2005-03-23 23:47:11,672 - spam_application.auxiliary - INFO -
received a call to "some_function"
2005-03-23 23:47:11,673 - spam_application - INFO -
done with auxiliary_module.some_function()
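
The parent and child relationship used by the two modules above can also be
checked in a single file. In this sketch the logger names (``hier_demo`` and
``hier_demo.worker``) and the ``ListHandler`` class are hypothetical; the
handler simply collects records in a list so the result can be inspected.

```python
import logging


def demo_hierarchy():
    """Show that an unconfigured child logger passes records to its parent."""
    records = []

    class ListHandler(logging.Handler):
        # Collects (logger name, message) pairs instead of writing them out.
        def emit(self, record):
            records.append((record.name, record.getMessage()))

    parent = logging.getLogger("hier_demo")
    parent.setLevel(logging.INFO)
    handler = ListHandler()
    parent.addHandler(handler)

    # The child has no handlers and no level of its own.
    child = logging.getLogger("hier_demo.worker")
    child.info("hello from the child")

    # A second lookup by the same name returns the same object.
    same_object = logging.getLogger("hier_demo.worker") is child

    parent.removeHandler(handler)
    return same_object, records
```

The record is handled by the parent's handler but keeps the child's name,
which is why the output above shows the full dotted names.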
==============================================================================
*py2stdlib-macos*
MacOS~
:platform: Mac
:synopsis: Access to Mac OS-specific interpreter features.
:deprecated:
This module provides access to MacOS-specific functionality in the Python
interpreter, such as how the interpreter event loop functions and the like. Use
with care.
.. note::
This module has been removed in Python 3.x.
Note the capitalization of the module name; this is a historical artifact.
runtimemodel~
Always ``'macho'``, from Python 2.4 on. In earlier versions of Python the value
could also be ``'ppc'`` for the classic Mac OS 8 runtime model or ``'carbon'``
for the Mac OS 9 runtime model.
linkmodel~
The way the interpreter has been linked. As extension modules may be
incompatible between linking models, packages could use this information to give
more helpful error messages. The value is one of ``'static'`` for a statically
linked Python, ``'framework'`` for Python in a Mac OS X framework, or ``'shared'``
for Python in a standard Unix shared library. Older Pythons could also have the
value ``'cfm'`` for Mac OS 9-compatible Python.
Error~
.. index:: module: macerrors
This exception is raised on MacOS generated errors, either from functions in
this module or from other mac-specific modules like the toolbox interfaces. The
arguments are the integer error code (the OSErr value) and a textual
description of the error code. Symbolic names for all known error codes are
defined in the standard module macerrors (|py2stdlib-macerrors|).
GetErrorString(errno)~
Return the textual description of MacOS error code {errno}.
DebugStr(message [, object])~
On Mac OS X the string is simply printed to stderr (on older Mac OS systems more
elaborate functionality was available), but it provides a convenient location to
attach a breakpoint in a low-level debugger like gdb.
.. note:: >
Not available in 64-bit mode.
<
SysBeep()~
Ring the bell.
.. note:: >
Not available in 64-bit mode.
<
GetTicks()~
Get the number of clock ticks (1/60th of a second) since system boot.
GetCreatorAndType(file)~
Return the file creator and file type as two four-character strings. The {file}
parameter can be a pathname or an ``FSSpec`` or ``FSRef`` object.
.. note:: >
It is not possible to use an ``FSSpec`` in 64-bit mode.
<
SetCreatorAndType(file, creator, type)~
Set the file creator and file type. The {file} parameter can be a pathname or an
``FSSpec`` or ``FSRef`` object. {creator} and {type} must be four-character
strings.
.. note:: >
It is not possible to use an ``FSSpec`` in 64-bit mode.
<
openrf(name [, mode])~
Open the resource fork of a file. Arguments are the same as for the built-in
function open. The object returned has file-like semantics, but it is
not a Python file object, so there may be subtle differences.
WMAvailable()~
Checks whether the current process has access to the window manager. The method
will return ``False`` if the window manager is not available, for instance when
running on Mac OS X Server or when logged in via ssh, or when the current
interpreter is not running from a full-blown application bundle. A script runs
from an application bundle either when it has been started with
pythonw instead of python or when running as an applet.
splash([resourceid])~
Opens a splash screen by resource id. Use resourceid ``0`` to close
the splash screen.
.. note:: >
Not available in 64-bit mode.
==============================================================================
*py2stdlib-macostools*
macostools~
:platform: Mac
:synopsis: Convenience routines for file manipulation.
:deprecated:
This module contains some convenience routines for file-manipulation on the
Macintosh. All file parameters can be specified as pathnames, FSRef or
FSSpec objects. This module expects a filesystem which supports forked
files, so it should not be used on UFS partitions.
.. note::
This module has been removed in Python 3.0.
The macostools (|py2stdlib-macostools|) module defines the following functions:
copy(src, dst[, createpath[, copytimes]])~
Copy file {src} to {dst}. If {createpath} is non-zero the folders leading to
{dst} are created if necessary. The method copies data and resource fork and
some finder information (creator, type, flags) and optionally the creation,
modification and backup times (default is to copy them). Custom icons, comments
and icon position are not copied.
.. note:: >
This function does not work in 64-bit code because it uses APIs that
are not available in 64-bit mode.
<
copytree(src, dst)~
Recursively copy a file tree from {src} to {dst}, creating folders as needed.
{src} and {dst} should be specified as pathnames.
.. note:: >
This function does not work in 64-bit code because it uses APIs that
are not available in 64-bit mode.
<
mkalias(src, dst)~
Create a finder alias {dst} pointing to {src}.
.. note:: >
This function does not work in 64-bit code because it uses APIs that
are not available in 64-bit mode.
<
touched(dst)~
Tell the finder that some bits of finder-information such as creator or type for
file {dst} have changed. The file can be specified by pathname or fsspec. This
call should tell the finder to redraw the file's icon.
.. deprecated:: 2.6
The function is a no-op on OS X.
BUFSIZ~
The buffer size for ``copy``, default 1 megabyte.
Note that the process of creating finder aliases is not specified in the Apple
documentation. Hence, aliases created with mkalias could conceivably
have incompatible behaviour in some cases.
findertools (|py2stdlib-findertools|) --- The finder's Apple Events interface
=====================================================================
==============================================================================
*py2stdlib-macpath*
macpath~
:synopsis: Mac OS 9 path manipulation functions.
This module is the Mac OS 9 (and earlier) implementation of the os.path (|py2stdlib-os.path|)
module. It can be used to manipulate old-style Macintosh pathnames on Mac OS X
(or any other platform).
The following functions are available in this module: normcase,
normpath, isabs, join, split, isdir,
isfile, walk, exists. For other functions available in
os.path (|py2stdlib-os.path|) dummy counterparts are available.
==============================================================================
*py2stdlib-mailbox*
mailbox~
:synopsis: Manipulate mailboxes in various formats
This module defines two classes, Mailbox and Message, for
accessing and manipulating on-disk mailboxes and the messages they contain.
Mailbox offers a dictionary-like mapping from keys to messages.
Message extends the email.Message module's Message
class with format-specific state and behavior. Supported mailbox formats are
Maildir, mbox, MH, Babyl, and MMDF.
.. seealso::
Module email (|py2stdlib-email|)
Represent and manipulate messages.
Mailbox objects
------------------------
Mailbox~
A mailbox, which may be inspected and modified.
The Mailbox class defines an interface and is not intended to be
instantiated. Instead, format-specific subclasses should inherit from
Mailbox and your code should instantiate a particular subclass.
The Mailbox interface is dictionary-like, with small keys
corresponding to messages. Keys are issued by the Mailbox instance
with which they will be used and are only meaningful to that Mailbox
instance. A key continues to identify a message even if the corresponding
message is modified, such as by replacing it with another message.
Messages may be added to a Mailbox instance using the set-like
method add and removed using a ``del`` statement or the set-like
methods remove and discard.
Mailbox interface semantics differ from dictionary semantics in some
noteworthy ways. Each time a message is requested, a new representation
(typically a Message instance) is generated based upon the current
state of the mailbox. Similarly, when a message is added to a
Mailbox instance, the provided message representation's contents are
copied. In neither case is a reference to the message representation kept by
the Mailbox instance.
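The copy semantics described above can be checked with a small experiment. The
sketch below uses an mbox file in a temporary directory; the message text,
file name and function name are placeholders chosen for the example.

```python
import mailbox
import os
import shutil
import tempfile


def demo_copy_semantics():
    """Return (same object?, subject stored in the mailbox)."""
    tmpdir = tempfile.mkdtemp()
    try:
        box = mailbox.mbox(os.path.join(tmpdir, "demo.mbox"))
        try:
            key = box.add(
                "From: author@example.com\nSubject: original\n\nHello\n")
            first = box[key]
            second = box[key]
            # Changing one representation does not touch the mailbox.
            first.replace_header("Subject", "changed locally")
            return first is second, box[key]["Subject"]
        finally:
            box.close()
    finally:
        shutil.rmtree(tmpdir)
```

Each lookup produces a new object, and the mailbox still holds the subject
that was originally added.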
The default Mailbox iterator iterates over message representations,
not keys as the default dictionary iterator does. Moreover, modification of a
mailbox during iteration is safe and well-defined. Messages added to the
mailbox after an iterator is created will not be seen by the
iterator. Messages removed from the mailbox before the iterator yields them
will be silently skipped, though using a key from an iterator may result in a
KeyError exception if the corresponding message is subsequently
removed.
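A short sketch of the default iteration behaviour, again using a temporary
mbox file with placeholder messages: iterating over the mailbox yields message
representations, while ``iterkeys`` yields the keys.

```python
import mailbox
import os
import shutil
import tempfile


def demo_iteration():
    """Return (subjects seen by iterating, number of keys)."""
    tmpdir = tempfile.mkdtemp()
    try:
        box = mailbox.mbox(os.path.join(tmpdir, "demo.mbox"))
        try:
            box.add("From: a@example.com\nSubject: one\n\nFirst\n")
            box.add("From: b@example.com\nSubject: two\n\nSecond\n")
            # Unlike a dict, iterating yields messages rather than keys.
            subjects = [message["Subject"] for message in box]
            keys = list(box.iterkeys())
            return subjects, len(keys)
        finally:
            box.close()
    finally:
        shutil.rmtree(tmpdir)
```
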
.. warning:: >
Be very cautious when modifying mailboxes that might be simultaneously
changed by some other process. The safest mailbox format to use for such
tasks is Maildir; try to avoid using single-file formats such as mbox for
concurrent writing. If you're modifying a mailbox, you {must} lock it by
calling the lock and unlock methods {before} reading any
messages in the file or making any changes by adding or deleting a
message. Failing to lock the mailbox runs the risk of losing messages or
corrupting the entire mailbox.
<
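One way to follow the advice in the warning is to wrap every modification in a
lock and unlock pair. The helper ``append_with_lock`` below is a hypothetical
sketch of that pattern for an mbox file, and ``demo_locked_append`` exercises
it in a temporary directory.

```python
import mailbox
import os
import shutil
import tempfile


def append_with_lock(path, text):
    """Add one message to the mbox at path while holding its lock."""
    box = mailbox.mbox(path)
    try:
        box.lock()  # raises ExternalClashError if the lock is unavailable
        try:
            box.add(text)
            box.flush()  # write pending changes before releasing the lock
        finally:
            box.unlock()
    finally:
        box.close()


def demo_locked_append():
    """Append two messages and return how many the mailbox then holds."""
    tmpdir = tempfile.mkdtemp()
    try:
        path = os.path.join(tmpdir, "demo.mbox")
        append_with_lock(path, "From: a@example.com\nSubject: one\n\nFirst\n")
        append_with_lock(path, "From: b@example.com\nSubject: two\n\nSecond\n")
        box = mailbox.mbox(path)
        try:
            return len(box)
        finally:
            box.close()
    finally:
        shutil.rmtree(tmpdir)
```

The lock is taken before the mailbox is read or changed and released only
after the changes have been flushed.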
Mailbox instances have the following methods:
add(message)~
Add {message} to the mailbox and return the key that has been assigned to
it.
Parameter {message} may be a Message instance, an
email.Message.Message instance, a string, or a file-like object
(which should be open in text mode). If {message} is an instance of the
appropriate format-specific Message subclass (e.g., if it's an
mboxMessage instance and this is an mbox instance), its
format-specific information is used. Otherwise, reasonable defaults for
format-specific information are used.
remove(key)~
__delitem__(key)
discard(key)
Delete the message corresponding to {key} from the mailbox.
If no such message exists, a KeyError exception is raised if the
method was called as remove or __delitem__ but no
exception is raised if the method was called as discard. The
behavior of discard may be preferred if the underlying mailbox
format supports concurrent modification by other processes.
__setitem__(key, message)~
Replace the message corresponding to {key} with {message}. Raise a
KeyError exception if no message already corresponds to {key}.
As with add, parameter {message} may be a Message
instance, an email.Message.Message instance, a string, or a
file-like object (which should be open in text mode). If {message} is an
instance of the appropriate format-specific Message subclass
(e.g., if it's an mboxMessage instance and this is an
mbox instance), its format-specific information is
used. Otherwise, the format-specific information of the message that
currently corresponds to {key} is left unchanged.
iterkeys()~
keys()
Return an iterator over all keys if called as iterkeys or return a
list of keys if called as keys.
itervalues()~
__iter__()
values()
Return an iterator over representations of all messages if called as
itervalues or __iter__ or return a list of such
representations if called as values. The messages are represented
as instances of the appropriate format-specific Message subclass
unless a custom message factory was specified when the Mailbox
instance was initialized.
.. note:: >
The behavior of __iter__ is unlike that of dictionaries, which
iterate over keys.
<
iteritems()~
items()
Return an iterator over ({key}, {message}) pairs, where {key} is a key and
{message} is a message representation, if called as iteritems or
return a list of such pairs if called as items. The messages are
represented as instances of the appropriate format-specific
Message subclass unless a custom message factory was specified
when the Mailbox instance was initialized.
get(key[, default=None])~
__getitem__(key)
Return a representation of the message corresponding to {key}. If no such
message exists, {default} is returned if the method was called as
get and a KeyError exception is raised if the method was
called as __getitem__. The message is represented as an instance
of the appropriate format-specific Message subclass unless a
custom message factory was specified when the Mailbox instance
was initialized.
get_message(key)~
Return a representation of the message corresponding to {key} as an
instance of the appropriate format-specific Message subclass, or
raise a KeyError exception if no such message exists.
get_string(key)~
Return a string representation of the message corresponding to {key}, or
raise a KeyError exception if no such message exists.
get_file(key)~
Return a file-like representation of the message corresponding to {key},
or raise a KeyError exception if no such message exists. The
file-like object behaves as if open in binary mode. This file should be
closed once it is no longer needed.
.. note:: >
Unlike other representations of messages, file-like representations are
not necessarily independent of the Mailbox instance that
created them or of the underlying mailbox. More specific documentation
is provided by each subclass.
<
has_key(key)~
__contains__(key)
Return ``True`` if {key} corresponds to a message, ``False`` otherwise.
__len__()~
Return a count of messages in the mailbox.
clear()~
Delete all messages from the mailbox.
pop(key[, default])~
Return a representation of the message corresponding to {key} and delete
the message. If no such message exists, return {default} if it was
supplied or else raise a KeyError exception. The message is
represented as an instance of the appropriate format-specific
Message subclass unless a custom message factory was specified
when the Mailbox instance was initialized.
popitem()~
Return an arbitrary ({key}, {message}) pair, where {key} is a key and
{message} is a message representation, and delete the corresponding
message. If the mailbox is empty, raise a KeyError exception. The
message is represented as an instance of the appropriate format-specific
Message subclass unless a custom message factory was specified
when the Mailbox instance was initialized.
update(arg)~
Parameter {arg} should be a {key}-to-{message} mapping or an iterable of
({key}, {message}) pairs. Updates the mailbox so that, for each given
{key} and {message}, the message corresponding to {key} is set to
{message} as if by using __setitem__. As with __setitem__,
each {key} must already correspond to a message in the mailbox or else a
KeyError exception will be raised, so in general it is incorrect
for {arg} to be a Mailbox instance.
.. note:: >
Unlike with dictionaries, keyword arguments are not supported.
<
flush()~
Write any pending changes to the filesystem. For some Mailbox
subclasses, changes are always written immediately and flush does
nothing, but you should still make a habit of calling this method.
lock()~
Acquire an exclusive advisory lock on the mailbox so that other processes
know not to modify it. An ExternalClashError is raised if the lock
is not available. The particular locking mechanisms used depend upon the
mailbox format. You should {always} lock the mailbox before making any
modifications to its contents.
unlock()~
Release the lock on the mailbox, if any.
close()~
Flush the mailbox, unlock it if necessary, and close any open files. For
some Mailbox subclasses, this method does nothing.
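The difference between remove, discard and get can be seen in a short
experiment. This sketch uses a temporary mbox file and a placeholder message;
the function name and the keys of the returned dict are made up for the
example.

```python
import mailbox
import os
import shutil
import tempfile


def demo_dict_like():
    """Exercise a few dictionary-like methods and report what happened."""
    results = {}
    tmpdir = tempfile.mkdtemp()
    try:
        box = mailbox.mbox(os.path.join(tmpdir, "demo.mbox"))
        try:
            key = box.add("From: a@example.com\nSubject: one\n\nBody\n")
            results["count_after_add"] = len(box)
            results["contains_key"] = key in box

            box.discard(key)  # deletes the message
            box.discard(key)  # already gone: no exception
            try:
                box.remove(key)  # already gone: KeyError
                results["remove_raised"] = False
            except KeyError:
                results["remove_raised"] = True

            results["get_default"] = box.get(key, "missing")
            results["count_at_end"] = len(box)
        finally:
            box.close()
    finally:
        shutil.rmtree(tmpdir)
    return results
```
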
Maildir
^^^^^^^^^^^^^^^^
Maildir(dirname[, factory=rfc822.Message[, create=True]])~
A subclass of Mailbox for mailboxes in Maildir format. Parameter
{factory} is a callable object that accepts a file-like message representation
(which behaves as if opened in binary mode) and returns a custom representation.
If {factory} is ``None``, MaildirMessage is used as the default message
representation. If {create} is ``True``, the mailbox is created if it does not
exist.
It is for historical reasons that {factory} defaults to rfc822.Message
and that {dirname} is named as such rather than {path}. For a Maildir
instance that behaves like instances of other Mailbox subclasses, set
{factory} to ``None``.
Maildir is a directory-based mailbox format invented for the qmail mail
transfer agent and now widely supported by other programs. Messages in a
Maildir mailbox are stored in separate files within a common directory
structure. This design allows Maildir mailboxes to be accessed and modified
by multiple unrelated programs without data corruption, so file locking is
unnecessary.
Maildir mailboxes contain three subdirectories, namely: tmp,
new (|py2stdlib-new|), and cur. Messages are created momentarily in the
tmp subdirectory and then moved to the new (|py2stdlib-new|) subdirectory to
finalize delivery. A mail user agent may subsequently move the message to the
cur subdirectory and store information about the state of the message
in a special "info" section appended to its file name.
Folders of the style introduced by the Courier mail transfer agent are also
supported. Any subdirectory of the main mailbox is considered a folder if
``'.'`` is the first character in its name. Folder names are represented by
Maildir without the leading ``'.'``. Each folder is itself a Maildir
mailbox but should not contain other folders. Instead, a logical nesting is
indicated using ``'.'`` to delimit levels, e.g., "Archived.2005.07".
.. note:: >
The Maildir specification requires the use of a colon (``':'``) in certain
message file names. However, some operating systems do not permit this
character in file names. If you wish to use a Maildir-like format on such
an operating system, you should specify another character to use
instead. The exclamation point (``'!'``) is a popular choice. For
example::
import mailbox
mailbox.Maildir.colon = '!'
The colon attribute may also be set on a per-instance basis.
<
Maildir instances have all of the methods of Mailbox in
addition to the following:
list_folders()~
Return a list of the names of all folders.
get_folder(folder)~
Return a Maildir instance representing the folder whose name is
{folder}. A NoSuchMailboxError exception is raised if the folder
does not exist.
add_folder(folder)~
Create a folder whose name is {folder} and return a Maildir
instance representing it.
remove_folder(folder)~
Delete the folder whose name is {folder}. If the folder contains any
messages, a NotEmptyError exception will be raised and the folder
will not be deleted.
clean()~
Delete temporary files from the mailbox that have not been accessed in the
last 36 hours. The Maildir specification says that mail-reading programs
should do this occasionally.
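The folder methods can be combined as in the following sketch, which works in
a temporary directory. The folder name ``Archived``, the message text and the
function name are placeholders; {factory} is set to ``None`` so that the
instance behaves like other Mailbox subclasses.

```python
import mailbox
import os
import shutil
import tempfile


def demo_maildir_folders():
    """Create, inspect and remove a Maildir folder."""
    tmpdir = tempfile.mkdtemp()
    try:
        inbox = mailbox.Maildir(
            os.path.join(tmpdir, "Maildir"), factory=None, create=True)
        archive = inbox.add_folder("Archived")
        archive.add("From: a@example.com\nSubject: old\n\nKeep me\n")

        names = inbox.list_folders()
        count = len(inbox.get_folder("Archived"))

        # A folder that still contains messages is not deleted.
        try:
            inbox.remove_folder("Archived")
            refused = False
        except mailbox.NotEmptyError:
            refused = True

        archive.clear()
        inbox.remove_folder("Archived")
        return names, count, refused, inbox.list_folders()
    finally:
        shutil.rmtree(tmpdir)
```
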
Some Mailbox methods implemented by Maildir deserve special
remarks:
add(message)~
__setitem__(key, message)
update(arg)
.. warning:: >
These methods generate unique file names based upon the current process
ID. When using multiple threads, undetected name clashes may occur and
cause corruption of the mailbox unless threads are coordinated to avoid
using these methods to manipulate the same mailbox simultaneously.
<
flush()~
All changes to Maildir mailboxes are immediately applied, so this method
does nothing.
lock()~
unlock()
Maildir mailboxes do not support (or require) locking, so these methods do
nothing.
close()~
Maildir instances do not keep any open files and the underlying
mailboxes do not support locking, so this method does nothing.
get_file(key)~
Depending upon the host platform, it may not be possible to modify or
remove the underlying message while the returned file remains open.
.. seealso::
`maildir man page from qmail <http://www.qmail.org/man/man5/maildir.html>`_
The original specification of the format.
`Using maildir format <http://cr.yp.to/proto/maildir.html>`_
Notes on Maildir by its inventor. Includes an updated name-creation scheme and
details on "info" semantics.
`maildir man page from Courier <http://www.courier-mta.org/maildir.html>`_
Another specification of the format. Describes a common extension for supporting
folders.
mbox
^^^^^^^^^^^^^
mbox(path[, factory=None[, create=True]])~
A subclass of Mailbox for mailboxes in mbox format. Parameter {factory}
is a callable object that accepts a file-like message representation (which
behaves as if opened in binary mode) and returns a custom representation. If
{factory} is ``None``, mboxMessage is used as the default message
representation. If {create} is ``True``, the mailbox is created if it does not
exist.
The mbox format is the classic format for storing mail on Unix systems. All
messages in an mbox mailbox are stored in a single file with the beginning of
each message indicated by a line whose first five characters are "From ".
Several variations of the mbox format exist to address perceived shortcomings in
the original. In the interest of compatibility, mbox implements the
original format, which is sometimes referred to as mboxo. This means that
the Content-Length header, if present, is ignored and that any
occurrences of "From " at the beginning of a line in a message body are
transformed to ">From " when storing the message, although occurrences of ">From
" are not transformed to "From " when reading the message.
Some Mailbox methods implemented by mbox deserve special
remarks:
get_file(key)~
Using the file after calling flush or close on the
mbox instance may yield unpredictable results or raise an
exception.
lock()~
unlock()
Three locking mechanisms are used: dot locking and, if available, the
flock and lockf system calls.
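As a minimal sketch of the "From " quoting and the locking calls described for
mbox, the following adds one message while holding the lock and then inspects
the raw file. The file name and message text are assumptions made for
illustration.

```python
import mailbox
import os
import tempfile

# Hypothetical scratch file; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "demo.mbox")
box = mailbox.mbox(path)

box.lock()  # may raise mailbox.ExternalClashError if the lock is held elsewhere
try:
    key = box.add(
        "From: alice@example.com\nSubject: demo\n\nFrom here on, be careful.\n"
    )
    box.flush()  # write the pending message to disk
finally:
    box.unlock()
box.close()

# Inspect the stored bytes: the body line is quoted with a leading ">".
with open(path, "rb") as handle:
    raw = handle.read()
quoted_on_disk = b"\n>From here on" in raw

# Reading the message back does not undo the quoting.
reread = mailbox.mbox(path)
body = reread[key].get_payload()
reread.close()
still_quoted = body.startswith(">From here on")
```

Wrapping the modification in try/finally ensures the lock is released even if
adding the message fails.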
.. seealso::
`mbox man page from qmail <http://www.qmail.org/man/man5/mbox.html>`_
A specification of the format and its variations.
`mbox man page from tin <http://www.tin.org/bin/man.cgi?section=5&topic=mbox>`_
Another specification of the format, with details on locking.
`Configuring Netscape Mail on Unix: Why The Content-Length Format is Bad <http://www.jwz.org/doc/content-length.html>`_
An argument for using the original mbox format rather than a variation.
`"mbox" is a family of several mutually incompatible mailbox formats <http://homepages.tesco.net./~J.deBoynePollard/FGA/mail-mbox-formats.html>`_
A history of mbox variations.
MH
^^^^^^^^^^^
MH(path[, factory=None[, create=True]])~
A subclass of Mailbox for mailboxes in MH format. Parameter {factory}
is a callable object that accepts a file-like message representation (which
behaves as if opened in binary mode) and returns a custom representation. If
{factory} is ``None``, MHMessage is used as the default message
representation. If {create} is ``True``, the mailbox is created if it does not
exist.
MH is a directory-based mailbox format invented for the MH Message Handling
System, a mail user agent. Each message in an MH mailbox resides in its own
file. An MH mailbox may contain other MH mailboxes (called folders) in
addition to messages. Folders may be nested indefinitely. MH mailboxes also
support sequences, which are named lists used to logically group
messages without moving them to sub-folders. Sequences are defined in a file
called .mh_sequences in each folder.
The MH class manipulates MH mailboxes, but it does not attempt to
emulate all of mh's behaviors. In particular, it does not modify
and is not affected by the context or .mh_profile files that
are used by mh to store its state and configuration.
MH instances have all of the methods of Mailbox in addition
to the following:
list_folders()~
Return a list of the names of all folders.
get_folder(folder)~
Return an MH instance representing the folder whose name is
{folder}. A NoSuchMailboxError exception is raised if the folder
does not exist.
add_folder(folder)~
Create a folder whose name is {folder} and return an MH instance
representing it.
remove_folder(folder)~
Delete the folder whose name is {folder}. If the folder contains any
messages, a NotEmptyError exception will be raised and the folder
will not be deleted.
get_sequences()~
Return a dictionary of sequence names mapped to key lists. If there are no
sequences, the empty dictionary is returned.
set_sequences(sequences)~
Re-define the sequences that exist in the mailbox based upon {sequences},
a dictionary of names mapped to key lists, as returned by
get_sequences.
pack()~
Rename messages in the mailbox as necessary to eliminate gaps in
numbering. Entries in the sequences list are updated correspondingly.
.. note:: >
Already-issued keys are invalidated by this operation and should not be
subsequently used.
<
Some Mailbox methods implemented by MH deserve special
remarks:
remove(key)~
__delitem__(key)
discard(key)
These methods immediately delete the message. The MH convention of marking
a message for deletion by prepending a comma to its name is not used.
lock()~
unlock()
Three locking mechanisms are used: dot locking and, if available, the
flock and lockf system calls. For MH mailboxes, locking
the mailbox means locking the .mh_sequences file and, only for the
duration of any operations that affect them, locking individual message
files.
get_file(key)~
Depending upon the host platform, it may not be possible to remove the
underlying message while the returned file remains open.
flush()~
All changes to MH mailboxes are immediately applied, so this method does
nothing.
close()~
MH instances do not keep any open files, so this method is
equivalent to unlock.
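The sequence, removal, and packing behavior described for MH can be sketched
as follows. The mailbox location, the "flagged" sequence, and the "drafts"
folder are assumptions chosen for illustration.

```python
import mailbox
import os
import tempfile

# Hypothetical scratch location for the MH mailbox.
path = os.path.join(tempfile.mkdtemp(), "Mail")
box = mailbox.MH(path, factory=None, create=True)

# MH keys are the numeric file names, starting at 1.
keys = [box.add("Subject: message %d\n\nbody\n" % n) for n in range(3)]
box.set_sequences({"flagged": [keys[0], keys[2]]})

box.remove(keys[1])  # deleted at once, leaving a gap in the numbering
before_pack = box.get_sequences()

box.pack()  # renumbers messages; previously issued keys are now invalid
after_pack = box.get_sequences()
packed_keys = box.keys()

box.add_folder("drafts")
folder_names = box.list_folders()
box.close()
```

After pack(), the message that used to have key 3 has key 2, and the
"flagged" sequence is rewritten to match.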
.. seealso::
`nmh - Message Handling System <http://www.nongnu.org/nmh/>`_
Home page of nmh, an updated version of the original mh.
`MH & nmh: Email for Users & Programmers <http://rand-mh.sourceforge.net/book/>`_
A GPL-licensed book on mh and nmh, with some information
on the mailbox format.
Babyl
^^^^^^^^^^^^^^
Babyl(path[, factory=None[, create=True]])~
A subclass of Mailbox for mailboxes in Babyl format. Parameter
{factory} is a callable object that accepts a file-like message representation
(which behaves as if opened in binary mode) and returns a custom representation.
If {factory} is ``None``, BabylMessage is used as the default message
representation. If {create} is ``True``, the mailbox is created if it does not
exist.
Babyl is a single-file mailbox format used by the Rmail mail user agent
included with Emacs. The beginning of a message is indicated by a line
containing the two characters Control-Underscore (``'\037'``) and Control-L
(``'\014'``). The end of a message is indicated by the start of the next
message or, in the case of the last message, a line containing a
Control-Underscore (``'\037'``) character.
Messages in a Babyl mailbox have two sets of headers, original headers and
so-called visible headers. Visible headers are typically a subset of the
original headers that have been reformatted or abridged to be more
attractive. Each message in a Babyl mailbox also has an accompanying list of
labels, or short strings that record extra information about the
message, and a list of all user-defined labels found in the mailbox is kept
in the Babyl options section.
Babyl instances have all of the methods of Mailbox in
addition to the following:
get_labels()~
Return a list of the names of all user-defined labels used in the mailbox.
.. note:: >
The actual messages are inspected to determine which labels exist in
the mailbox rather than consulting the list of labels in the Babyl
options section, but the Babyl section is updated whenever the mailbox
is modified.
<
Some Mailbox methods implemented by Babyl deserve special
remarks:
get_file(key)~
In Babyl mailboxes, the headers of a message are not stored contiguously
with the body of the message. To generate a file-like representation, the
headers and body are copied together into a StringIO (|py2stdlib-stringio|) instance
(from the StringIO (|py2stdlib-stringio|) module), which has an API identical to that of a
file. As a result, the file-like object is truly independent of the
underlying mailbox but does not save memory compared to a string
representation.
lock()~
unlock()
Three locking mechanisms are used: dot locking and, if available, the
flock and lockf system calls.
.. seealso::
`Format of Version 5 Babyl Files <http://quimby.gnus.org/notes/BABYL>`_
A specification of the Babyl format.
`Reading Mail with Rmail <http://www.gnu.org/software/emacs/manual/html_node/emacs/Rmail.html>`_
The Rmail manual, with some information on Babyl semantics.
MMDF
^^^^^^^^^^^^^
MMDF(path[, factory=None[, create=True]])~
A subclass of Mailbox for mailboxes in MMDF format. Parameter {factory}
is a callable object that accepts a file-like message representation (which
behaves as if opened in binary mode) and returns a custom representation. If
{factory} is ``None``, MMDFMessage is used as the default message
representation. If {create} is ``True``, the mailbox is created if it does not
exist.
MMDF is a single-file mailbox format invented for the Multichannel Memorandum
Distribution Facility, a mail transfer agent. Each message is in the same
form as an mbox message but is bracketed before and after by lines containing
four Control-A (``'\001'``) characters. As with the mbox format, the
beginning of each message is indicated by a line whose first five characters
are "From ", but additional occurrences of "From " are not transformed to
">From " when storing messages because the extra message separator lines
prevent mistaking such occurrences for the starts of subsequent messages.
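The separator lines can be observed directly in the stored file. This sketch
only looks at the Control-A separators and the "From " line; the file name and
message text are assumptions for illustration.

```python
import mailbox
import os
import tempfile

# Hypothetical scratch file for the MMDF mailbox.
path = os.path.join(tempfile.mkdtemp(), "demo.mmdf")
box = mailbox.MMDF(path)
box.add("From: alice@example.com\nSubject: demo\n\nhello\n")
box.close()

with open(path, "rb") as handle:
    raw = handle.read()

separator = b"\x01\x01\x01\x01"  # four Control-A characters
separator_count = raw.count(separator)
first_line, second_line = raw.splitlines()[:2]
```

With a single message in the mailbox, one separator line precedes the message
and one follows it.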
Some Mailbox methods implemented by MMDF deserve special
remarks:
get_file(key)~
Using the file after calling flush or close on the
MMDF instance may yield unpredictable results or raise an
exception.
lock()~
unlock()
Three locking mechanisms are used: dot locking and, if available, the
flock and lockf system calls.
.. seealso::
`mmdf man page from tin <http://www.tin.org/bin/man.cgi?section=5&topic=mmdf>`_
A specification of MMDF format from the documentation of tin, a newsreader.
`MMDF <http://en.wikipedia.org/wiki/MMDF>`_
A Wikipedia article describing the Multichannel Memorandum Distribution
Facility.
Message objects
------------------------
Message([message])~
A subclass of the email.Message module's Message. Subclasses of
mailbox.Message add mailbox-format-specific state and behavior.
If {message} is omitted, the new instance is created in a default, empty state.
If {message} is an email.Message.Message instance, its contents are
copied; furthermore, any format-specific information is converted insofar as
possible if {message} is a Message instance. If {message} is a string
or a file, it should contain an RFC 2822-compliant message, which is read
and parsed.
The format-specific state and behaviors offered by subclasses vary, but in
general only properties that are not tied to a particular mailbox are
supported (although the properties are presumably specific to a particular
mailbox format). For example, file offsets for single-file
mailbox formats and file names for directory-based mailbox formats are not
retained, because they are only applicable to the original mailbox. But state
such as whether a message has been read by the user or marked as important is
retained, because it applies to the message itself.
There is no requirement that Message instances be used to represent
messages retrieved using Mailbox instances. In some situations, the
time and memory required to generate Message representations might
not be acceptable. For such situations, Mailbox instances also
offer string and file-like representations, and a custom message factory may
be specified when a Mailbox instance is initialized.
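To illustrate the choice of representations, the sketch below builds a Message
directly from a string and then retrieves one stored message both as a Message
subclass and as a plain string. The mbox file name and the message text are
assumptions for illustration.

```python
import mailbox
import os
import tempfile

text = "From: alice@example.com\nSubject: hello\n\nbody text\n"

# A Message can be built from a string without any mailbox at all.
standalone = mailbox.Message(text)
empty = mailbox.Message()  # default, empty state

# Hypothetical scratch mailbox used to compare representations.
path = os.path.join(tempfile.mkdtemp(), "demo.mbox")
box = mailbox.mbox(path)
key = box.add(text)

as_message = box.get_message(key)  # parsed, format-specific Message subclass
as_string = box.get_string(key)    # no Message object is built for the caller
box.close()
```

The string form avoids holding a parsed object when only the raw text is
needed.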
MaildirMessage
^^^^^^^^^^^^^^^^^^^^^^^
MaildirMessage([message])~
A message with Maildir-specific behaviors. Parameter {message} has the same
meaning as with the Message constructor.
Typically, a mail user agent application moves all of the messages in the
new (|py2stdlib-new|) subdirectory to the cur subdirectory after the first time
the user opens and closes the mailbox, recording that the messages are old
whether or not they've actually been read. Each message in cur has an
"info" section added to its file name to store information about its state.
(Some mail readers may also add an "info" section to messages in
new (|py2stdlib-new|).) The "info" section may take one of two forms: it may contain
"2," followed by a list of standardized flags (e.g., "2,FR") or it may
contain "1," followed by so-called experimental information. Standard flags
for Maildir messages are as follows:
+------+---------+--------------------------------+
| Flag | Meaning | Explanation |
+======+=========+================================+
| D | Draft | Under composition |
+------+---------+--------------------------------+
| F | Flagged | Marked as important |
+------+---------+--------------------------------+
| P | Passed | Forwarded, resent, or bounced |
+------+---------+--------------------------------+
| R | Replied | Replied to |
+------+---------+--------------------------------+
| S | Seen | Read |
+------+---------+--------------------------------+
| T | Trashed | Marked for subsequent deletion |
+------+---------+--------------------------------+
MaildirMessage instances offer the following methods:
get_subdir()~
Return either "new" (if the message should be stored in the new (|py2stdlib-new|)
subdirectory) or "cur" (if the message should be stored in the cur
subdirectory).
.. note:: >
A message is typically moved from new (|py2stdlib-new|) to cur after its
mailbox has been accessed, whether or not the message has been
read. A message ``msg`` has been read if ``"S" in msg.get_flags()`` is
``True``.
<
set_subdir(subdir)~
Set the subdirectory the message should be stored in. Parameter {subdir}
must be either "new" or "cur".
get_flags()~
Return a string specifying the flags that are currently set. If the
message complies with the standard Maildir format, the result is the
concatenation in alphabetical order of zero or one occurrence of each of
``'D'``, ``'F'``, ``'P'``, ``'R'``, ``'S'``, and ``'T'``. The empty string
is returned if no flags are set or if "info" contains experimental
semantics.
set_flags(flags)~
Set the flags specified by {flags} and unset all others.
add_flag(flag)~
Set the flag(s) specified by {flag} without changing other flags. To add
more than one flag at a time, {flag} may be a string of more than one
character. The current "info" is overwritten whether or not it contains
experimental information rather than flags.
remove_flag(flag)~
Unset the flag(s) specified by {flag} without changing other flags. To
remove more than one flag at a time, {flag} may be a string of more than
one character. If "info" contains experimental information rather than
flags, the current "info" is not modified.
get_date()~
Return the delivery date of the message as a floating-point number
representing seconds since the epoch.
set_date(date)~
Set the delivery date of the message to {date}, a floating-point number
representing seconds since the epoch.
get_info()~
Return a string containing the "info" for a message. This is useful for
accessing and modifying "info" that is experimental (i.e., not a list of
flags).
set_info(info)~
Set "info" to {info}, which should be a string.
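The flag and "info" methods can be tried on a MaildirMessage without creating
a mailbox. This is a small sketch; the message text and the experimental
"info" string are assumptions for illustration.

```python
import mailbox

msg = mailbox.MaildirMessage("Subject: status demo\n\nbody\n")
initial = (msg.get_subdir(), msg.get_flags())  # a new message starts in "new"

msg.set_subdir("cur")
msg.set_flags("S")
msg.add_flag("RF")            # several flags may be added at once
after_add = msg.get_flags()   # reported in alphabetical order

msg.remove_flag("F")
after_remove = msg.get_flags()
info = msg.get_info()         # standard flags are stored after "2,"

msg.set_date(0.0)             # seconds since the epoch
msg.set_info("1,custom-state")  # hypothetical experimental "info"
flags_with_experimental_info = msg.get_flags()
```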
When a MaildirMessage instance is created based upon an
mboxMessage or MMDFMessage instance, the Status
and X-Status headers are omitted and the following conversions
take place:
+--------------------+----------------------------------------------+
| Resulting state | mboxMessage or MMDFMessage |
| | state |
+====================+==============================================+
| "cur" subdirectory | O flag |
+--------------------+----------------------------------------------+
| F flag | F flag |
+--------------------+----------------------------------------------+
| R flag | A flag |
+--------------------+----------------------------------------------+
| S flag | R flag |
+--------------------+----------------------------------------------+
| T flag | D flag |
+--------------------+----------------------------------------------+
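As a sketch of the conversion in the table above, an mboxMessage carrying the
R, O, and A flags is passed to the MaildirMessage constructor. The message
text is an assumption for illustration.

```python
import mailbox

source = mailbox.mboxMessage("Subject: convert me\n\nbody\n")
source.set_flags("ROA")  # read, old, answered

converted = mailbox.MaildirMessage(source)

subdir = converted.get_subdir()      # O flag maps to the "cur" subdirectory
flags = converted.get_flags()        # A maps to R, and R maps to S
status_header = converted["Status"]  # Status header is dropped on conversion
```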
When a MaildirMessage instance is created based upon an
MHMessage instance, the following conversions take place:
+-------------------------------+--------------------------+
| Resulting state | MHMessage state |
+===============================+==========================+
| "cur" subdirectory | "unseen" sequence |
+-------------------------------+--------------------------+
| "cur" subdirectory and S flag | no "unseen" sequence |
+-------------------------------+--------------------------+
| F flag | "flagged" sequence |
+-------------------------------+--------------------------+
| R flag | "replied" sequence |
+-------------------------------+--------------------------+
When a MaildirMessage instance is created based upon a
BabylMessage instance, the following conversions take place:
+-------------------------------+-------------------------------+
| Resulting state | BabylMessage state |
+===============================+===============================+
| "cur" subdirectory | "unseen" label |
+-------------------------------+-------------------------------+
| "cur" subdirectory and S flag | no "unseen" label |
+-------------------------------+-------------------------------+
| P flag | "forwarded" or "resent" label |
+-------------------------------+-------------------------------+
| R flag | "answered" label |
+-------------------------------+-------------------------------+
| T flag | "deleted" label |
+-------------------------------+-------------------------------+
mboxMessage
^^^^^^^^^^^^^^^^^^^^
mboxMessage([message])~
A message with mbox-specific behaviors. Parameter {message} has the same meaning
as with the Message constructor.
Messages in an mbox mailbox are stored together in a single file. The
sender's envelope address and the time of delivery are typically stored in a
line beginning with "From " that is used to indicate the start of a message,
though there is considerable variation in the exact format of this data among
mbox implementations. Flags that indicate the state of the message, such as
whether it has been read or marked as important, are typically stored in
Status and X-Status headers.
Conventional flags for mbox messages are as follows:
+------+----------+--------------------------------+
| Flag | Meaning | Explanation |
+======+==========+================================+
| R | Read | Read |
+------+----------+--------------------------------+
| O | Old | Previously detected by MUA |
+------+----------+--------------------------------+
| D | Deleted | Marked for subsequent deletion |
+------+----------+--------------------------------+
| F | Flagged | Marked as important |
+------+----------+--------------------------------+
| A | Answered | Replied to |
+------+----------+--------------------------------+
The "R" and "O" flags are stored in the Status header, and the
"D", "F", and "A" flags are stored in the X-Status header. The
flags and headers typically appear in the order mentioned.
mboxMessage instances offer the following methods:
get_from()~
Return a string representing the "From " line that marks the start of the
message in an mbox mailbox. The leading "From " and the trailing newline
are excluded.
set_from(from_[, time_=None])~
Set the "From " line to {from_}, which should be specified without a
leading "From " or trailing newline. For convenience, {time_} may be
specified and will be formatted appropriately and appended to {from_}. If
{time_} is specified, it should be a struct_time instance, a
tuple suitable for passing to time.strftime, or ``True`` (to use
time.gmtime).
get_flags()~
Return a string specifying the flags that are currently set. If the
message complies with the conventional format, the result is the
concatenation in the following order of zero or one occurrence of each of
``'R'``, ``'O'``, ``'D'``, ``'F'``, and ``'A'``.
set_flags(flags)~
Set the flags specified by {flags} and unset all others. Parameter {flags}
should be the concatenation in any order of zero or more occurrences of
each of ``'R'``, ``'O'``, ``'D'``, ``'F'``, and ``'A'``.
add_flag(flag)~
Set the flag(s) specified by {flag} without changing other flags. To add
more than one flag at a time, {flag} may be a string of more than one
character.
remove_flag(flag)~
Unset the flag(s) specified by {flag} without changing other flags. To
remove more than one flag at a time, {flag} may be a string of more than
one character.
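A short sketch of the mboxMessage methods above. The sender address is an
assumption for illustration, and the epoch is used as the delivery time so
that the formatted result is predictable.

```python
import mailbox
import time

msg = mailbox.mboxMessage("Subject: envelope demo\n\nbody\n")

# The time is formatted and appended to the sender for the "From " line.
msg.set_from("alice@example.com", time.gmtime(0))
from_line = msg.get_from()

msg.set_flags("AR")          # flags may be given in any order
ordered = msg.get_flags()    # reported in the conventional order

msg.add_flag("F")
msg.remove_flag("R")
remaining = msg.get_flags()
x_status = msg["X-Status"]   # D, F, and A live in the X-Status header
```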
When an mboxMessage instance is created based upon a
MaildirMessage instance, a "From " line is generated based upon the
MaildirMessage instance's delivery date, and the following conversions
take place:
+-----------------+-------------------------------+
| Resulting state | MaildirMessage state |
+=================+===============================+
| R flag | S flag |
+-----------------+-------------------------------+
| O flag | "cur" subdirectory |
+-----------------+-------------------------------+
| D flag | T flag |
+-----------------+-------------------------------+
| F flag | F flag |
+-----------------+-------------------------------+
| A flag | R flag |
+-----------------+-------------------------------+
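The reverse direction can be sketched the same way: a MaildirMessage with a
known state is passed to the mboxMessage constructor. The message text and the
delivery date of zero are assumptions for illustration.

```python
import mailbox

source = mailbox.MaildirMessage("Subject: convert me\n\nbody\n")
source.set_subdir("cur")
source.set_flags("RS")  # replied and seen
source.set_date(0)      # delivery date used for the generated "From " line

converted = mailbox.mboxMessage(source)

flags = converted.get_flags()     # S maps to R, "cur" to O, and R to A
from_line = converted.get_from()  # generated from the delivery date
```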
When an mboxMessage instance is created based upon an
MHMessage instance, the following conversions take place:
+-------------------+--------------------------+
| Resulting state | MHMessage state |
+===================+==========================+
| R flag and O flag | no "unseen" sequence |
+-------------------+--------------------------+
| O flag | "unseen" sequence |
+-------------------+--------------------------+
| F flag | "flagged" sequence |
+-------------------+--------------------------+
| A flag | "replied" sequence |
+-------------------+--------------------------+
When an mboxMessage instance is created based upon a
BabylMessage instance, the following conversions take place:
+-------------------+-----------------------------+
| Resulting state | BabylMessage state |
+===================+=============================+
| R flag and O flag | no "unseen" label |
+-------------------+-----------------------------+
| O flag | "unseen" label |
+-------------------+-----------------------------+
| D flag | "deleted" label |
+-------------------+-----------------------------+
| A flag | "answered" label |
+-------------------+-----------------------------+
When an mboxMessage instance is created based upon an MMDFMessage
instance, the "From " line is copied and all flags directly correspond:
+-----------------+----------------------------+
| Resulting state | MMDFMessage state |
+=================+============================+
| R flag | R flag |
+-----------------+----------------------------+
| O flag | O flag |
+-----------------+----------------------------+
| D flag | D flag |
+-----------------+----------------------------+
| F flag | F flag |
+-----------------+----------------------------+
| A flag | A flag |
+-----------------+----------------------------+
MHMessage
^^^^^^^^^^^^^^^^^^
MHMessage([message])~
A message with MH-specific behaviors. Parameter {message} has the same meaning
as with the Message constructor.
MH messages do not support marks or flags in the traditional sense, but they
do support sequences, which are logical groupings of arbitrary messages. Some
mail reading programs (although not the standard mh and
nmh) use sequences in much the same way flags are used with other
formats, as follows:
+----------+------------------------------------------+
| Sequence | Explanation |
+==========+==========================================+
| unseen | Not read, but previously detected by MUA |
+----------+------------------------------------------+
| replied | Replied to |
+----------+------------------------------------------+
| flagged | Marked as important |
+----------+------------------------------------------+
MHMessage instances offer the following methods:
get_sequences()~
Return a list of the names of sequences that include this message.
set_sequences(sequences)~
Set the list of sequences that include this message.
add_sequence(sequence)~
Add {sequence} to the list of sequences that include this message.
remove_sequence(sequence)~
Remove {sequence} from the list of sequences that include this message.
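A minimal sketch of the sequence methods, including how the sequences of an
MHMessage are recorded when it is added to an MH mailbox. The mailbox location
and message text are assumptions for illustration.

```python
import mailbox
import os
import tempfile

msg = mailbox.MHMessage("Subject: sequence demo\n\nbody\n")
msg.add_sequence("unseen")
msg.add_sequence("flagged")
msg.remove_sequence("unseen")
on_message = msg.get_sequences()

# Hypothetical scratch location for the MH mailbox.
path = os.path.join(tempfile.mkdtemp(), "Mail")
box = mailbox.MH(path, factory=None, create=True)
key = box.add(msg)

in_mailbox = box.get_sequences()                   # sequence name -> key list
round_trip = box.get_message(key).get_sequences()  # read back from the mailbox
box.close()
```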
When an MHMessage instance is created based upon a
MaildirMessage instance, the following conversions take place:
+--------------------+-------------------------------+
| Resulting state | MaildirMessage state |
+====================+===============================+
| "unseen" sequence | no S flag |
+--------------------+-------------------------------+
| "replied" sequence | R flag |
+--------------------+-------------------------------+
| "flagged" sequence | F flag |
+--------------------+-------------------------------+
When an MHMessage instance is created based upon an
mboxMessage or MMDFMessage instance, the Status
and X-Status headers are omitted and the following conversions
take place:
+--------------------+----------------------------------------------+
| Resulting state | mboxMessage or MMDFMessage |
| | state |
+====================+==============================================+
| "unseen" sequence | no R flag |
+--------------------+----------------------------------------------+
| "replied" sequence | A flag |
+--------------------+----------------------------------------------+
| "flagged" sequence | F flag |
+--------------------+----------------------------------------------+
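The table above can be checked with a small sketch: an mboxMessage with the F
and A flags, and no R flag, is converted to an MHMessage. The message text is
an assumption for illustration.

```python
import mailbox

source = mailbox.mboxMessage("Subject: convert me\n\nbody\n")
source.set_flags("FA")  # flagged and answered, but not read

converted = mailbox.MHMessage(source)

sequences = sorted(converted.get_sequences())
status_header = converted["Status"]  # Status header is dropped on conversion
```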
When an MHMessage instance is created based upon a
BabylMessage instance, the following conversions take place:
+--------------------+-----------------------------+
| Resulting state | BabylMessage state |
+====================+=============================+
| "unseen" sequence | "unseen" label |
+--------------------+-----------------------------+
| "replied" sequence | "answered" label |
+--------------------+-----------------------------+
BabylMessage
^^^^^^^^^^^^^^^^^^^^^
BabylMessage([message])~
A message with Babyl-specific behaviors. Parameter {message} has the same
meaning as with the Message constructor.
Certain message labels, called attributes, are defined by convention
to have special meanings. The attributes are as follows:
+-----------+------------------------------------------+
| Label | Explanation |
+===========+==========================================+
| unseen | Not read, but previously detected by MUA |
+-----------+------------------------------------------+
| deleted | Marked for subsequent deletion |
+-----------+------------------------------------------+
| filed | Copied to another file or mailbox |
+-----------+------------------------------------------+
| answered | Replied to |
+-----------+------------------------------------------+
| forwarded | Forwarded |
+-----------+------------------------------------------+
| edited | Modified by the user |
+-----------+------------------------------------------+
| resent | Resent |
+-----------+------------------------------------------+
By default, Rmail displays only visible headers. The BabylMessage
class, though, uses the original headers because they are more
complete. Visible headers may be accessed explicitly if desired.
BabylMessage instances offer the following methods:
get_labels()~
Return a list of labels on the message.
set_labels(labels)~
Set the list of labels on the message to {labels}.
add_label(label)~
Add {label} to the list of labels on the message.
remove_label(label)~
Remove {label} from the list of labels on the message.
get_visible()~
Return a Message instance whose headers are the message's
visible headers and whose body is empty.
set_visible(visible)~
Set the message's visible headers to be the same as the headers in
{visible}. Parameter {visible} should be a Message instance, an
email.Message.Message instance, a string, or a file-like object
(which should be open in text mode).
update_visible()~
When a BabylMessage instance's original headers are modified, the
visible headers are not automatically modified to correspond. This method
updates the visible headers as follows: each visible header with a
corresponding original header is set to the value of the original header,
each visible header without a corresponding original header is removed,
and any of Date, From, Reply-To,
To, CC, and Subject that are
present in the original headers but not the visible headers are added to
the visible headers.
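The effect of update_visible() can be sketched on a message that starts with
no visible headers. The header values, including the "X-Trace" header, are
assumptions for illustration.

```python
import mailbox

msg = mailbox.BabylMessage(
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: hi\n"
    "X-Trace: 1\n"
    "\n"
    "body\n"
)

before = msg.get_visible().keys()  # nothing is visible yet

msg.update_visible()
visible = msg.get_visible()        # a Message with headers and an empty body
visible_names = visible.keys()
```

Only the conventional headers are copied; "X-Trace" stays in the original
headers alone.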
When a BabylMessage instance is created based upon a
MaildirMessage instance, the following conversions take place:
+-------------------+-------------------------------+
| Resulting state | MaildirMessage state |
+===================+===============================+
| "unseen" label | no S flag |
+-------------------+-------------------------------+
| "deleted" label | T flag |
+-------------------+-------------------------------+
| "answered" label | R flag |
+-------------------+-------------------------------+
| "forwarded" label | P flag |
+-------------------+-------------------------------+
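As a sketch of the table above, a MaildirMessage with the P and T flags, and no
S flag, is converted to a BabylMessage. The message text is an assumption for
illustration.

```python
import mailbox

source = mailbox.MaildirMessage("Subject: convert me\n\nbody\n")
source.set_flags("PT")  # passed and trashed, but not seen

converted = mailbox.BabylMessage(source)
labels = sorted(converted.get_labels())
```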
When a BabylMessage instance is created based upon an
mboxMessage or MMDFMessage instance, the Status
and X-Status headers are omitted and the following conversions
take place:
+------------------+----------------------------------------------+
| Resulting state | mboxMessage or MMDFMessage |
| | state |
+==================+==============================================+
| "unseen" label | no R flag |
+------------------+----------------------------------------------+
| "deleted" label | D flag |
+------------------+----------------------------------------------+
| "answered" label | A flag |
+------------------+----------------------------------------------+
When a BabylMessage instance is created based upon an
MHMessage instance, the following conversions take place:
+------------------+--------------------------+
| Resulting state | MHMessage state |
+==================+==========================+
| "unseen" label | "unseen" sequence |
+------------------+--------------------------+
| "answered" label | "replied" sequence |
+------------------+--------------------------+
MMDFMessage
^^^^^^^^^^^^^^^^^^^^
MMDFMessage([message])~
A message with MMDF-specific behaviors. Parameter {message} has the same meaning
as with the Message constructor.
As with messages in an mbox mailbox, MMDF messages are stored with the
sender's address and the delivery date in an initial line beginning with
"From ". Likewise, flags that indicate the state of the message are
typically stored in Status and X-Status headers.
Conventional flags for MMDF messages are identical to those of mbox messages
and are as follows:
+------+----------+--------------------------------+
| Flag | Meaning | Explanation |
+======+==========+================================+
| R | Read | Read |
+------+----------+--------------------------------+
| O | Old | Previously detected by MUA |
+------+----------+--------------------------------+
| D | Deleted | Marked for subsequent deletion |
+------+----------+--------------------------------+
| F | Flagged | Marked as important |
+------+----------+--------------------------------+
| A | Answered | Replied to |
+------+----------+--------------------------------+
The "R" and "O" flags are stored in the Status header, and the
"D", "F", and "A" flags are stored in the X-Status header. The
flags and headers typically appear in the order mentioned.
MMDFMessage instances offer the following methods, which are
identical to those offered by mboxMessage:
get_from()~
Return a string representing the "From " line that marks the start of the
message in an mbox mailbox. The leading "From " and the trailing newline
are excluded.
set_from(from_[, time_=None])~
Set the "From " line to {from_}, which should be specified without a
leading "From " or trailing newline. For convenience, {time_} may be
specified and will be formatted appropriately and appended to {from_}. If
{time_} is specified, it should be a struct_time instance, a
tuple suitable for passing to time.strftime, or ``True`` (to use
time.gmtime).
get_flags()~
Return a string specifying the flags that are currently set. If the
message complies with the conventional format, the result is the
concatenation in the following order of zero or one occurrence of each of
``'R'``, ``'O'``, ``'D'``, ``'F'``, and ``'A'``.
set_flags(flags)~
Set the flags specified by {flags} and unset all others. Parameter {flags}
should be the concatenation in any order of zero or more occurrences of
each of ``'R'``, ``'O'``, ``'D'``, ``'F'``, and ``'A'``.
add_flag(flag)~
Set the flag(s) specified by {flag} without changing other flags. To add
more than one flag at a time, {flag} may be a string of more than one
character.
remove_flag(flag)~
Unset the flag(s) specified by {flag} without changing other flags. To
remove more than one flag at a time, {flag} may be a string of more than
one character.
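The flag methods above can be exercised without touching a mailbox on disk.
The following is a minimal sketch; the address ``author@example.com`` is a
placeholder chosen for illustration, and the variable names are not part of
the module:

```python
import mailbox

# A fresh message with no headers; the address below is a made-up example.
msg = mailbox.MMDFMessage()
msg.set_from('author@example.com', True)   # True: append the current GMT time

# set_flags() replaces every flag; get_flags() reports Status flags first,
# then X-Status flags, in the conventional order.
msg.set_flags('AR')
flags_after_set = msg.get_flags()

# add_flag() and remove_flag() accept more than one character at a time.
msg.add_flag('FO')
flags_after_add = msg.get_flags()
msg.remove_flag('RO')
flags_after_remove = msg.get_flags()

print(flags_after_set)      # RA
print(flags_after_add)      # ROFA
print(flags_after_remove)   # FA
```

Note that the order passed to set_flags does not matter; the "R" and "O"
flags always land in the Status header and the rest in X-Status.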
When an MMDFMessage instance is created based upon a
MaildirMessage instance, a "From " line is generated based upon the
MaildirMessage instance's delivery date, and the following conversions
take place:
+-----------------+-------------------------------+
| Resulting state | MaildirMessage state |
+=================+===============================+
| R flag | S flag |
+-----------------+-------------------------------+
| O flag | "cur" subdirectory |
+-----------------+-------------------------------+
| D flag | T flag |
+-----------------+-------------------------------+
| F flag | F flag |
+-----------------+-------------------------------+
| A flag | R flag |
+-----------------+-------------------------------+
When an MMDFMessage instance is created based upon an
MHMessage instance, the following conversions take place:
+-------------------+--------------------------+
| Resulting state | MHMessage state |
+===================+==========================+
| R flag and O flag | no "unseen" sequence |
+-------------------+--------------------------+
| O flag | "unseen" sequence |
+-------------------+--------------------------+
| F flag | "flagged" sequence |
+-------------------+--------------------------+
| A flag | "replied" sequence |
+-------------------+--------------------------+
When an MMDFMessage instance is created based upon a
BabylMessage instance, the following conversions take place:
+-------------------+-----------------------------+
| Resulting state | BabylMessage state |
+===================+=============================+
| R flag and O flag | no "unseen" label |
+-------------------+-----------------------------+
| O flag | "unseen" label |
+-------------------+-----------------------------+
| D flag | "deleted" label |
+-------------------+-----------------------------+
| A flag | "answered" label |
+-------------------+-----------------------------+
When an MMDFMessage instance is created based upon an
mboxMessage instance, the "From " line is copied and all flags directly
correspond:
+-----------------+----------------------------+
| Resulting state | mboxMessage state |
+=================+============================+
| R flag | R flag |
+-----------------+----------------------------+
| O flag | O flag |
+-----------------+----------------------------+
| D flag | D flag |
+-----------------+----------------------------+
| F flag | F flag |
+-----------------+----------------------------+
| A flag | A flag |
+-----------------+----------------------------+
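The conversion tables above can be checked by passing one message type to the
constructor of another. This sketch builds a MaildirMessage in memory (no
mailbox on disk is assumed) and converts it in both directions; the variable
names are illustrative only:

```python
import mailbox

# A Maildir message that has been seen (S), replied to (R) and flagged (F),
# and that lives in the "cur" subdirectory.
maildir_msg = mailbox.MaildirMessage()
maildir_msg.set_subdir('cur')
maildir_msg.set_flags('SRF')

# Maildir -> MMDF: S becomes R, "cur" becomes O, F stays F, R becomes A.
mmdf_msg = mailbox.MMDFMessage(maildir_msg)
converted = mmdf_msg.get_flags()

# MMDF -> Maildir applies the reverse mapping.
back = mailbox.MaildirMessage(mmdf_msg)

print(converted)            # ROFA
print(back.get_subdir())    # cur
```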
Exceptions
----------
The following exception classes are defined in the mailbox (|py2stdlib-mailbox|) module:
Error()~
The base class for all other module-specific exceptions.
NoSuchMailboxError()~
Raised when a mailbox is expected but is not found, such as when instantiating a
Mailbox subclass with a path that does not exist (and with the {create}
parameter set to ``False``), or when opening a folder that does not exist.
NotEmptyError()~
Raised when a mailbox is not empty but is expected to be, such as when deleting
a folder that contains messages.
ExternalClashError()~
Raised when some mailbox-related condition beyond the control of the program
causes it to be unable to proceed, such as when failing to acquire a lock that
another program already holds, or when a uniquely-generated file name
already exists.
FormatError()~
Raised when the data in a file cannot be parsed, such as when an MH
instance attempts to read a corrupted .mh_sequences file.
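As a rough illustration of two of these exceptions, the sketch below works in
a throwaway temporary directory. The path names and the folder name ``work``
are assumptions made for the example:

```python
import mailbox
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
try:
    # Opening a path that does not exist with create=False is refused.
    missing = os.path.join(workdir, 'no-such-mailbox')
    try:
        mailbox.mbox(missing, create=False)
        outcome = 'opened'
    except mailbox.NoSuchMailboxError:
        outcome = 'missing'

    # Deleting a folder that still holds a message is refused as well.
    # Catching the base class Error covers every module-specific exception.
    maildir = mailbox.Maildir(os.path.join(workdir, 'Maildir'))
    folder = maildir.add_folder('work')
    folder.add('Subject: keep me\n\nbody\n')
    try:
        maildir.remove_folder('work')
        kind = None
    except mailbox.Error as exc:
        kind = type(exc).__name__
finally:
    shutil.rmtree(workdir)

print(outcome)   # missing
print(kind)      # NotEmptyError
```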
Deprecated classes and methods
------------------------------
2.6~
Older versions of the mailbox (|py2stdlib-mailbox|) module do not support modification of
mailboxes, such as adding or removing messages, and do not provide classes to
represent format-specific message properties. For backward compatibility, the
older mailbox classes are still available, but the newer classes should be used
in preference to them. The old classes will be removed in Python 3.0.
Older mailbox objects support only iteration and provide a single public method:
oldmailbox.next()~
Return the next message in the mailbox, created with the optional {factory}
argument passed into the mailbox object's constructor. By default this is an
rfc822.Message object (see the rfc822 (|py2stdlib-rfc822|) module). Depending on the
mailbox implementation the {fp} attribute of this object may be a true file
object or a class instance simulating a file object, taking care of things like
message boundaries if multiple mail messages are contained in a single file,
etc. If no more messages are available, this method returns ``None``.
Most of the older mailbox classes have names that differ from the current
mailbox class names, except for Maildir. For this reason, the new
Maildir class defines a next method and its constructor differs
slightly from those of the other new mailbox classes.
The older mailbox classes whose names are not the same as their newer
counterparts are as follows:
UnixMailbox(fp[, factory])~
Access to a classic Unix-style mailbox, where all messages are contained in a
single file and separated by ``From`` (a.k.a. ``From_``) lines. The file object
{fp} points to the mailbox file. The optional {factory} parameter is a callable
that should create new message objects. {factory} is called with one argument,
{fp}, by the next method of the mailbox object. The default is the
rfc822.Message class (see the rfc822 (|py2stdlib-rfc822|) module -- and the note
below).
.. note:: >
For reasons of this module's internal implementation, you will probably want to
open the {fp} object in binary mode. This is especially important on Windows.
<
For maximum portability, messages in a Unix-style mailbox are separated by any
line that begins exactly with the string ``'From '`` (note the trailing space)
if preceded by exactly two newlines. Because of the wide range of variations in
practice, nothing else on the ``From_`` line should be considered. However, the
current implementation doesn't check for the leading two newlines. This is
usually fine for most applications.
The UnixMailbox class implements a stricter version of ``From_``
line checking, using a regular expression that usually matches ``From_``
delimiters correctly. It considers messages to be separated by ``From
name time`` lines. For maximum portability, use the
PortableUnixMailbox class instead. This class is identical to
UnixMailbox except that individual messages are separated by only
``From`` lines.
PortableUnixMailbox(fp[, factory])~
A less-strict version of UnixMailbox, which considers only the ``From``
at the beginning of the line separating messages. The "{name} {time}" portion
of the From line is ignored, to protect against some variations that are
observed in practice. This works since lines in the message which begin with
``'From '`` are quoted by mail handling software at delivery-time.
MmdfMailbox(fp[, factory])~
Access an MMDF-style mailbox, where all messages are contained in a single file
and separated by lines consisting of 4 control-A characters. The file object
{fp} points to the mailbox file. Optional {factory} is as with the
UnixMailbox class.
MHMailbox(dirname[, factory])~
Access an MH mailbox, a directory with each message in a separate file with a
numeric name. The name of the mailbox directory is passed in {dirname}.
{factory} is as with the UnixMailbox class.
BabylMailbox(fp[, factory])~
Access a Babyl mailbox, which is similar to an MMDF mailbox. In Babyl format,
each message has two sets of headers, the {original} headers and the {visible}
headers. The original headers appear before a line containing only
``'*** EOOH ***'`` (End-Of-Original-Headers) and the visible headers appear after the
``EOOH`` line. Babyl-compliant mail readers will show you only the visible
headers, and BabylMailbox objects will return messages containing only
the visible headers. You'll have to do your own parsing of the mailbox file to
get at the original headers. Mail messages start with the EOOH line and end
with a line containing only ``'\037\014'``. {factory} is as with the
UnixMailbox class.
If you wish to use the older mailbox classes with the email (|py2stdlib-email|) module rather
than the deprecated rfc822 (|py2stdlib-rfc822|) module, you can do so as follows:: >
import email
import email.Errors
import mailbox
def msgfactory(fp):
    try:
        return email.message_from_file(fp)
    except email.Errors.MessageParseError:
        # Don't return None since that will
        # stop the mailbox iterator
        return ''
mbox = mailbox.UnixMailbox(fp, msgfactory)
<
Alternatively, if you know your mailbox contains only well-formed MIME messages,
you can simplify this to:: >
import email
import mailbox
mbox = mailbox.UnixMailbox(fp, email.message_from_file)
<
Examples
--------
A simple example of printing the subjects of all messages in a mailbox that seem
interesting:: >
import mailbox
for message in mailbox.mbox('~/mbox'):
    subject = message['subject']       # Could possibly be None.
    if subject and 'python' in subject.lower():
        print subject
<
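For experimenting with the subject filter without a real ``~/mbox`` file, the
same loop can be run against a temporary mailbox. This is only a sketch: the
helper name ``interesting_subjects``, the file name ``demo.mbox`` and the
sample subjects are invented for the example, and locking is omitted because
no other program can see the temporary file:

```python
import email.message
import mailbox
import os
import shutil
import tempfile

def interesting_subjects(path, keyword):
    """Return the subjects in the mbox at path that contain keyword."""
    box = mailbox.mbox(path)
    try:
        found = []
        for message in box:
            subject = message['subject']       # None if the header is absent
            if subject and keyword in subject.lower():
                found.append(subject)
        return found
    finally:
        box.close()

workdir = tempfile.mkdtemp()
try:
    path = os.path.join(workdir, 'demo.mbox')
    box = mailbox.mbox(path)                   # created because it is missing
    for subject in ('Python 2.7 released', 'Lunch on Friday?', None):
        msg = email.message.Message()
        if subject is not None:
            msg['Subject'] = subject
        msg.set_payload('body\n')
        box.add(msg)
    box.close()                                # close() flushes pending changes
    subjects = interesting_subjects(path, 'python')
finally:
    shutil.rmtree(workdir)

print(subjects)
```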
To copy all mail from a Babyl mailbox to an MH mailbox, converting all of the
format-specific information that can be converted:: >
import mailbox
destination = mailbox.MH('~/Mail')
destination.lock()
for message in mailbox.Babyl('~/RMAIL'):
    destination.add(mailbox.MHMessage(message))
destination.flush()
destination.unlock()
<
This example sorts mail from several mailing lists into different mailboxes,
being careful to avoid mail corruption due to concurrent modification by other
programs, mail loss due to interruption of the program, or premature termination
due to malformed messages in the mailbox:: >
import mailbox
import email.Errors
list_names = ('python-list', 'python-dev', 'python-bugs')
boxes = dict((name, mailbox.mbox('~/email/%s' % name)) for name in list_names)
inbox = mailbox.Maildir('~/Maildir', factory=None)
for key in inbox.iterkeys():
    try:
        message = inbox[key]
    except email.Errors.MessageParseError:
        continue                # The message is malformed. Just leave it.
    for name in list_names:
        list_id = message['list-id']
        if list_id and name in list_id:
            # Get mailbox to use
            box = boxes[name]
            # Write copy to disk before removing original.
            # If there's a crash, you might duplicate a message, but
            # that's better than losing a message completely.
            box.lock()
            box.add(message)
            box.flush()
            box.unlock()
            # Remove original message
            inbox.lock()
            inbox.discard(key)
            inbox.flush()
            inbox.unlock()
            break               # Found destination, so stop looking.
for box in boxes.itervalues():
    box.close()
==============================================================================
*py2stdlib-mailcap*
mailcap~
:synopsis: Mailcap file handling.
Mailcap files are used to configure how MIME-aware applications such as mail
readers and Web browsers react to files with different MIME types. (The name
"mailcap" is derived from the phrase "mail capability".) For example, a mailcap
file might contain a line like ``video/mpeg; xmpeg %s``. Then, if the user
encounters an email message or Web document with the MIME type
video/mpeg, ``%s`` will be replaced by a filename (usually one
belonging to a temporary file) and the xmpeg program can be
automatically started to view the file.
The mailcap format is documented in RFC 1524, "A User Agent Configuration
Mechanism For Multimedia Mail Format Information," but is not an Internet
standard. However, mailcap files are supported on most Unix systems.
findmatch(caps, MIMEtype[, key[, filename[, plist]]])~
Return a 2-tuple; the first element is a string containing the command line to
be executed (which can be passed to os.system), and the second element
is the mailcap entry for a given MIME type. If no matching MIME type can be
found, ``(None, None)`` is returned.
{key} is the name of the field desired, which represents the type of activity to
be performed; the default value is 'view', since in the most common case you
simply want to view the body of the MIME-typed data. Other possible values
might be 'compose' and 'edit', if you wanted to create a new body of the given
MIME type or alter the existing body data. See RFC 1524 for a complete list
of these fields.
{filename} is the filename to be substituted for ``%s`` in the command line; the
default value is ``'/dev/null'`` which is almost certainly not what you want, so
usually you'll override it by specifying a filename.
{plist} can be a list containing named parameters; the default value is simply
an empty list. Each entry in the list must be a string containing the parameter
name, an equals sign (``'='``), and the parameter's value. Mailcap entries can
contain named parameters like ``%{foo}``, which will be replaced by the value
of the parameter named 'foo'. For example, if the command line ``showpartial
%{id} %{number} %{total}`` was in a mailcap file, and {plist} was set to
``['id=1', 'number=2', 'total=3']``, the resulting command line would be
``'showpartial 1 2 3'``.
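The substitution rules for {filename} and {plist} can be pictured with a
small stand-alone function. This is a sketch of the behavior described above,
not the module's own code; ``expand_command`` is a hypothetical helper, and
it handles only the ``%s`` and ``%{name}`` forms:

```python
import re

def expand_command(template, filename='/dev/null', plist=()):
    """Illustrative only: fill %s and %{name} in a mailcap command line."""
    # Each plist entry has the form 'name=value'.
    params = dict(entry.split('=', 1) for entry in plist)

    def replace(match):
        if match.group(0) == '%s':
            return filename
        # Unknown parameter names expand to an empty string in this sketch.
        return params.get(match.group(1), '')

    return re.sub(r'%s|%\{([^}]*)\}', replace, template)

view_cmd = expand_command('xmpeg %s', filename='/tmp/tmp1223')
partial_cmd = expand_command('showpartial %{id} %{number} %{total}',
                             plist=['id=1', 'number=2', 'total=3'])
print(view_cmd)      # xmpeg /tmp/tmp1223
print(partial_cmd)   # showpartial 1 2 3
```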
In a mailcap file, the "test" field can optionally be specified to test some
external condition (such as the machine architecture, or the window system in
use) to determine whether or not the mailcap line applies. findmatch
will automatically check such conditions and skip the entry if the check fails.
getcaps()~
Returns a dictionary mapping MIME types to a list of mailcap file entries. This
dictionary must be passed to the findmatch function. An entry is stored
as a list of dictionaries, but it shouldn't be necessary to know the details of
this representation.
The information is derived from all of the mailcap files found on the system.
Settings in the user's mailcap file $HOME/.mailcap will override
settings in the system mailcap files /etc/mailcap,
/usr/etc/mailcap, and /usr/local/etc/mailcap.
An example usage:: >
>>> import mailcap
>>> d=mailcap.getcaps()
>>> mailcap.findmatch(d, 'video/mpeg', filename='/tmp/tmp1223')
('xmpeg /tmp/tmp1223', {'view': 'xmpeg %s'})
==============================================================================
*py2stdlib-marshal*
marshal~
:synopsis: Convert Python objects to streams of bytes and back (with different
constraints).
This module contains functions that can read and write Python values in a binary
format. The format is specific to Python, but independent of machine
architecture issues (e.g., you can write a Python value to a file on a PC,
transport the file to a Sun, and read it back there). Details of the format are
undocumented on purpose; it may change between Python versions (although it
rarely does). [#]_
This is not a general "persistence" module. For general persistence and
transfer of Python objects through RPC calls, see the modules pickle (|py2stdlib-pickle|) and
shelve (|py2stdlib-shelve|). The marshal (|py2stdlib-marshal|) module exists mainly to support reading and
writing the "pseudo-compiled" code for Python modules of .pyc files.
Therefore, the Python maintainers reserve the right to modify the marshal format
in backward incompatible ways should the need arise. If you're serializing and
de-serializing Python objects, use the pickle (|py2stdlib-pickle|) module instead -- the
performance is comparable, version independence is guaranteed, and pickle
supports a substantially wider range of objects than marshal.
.. warning::
The marshal (|py2stdlib-marshal|) module is not intended to be secure against erroneous or
maliciously constructed data. Never unmarshal data received from an
untrusted or unauthenticated source.
Not all Python object types are supported; in general, only objects whose value
is independent from a particular invocation of Python can be written and read by
this module. The following types are supported: booleans, integers, long
integers, floating point numbers, complex numbers, strings, Unicode objects,
tuples, lists, sets, frozensets, dictionaries, and code objects, where it should
be understood that tuples, lists, sets, frozensets and dictionaries are only
supported as long as the values contained therein are themselves supported; and
recursive lists, sets and dictionaries should not be written (they will cause
infinite loops). The singletons None, Ellipsis and
StopIteration can also be marshalled and unmarshalled.
.. warning::
On machines where C's ``long int`` type has more than 32 bits (such as the
DEC Alpha), it is possible to create plain Python integers that are longer
than 32 bits. If such an integer is marshaled and read back in on a machine
where C's ``long int`` type has only 32 bits, a Python long integer object
is returned instead. While of a different type, the numeric value is the
same. (This behavior is new in Python 2.2. In earlier versions, all but the
least-significant 32 bits of the value were lost, and a warning message was
printed.)
There are functions that read/write files as well as functions operating on
strings.
The module defines these functions:
dump(value, file[, version])~
Write the value on the open file. The value must be a supported type. The
file must be an open file object such as ``sys.stdout`` or returned by
open or os.popen. It must be opened in binary mode (``'wb'``
or ``'w+b'``).
If the value has (or contains an object that has) an unsupported type, a
ValueError exception is raised --- but garbage data will also be written
to the file. The object will not be properly read back by load.
.. versionadded:: 2.4
The {version} argument indicates the data format that ``dump`` should use
(see below).
load(file)~
Read one value from the open file and return it. If no valid value is read
(e.g. because the data has a different Python version's incompatible marshal
format), raise EOFError, ValueError or TypeError. The
file must be an open file object opened in binary mode (``'rb'`` or
``'r+b'``).
.. note:: >
If an object containing an unsupported type was marshalled with dump,
load will substitute ``None`` for the unmarshallable type.
<
dumps(value[, version])~
Return the string that would be written to a file by ``dump(value, file)``. The
value must be a supported type. Raise a ValueError exception if value
has (or contains an object that has) an unsupported type.
.. versionadded:: 2.4
The {version} argument indicates the data format that ``dumps`` should use
(see below).
loads(string)~
Convert the string to a value. If no valid value is found, raise
EOFError, ValueError or TypeError. Extra characters in the
string are ignored.
In addition, the following constants are defined:
version~
Indicates the format that the module uses. Version 0 is the historical format,
version 1 (added in Python 2.4) shares interned strings and version 2 (added in
Python 2.5) uses a binary format for floating point numbers. The current version
is 2.
.. versionadded:: 2.4
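A short round trip through both the string-based and the file-based functions
may make the interface clearer. This is a sketch under the assumption that the
data never leaves the current Python version; the sample value is arbitrary:

```python
import marshal
import tempfile

value = {'name': 'spam', 'sizes': [1, 2.5, (3, 4)], 'ok': True, 'none': None}

# In-memory round trip.
blob = marshal.dumps(value)
restored = marshal.loads(blob)

# File round trip; the file must be opened in binary mode.
with tempfile.TemporaryFile('w+b') as handle:
    marshal.dump(value, handle)
    handle.seek(0)
    from_file = marshal.load(handle)

# Arbitrary class instances are not supported and are rejected by dumps().
try:
    marshal.dumps(object())
    unsupported = 'accepted'
except ValueError:
    unsupported = 'rejected'

print(restored == value)    # True
print(from_file == value)   # True
print(unsupported)          # rejected
```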
.. rubric:: Footnotes
.. [#] The name of this module stems from a bit of terminology used by the designers of
Modula-3 (amongst others), who use the term "marshalling" for shipping of data
around in a self-contained form. Strictly speaking, "to marshal" means to
convert some data from internal to external form (in an RPC buffer for instance),
and "unmarshalling" refers to the reverse process.
==============================================================================
*py2stdlib-math*
math~
:synopsis: Mathematical functions (sin() etc.).
This module is always available. It provides access to the mathematical
functions defined by the C standard.
These functions cannot be used with complex numbers; use the functions of the
same name from the cmath (|py2stdlib-cmath|) module if you require support for complex
numbers. The distinction between functions which support complex numbers and
those which don't is made since most users do not want to learn quite as much
mathematics as required to understand complex numbers. Receiving an exception
instead of a complex result allows earlier detection of the unexpected complex
number used as a parameter, so that the programmer can determine how and why it
was generated in the first place.
The following functions are provided by this module. Except when explicitly
noted otherwise, all return values are floats.
Number-theoretic and representation functions
---------------------------------------------
ceil(x)~
Return the ceiling of {x} as a float, the smallest integer value greater than or
equal to {x}.
copysign(x, y)~
Return {x} with the sign of {y}. On a platform that supports
signed zeros, ``copysign(1.0, -0.0)`` returns {-1.0}.
.. versionadded:: 2.6
fabs(x)~
Return the absolute value of {x}.
factorial(x)~
Return {x} factorial. Raises ValueError if {x} is not integral or
is negative.
.. versionadded:: 2.6
floor(x)~
Return the floor of {x} as a float, the largest integer value less than or equal
to {x}.
fmod(x, y)~
Return ``fmod(x, y)``, as defined by the platform C library. Note that the
Python expression ``x % y`` may not return the same result. The intent of the C
standard is that ``fmod(x, y)`` be exactly (mathematically; to infinite
precision) equal to ``x - n*y`` for some integer {n} such that the result has
the same sign as {x} and magnitude less than ``abs(y)``. Python's ``x % y``
returns a result with the sign of {y} instead, and may not be exactly computable
for float arguments. For example, ``fmod(-1e-100, 1e100)`` is ``-1e-100``, but
the result of Python's ``-1e-100 % 1e100`` is ``1e100-1e-100``, which cannot be
represented exactly as a float, and rounds to the surprising ``1e100``. For
this reason, function fmod is generally preferred when working with
floats, while Python's ``x % y`` is preferred when working with integers.
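The difference in sign convention is easiest to see side by side. A small
sketch, with variable names chosen only for the example:

```python
import math

# fmod() keeps the sign of x; the % operator keeps the sign of y.
c_style = math.fmod(-7.0, 3.0)
py_style = -7.0 % 3.0

# The case from the text: the exact % result is not representable as a float.
tiny = math.fmod(-1e-100, 1e100)
surprising = -1e-100 % 1e100

print(c_style)      # -1.0
print(py_style)     # 2.0
print(tiny)         # -1e-100
print(surprising)   # 1e+100
```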
frexp(x)~
Return the mantissa and exponent of {x} as the pair ``(m, e)``. {m} is a float
and {e} is an integer such that ``x == m * 2**e`` exactly. If {x} is zero,
returns ``(0.0, 0)``, otherwise ``0.5 <= abs(m) < 1``. This is used to "pick
apart" the internal representation of a float in a portable way.
fsum(iterable)~
Return an accurate floating point sum of values in the iterable. Avoids
loss of precision by tracking multiple intermediate partial sums:: >
>>> sum([.1, .1, .1, .1, .1, .1, .1, .1, .1, .1])
0.9999999999999999
>>> fsum([.1, .1, .1, .1, .1, .1, .1, .1, .1, .1])
1.0
<
The algorithm's accuracy depends on IEEE-754 arithmetic guarantees and the
typical case where the rounding mode is half-even. On some non-Windows
builds, the underlying C library uses extended precision addition and may
occasionally double-round an intermediate sum causing it to be off in its
least significant bit.
For further discussion and two alternative approaches, see the `ASPN cookbook
recipes for accurate floating point summation
<http://code.activestate.com/recipes/393090/>`_\.
.. versionadded:: 2.6
isinf(x)~
Check if the float {x} is positive or negative infinity.
.. versionadded:: 2.6
isnan(x)~
Check if the float {x} is a NaN (not a number). For more information
on NaNs, see the IEEE 754 standards.
.. versionadded:: 2.6
ldexp(x, i)~
Return ``x * (2**i)``. This is essentially the inverse of function
frexp.
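A brief sketch of how frexp and ldexp undo each other; the sample values are
arbitrary:

```python
import math

m, e = math.frexp(8.0)          # 8.0 == 0.5 * 2**4
rebuilt = math.ldexp(m, e)

neg_m, neg_e = math.frexp(-0.375)   # -0.375 == -0.75 * 2**-1
zero_pair = math.frexp(0.0)

print((m, e))            # (0.5, 4)
print(rebuilt)           # 8.0
print((neg_m, neg_e))    # (-0.75, -1)
print(zero_pair)         # (0.0, 0)
```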
modf(x)~
Return the fractional and integer parts of {x}. Both results carry the sign
of {x} and are floats.
trunc(x)~
Return the Real value {x} truncated to an Integral (usually
a long integer). Uses the ``__trunc__`` method.
.. versionadded:: 2.6
Note that frexp and modf have a different call/return pattern
than their C equivalents: they take a single argument and return a pair of
values, rather than returning their second return value through an 'output
parameter' (there is no such thing in Python).
For the ceil, floor, and modf functions, note that {all}
floating-point numbers of sufficiently large magnitude are exact integers.
Python floats typically carry no more than 53 bits of precision (the same as the
platform C double type), in which case any float {x} with ``abs(x) >= 2**52``
necessarily has no fractional bits.
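The notes above can be illustrated as follows. This sketch assumes the usual
53-bit IEEE 754 double; the sample values are arbitrary:

```python
import math

# modf() returns a pair instead of using an output parameter as in C.
frac_part, int_part = math.modf(-3.25)

# trunc() rounds toward zero; floor() and ceil() round down and up.
truncated = math.trunc(-3.25)
lowered = math.floor(-3.25)
raised = math.ceil(-3.25)

# From 2**52 upward every float is already an integer.
big = 2.0 ** 52 + 1.0
big_frac = math.modf(big)[0]

print((frac_part, int_part))   # (-0.25, -3.0)
print(truncated)               # -3
print(big_frac)                # 0.0
```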
Power and logarithmic functions
-------------------------------
exp(x)~
Return ``e**x``.
expm1(x)~
Return ``e**x - 1``. For small floats {x}, the subtraction in
``exp(x) - 1`` can result in a significant loss of precision; the
expm1 function provides a way to compute this quantity to
full precision:: >
>>> from math import exp, expm1
>>> exp(1e-5) - 1 # gives result accurate to 11 places
1.0000050000069649e-05
>>> expm1(1e-5) # result accurate to full precision
1.0000050000166668e-05
<
.. versionadded:: 2.7
log(x[, base])~
With one argument, return the natural logarithm of {x} (to base {e}).
With two arguments, return the logarithm of {x} to the given {base},
calculated as ``log(x)/log(base)``.
.. versionchanged:: 2.3
{base} argument added.
log1p(x)~
Return the natural logarithm of {1+x} (base {e}). The
result is calculated in a way which is accurate for {x} near zero.
.. versionadded:: 2.6
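The benefit of log1p for small arguments can be seen by comparing it with the
naive ``log(1 + x)``. In this sketch the reference value is the first two
terms of the series ``x - x**2/2``, which is accurate enough for the chosen
{x}; the variable names are illustrative:

```python
import math

x = 1e-10
reference = x - x * x / 2.0          # series approximation of log(1 + x)

accurate = math.log1p(x)
naive = math.log(1.0 + x)            # 1.0 + x is rounded before log() sees it

accurate_error = abs(accurate - reference) / reference
naive_error = abs(naive - reference) / reference

print(accurate_error < 1e-15)   # True
print(naive_error > 1e-9)       # True
```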
log10(x)~
Return the base-10 logarithm of {x}. This is usually more accurate
than ``log(x, 10)``.
pow(x, y)~
Return ``x`` raised to the power ``y``. Exceptional cases follow
Annex 'F' of the C99 standard as far as possible. In particular,
``pow(1.0, x)`` and ``pow(x, 0.0)`` always return ``1.0``, even
when ``x`` is a zero or a NaN. If both ``x`` and ``y`` are finite,
``x`` is negative, and ``y`` is not an integer then ``pow(x, y)``
is undefined, and raises ValueError.
.. versionchanged:: 2.6
The outcome of ``1**nan`` and ``nan**0`` was undefined.
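The exceptional cases described for pow can be checked directly. A minimal
sketch; the variable names are chosen for the example:

```python
import math

nan = float('nan')

one_to_nan = math.pow(1.0, nan)      # 1.0, even though y is a NaN
nan_to_zero = math.pow(nan, 0.0)     # 1.0, even though x is a NaN

# A negative base with a non-integer exponent has no real result.
try:
    math.pow(-8.0, 1.0 / 3.0)
    negative_root = 'returned'
except ValueError:
    negative_root = 'ValueError'

print(one_to_nan)      # 1.0
print(nan_to_zero)     # 1.0
print(negative_root)   # ValueError
```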
sqrt(x)~
Return the square root of {x}.
Trigonometric functions
-----------------------
acos(x)~
Return the arc cosine of {x}, in radians.
asin(x)~
Return the arc sine of {x}, in radians.
atan(x)~
Return the arc tangent of {x}, in radians.
atan2(y, x)~
Return ``atan(y / x)``, in radians. The result is between ``-pi`` and ``pi``.
The vector in the plane from the origin to point ``(x, y)`` makes this angle
with the positive X axis. The point of atan2 is that the signs of both
inputs are known to it, so it can compute the correct quadrant for the angle.
For example, ``atan(1)`` and ``atan2(1, 1)`` are both ``pi/4``, but ``atan2(-1,
-1)`` is ``-3*pi/4``.
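The quadrant handling can be demonstrated with the point ``(-1, -1)`` from
the text. This is a small sketch; results are compared with a tolerance
because the last bit may differ between platforms:

```python
import math

# atan() only sees the quotient, so the signs of y and x are lost.
naive = math.atan(-1.0 / -1.0)

# atan2() sees both signs and picks the third quadrant.
correct = math.atan2(-1.0, -1.0)

# atan2() also copes with x == 0, where y / x would raise ZeroDivisionError.
straight_up = math.atan2(1.0, 0.0)

print(abs(naive - math.pi / 4) < 1e-12)            # True
print(abs(correct + 3 * math.pi / 4) < 1e-12)      # True
print(abs(straight_up - math.pi / 2) < 1e-12)      # True
```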
cos(x)~
Return the cosine of {x} radians.
hypot(x, y)~
Return the Euclidean norm, ``sqrt(x*x + y*y)``. This is the length of the vector
from the origin to point ``(x, y)``.
sin(x)~
Return the sine of {x} radians.
tan(x)~
Return the tangent of {x} radians.
Angular conversion
------------------
degrees(x)~
Converts angle {x} from radians to degrees.
radians(x)~
Converts angle {x} from degrees to radians.
Hyperbolic functions
--------------------
acosh(x)~
Return the inverse hyperbolic cosine of {x}.
.. versionadded:: 2.6
asinh(x)~
Return the inverse hyperbolic sine of {x}.
.. versionadded:: 2.6
atanh(x)~
Return the inverse hyperbolic tangent of {x}.
.. versionadded:: 2.6
cosh(x)~
Return the hyperbolic cosine of {x}.
sinh(x)~
Return the hyperbolic sine of {x}.
tanh(x)~
Return the hyperbolic tangent of {x}.
Special functions
-----------------
erf(x)~
Return the error function at {x}.
.. versionadded:: 2.7
erfc(x)~
Return the complementary error function at {x}.
.. versionadded:: 2.7
gamma(x)~
Return the Gamma function at {x}.
.. versionadded:: 2.7
lgamma(x)~
Return the natural logarithm of the absolute value of the Gamma
function at {x}.
.. versionadded:: 2.7
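A few identities show how the special functions relate to more familiar
quantities. The helper ``normal_cdf`` is a hypothetical name introduced for
this sketch and is not part of the module:

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution, expressed through erf()."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

half = normal_cdf(0.0)
complement_sum = math.erf(0.3) + math.erfc(0.3)    # erf(x) + erfc(x) == 1

# For a positive integer n, gamma(n) equals factorial(n - 1).
gamma_five = math.gamma(5.0)

# gamma() overflows for large arguments, but lgamma() stays finite.
try:
    math.gamma(200.0)
    large_gamma = 'returned'
except OverflowError:
    large_gamma = 'OverflowError'
large_lgamma = math.lgamma(200.0)

print(half)          # 0.5
print(large_gamma)   # OverflowError
```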
Constants
---------
pi~
The mathematical constant π = 3.141592..., to available precision.
e~
The mathematical constant e = 2.718281..., to available precision.
.. impl-detail::
The math (|py2stdlib-math|) module consists mostly of thin wrappers around the platform C
math library functions. Behavior in exceptional cases follows Annex F of
the C99 standard where appropriate. The current implementation will raise
ValueError for invalid operations like ``sqrt(-1.0)`` or ``log(0.0)``
(where C99 Annex F recommends signaling invalid operation or divide-by-zero),
and OverflowError for results that overflow (for example,
``exp(1000.0)``). A NaN will not be returned from any of the functions
above unless one or more of the input arguments was a NaN; in that case,
most functions will return a NaN, but (again following C99 Annex F) there
are some exceptions to this rule, for example ``pow(float('nan'), 0.0)`` or
``hypot(float('nan'), float('inf'))``.
Note that Python makes no effort to distinguish signaling NaNs from
quiet NaNs, and behavior for signaling NaNs remains unspecified.
Typical behavior is to treat all NaNs as though they were quiet.
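The exceptional cases listed above can be probed with a small wrapper. The
function ``classify`` is a hypothetical helper written for this sketch; it
returns either the result or the name of the exception raised:

```python
import math

def classify(func, *args):
    """Call func(*args) and report the result or the exception name."""
    try:
        return func(*args)
    except ValueError:
        return 'ValueError'
    except OverflowError:
        return 'OverflowError'

nan = float('nan')
inf = float('inf')

invalid_sqrt = classify(math.sqrt, -1.0)
invalid_log = classify(math.log, 0.0)
overflow = classify(math.exp, 1000.0)

# Two cases where a NaN input does not produce a NaN result.
pow_result = classify(math.pow, nan, 0.0)
hypot_result = classify(math.hypot, nan, inf)

print(invalid_sqrt)   # ValueError
print(invalid_log)    # ValueError
print(overflow)       # OverflowError
print(pow_result)     # 1.0
print(hypot_result)   # inf
```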
.. versionchanged:: 2.6
Behavior in special cases now aims to follow C99 Annex F. In earlier
versions of Python the behavior in special cases was loosely specified.
.. seealso::
Module cmath (|py2stdlib-cmath|)
Complex number versions of many of these functions.
==============================================================================
*py2stdlib-md5*
md5~
:synopsis: RSA's MD5 message digest algorithm.
:deprecated:
2.5~
Use the hashlib (|py2stdlib-hashlib|) module instead.
This module implements the interface to RSA's MD5 message digest algorithm (see
also Internet RFC 1321). Its use is quite straightforward: use the new function
to create an md5 object. You can now feed this object with arbitrary strings
using the update method, and at any point you can ask it for the
digest (a strong kind of 128-bit checksum, a.k.a. "fingerprint") of the
concatenation of the strings fed to it so far using the digest method.
For example, to obtain the digest of the string ``'Nobody inspects the spammish
repetition'``:
>>> import md5
>>> m = md5.new()
>>> m.update("Nobody inspects")
>>> m.update(" the spammish repetition")
>>> m.digest()
'\xbbd\x9c\x83\xdd\x1e\xa5\xc9\xd9\xde\xc9\xa1\x8d\xf0\xff\xe9'
More condensed:
>>> md5.new("Nobody inspects the spammish repetition").digest()
'\xbbd\x9c\x83\xdd\x1e\xa5\xc9\xd9\xde\xc9\xa1\x8d\xf0\xff\xe9'
The following values are provided as constants in the module and as attributes
of the md5 objects returned by new:
digest_size~
The size of the resulting digest in bytes. This is always ``16``.
The md5 module provides the following functions:
new([arg])~
Return a new md5 object. If {arg} is present, the method call ``update(arg)``
is made.
md5([arg])~
For backward compatibility reasons, this is an alternative name for the
new function.
An md5 object has the following methods:
md5.update(arg)~
Update the md5 object with the string {arg}. Repeated calls are equivalent to a
single call with the concatenation of all the arguments: ``m.update(a);
m.update(b)`` is equivalent to ``m.update(a+b)``.
md5.digest()~
Return the digest of the strings passed to the update method so far.
This is a 16-byte string which may contain non-ASCII characters, including null
bytes.
md5.hexdigest()~
Like digest except the digest is returned as a string of length 32,
containing only hexadecimal digits. This may be used to exchange the value
safely in email or other non-binary environments.
md5.copy()~
Return a copy ("clone") of the md5 object. This can be used to efficiently
compute the digests of strings that share a common initial substring.
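The copy technique can be sketched as follows. The sketch uses ``hashlib.md5``
(the recommended replacement) rather than this module, and the sample strings
are assumptions chosen for illustration.

```python
import hashlib

# Hash the shared prefix once.
prefix = hashlib.md5(b"common header: ")

# Clone the state and continue with different suffixes.
a = prefix.copy()
a.update(b"first body")
b = prefix.copy()
b.update(b"second body")

# Hashing the full string directly gives the same result as the clone.
direct_a = hashlib.md5(b"common header: first body").hexdigest()
```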
.. seealso::
Module sha (|py2stdlib-sha|)
Similar module implementing the Secure Hash Algorithm (SHA). The SHA algorithm
is considered a more secure hash.
==============================================================================
*py2stdlib-mhlib*
mhlib~
:synopsis: Manipulate MH mailboxes from Python.
:deprecated:
2.6~
The mhlib (|py2stdlib-mhlib|) module has been removed in Python 3.0. Use the
mailbox (|py2stdlib-mailbox|) module instead.
The mhlib (|py2stdlib-mhlib|) module provides a Python interface to MH folders and their
contents.
The module contains three basic classes: MH, which represents a
particular collection of folders; Folder, which represents a single
folder; and Message, which represents a single message.
MH([path[, profile]])~
MH represents a collection of MH folders.
Folder(mh, name)~
The Folder class represents a single folder and its messages.
Message(folder, number[, name])~
Message objects represent individual messages in a folder. The Message
class is derived from mimetools.Message.
MH Objects
----------
MH instances have the following methods:
MH.error(format[, ...])~
Print an error message -- can be overridden.
MH.getprofile(key)~
Return a profile entry (``None`` if not set).
MH.getpath()~
Return the mailbox pathname.
MH.getcontext()~
Return the current folder name.
MH.setcontext(name)~
Set the current folder name.
MH.listfolders()~
Return a list of top-level folders.
MH.listallfolders()~
Return a list of all folders.
MH.listsubfolders(name)~
Return a list of direct subfolders of the given folder.
MH.listallsubfolders(name)~
Return a list of all subfolders of the given folder.
MH.makefolder(name)~
Create a new folder.
MH.deletefolder(name)~
Delete a folder -- must have no subfolders.
MH.openfolder(name)~
Return a new open folder object.
Folder Objects
--------------
Folder instances represent open folders and have the following methods:
Folder.error(format[, ...])~
Print an error message -- can be overridden.
Folder.getfullname()~
Return the folder's full pathname.
Folder.getsequencesfilename()~
Return the full pathname of the folder's sequences file.
Folder.getmessagefilename(n)~
Return the full pathname of message {n} of the folder.
Folder.listmessages()~
Return a list of messages in the folder (as numbers).
Folder.getcurrent()~
Return the current message number.
Folder.setcurrent(n)~
Set the current message number to {n}.
Folder.parsesequence(seq)~
Parse the MH sequence specification {seq} (msgs syntax) into a list of
message numbers.
Folder.getlast()~
Get last message, or ``0`` if no messages are in the folder.
Folder.setlast(n)~
Set last message (internal use only).
Folder.getsequences()~
Return dictionary of sequences in folder. The sequence names are used as keys,
and the values are the lists of message numbers in the sequences.
Folder.putsequences(dict)~
Write the dictionary of sequences {dict} back to the folder. The dictionary
maps sequence names to lists of message numbers.
Folder.removemessages(list)~
Remove messages in list from folder.
Folder.refilemessages(list, tofolder)~
Move messages in list to other folder.
Folder.movemessage(n, tofolder, ton)~
Move one message to a given destination in another folder.
Folder.copymessage(n, tofolder, ton)~
Copy one message to a given destination in another folder.
Message Objects
---------------
The Message class adds one method to those of
mimetools.Message:
Message.openmessage(n)~
Return a new open message object (costs a file descriptor).
==============================================================================
*py2stdlib-mimetools*
mimetools~
:synopsis: Tools for parsing MIME-style message bodies.
:deprecated:
2.3~
The email (|py2stdlib-email|) package should be used in preference to the mimetools (|py2stdlib-mimetools|)
module. This module is present only to maintain backward compatibility, and
it has been removed in 3.x.
This module defines a subclass of the rfc822 (|py2stdlib-rfc822|) module's Message
class and a number of utility functions that are useful for the manipulation of
MIME multipart or encoded messages.
It defines the following items:
Message(fp[, seekable])~
Return a new instance of the Message class. This is a subclass of the
rfc822.Message class, with some additional methods (see below). The
{seekable} argument has the same meaning as for rfc822.Message.
choose_boundary()~
Return a unique string that has a high likelihood of being usable as a part
boundary. The string has the form ``'hostipaddr.uid.pid.timestamp.random'``.
decode(input, output, encoding)~
Read data encoded using the allowed MIME {encoding} from open file object
{input} and write the decoded data to open file object {output}. Valid values
for {encoding} include ``'base64'``, ``'quoted-printable'``, ``'uuencode'``,
``'x-uuencode'``, ``'uue'``, ``'x-uue'``, ``'7bit'``, and ``'8bit'``. Decoding
messages encoded in ``'7bit'`` or ``'8bit'`` has no effect. The input is simply
copied to the output.
encode(input, output, encoding)~
Read data from open file object {input} and write it encoded using the allowed
MIME {encoding} to open file object {output}. Valid values for {encoding} are
the same as for decode.
copyliteral(input, output)~
Read lines from open file {input} until EOF and write them to open file
{output}.
copybinary(input, output)~
Read blocks until EOF from open file {input} and write them to open file
{output}. The block size is currently fixed at 8192.
.. seealso::
Module email (|py2stdlib-email|)
Comprehensive email handling package; supersedes the mimetools (|py2stdlib-mimetools|) module.
Module rfc822 (|py2stdlib-rfc822|)
Provides the base class for mimetools.Message.
Module multifile (|py2stdlib-multifile|)
Support for reading files which contain distinct parts, such as MIME data.
http://faqs.cs.uu.nl/na-dir/mail/mime-faq/.html
The MIME Frequently Asked Questions document. For an overview of MIME, see the
answer to question 1.1 in Part 1 of this document.
Additional Methods of Message Objects
-------------------------------------
The Message class defines the following methods in addition to the
rfc822.Message methods:
Message.getplist()~
Return the parameter list of the Content-Type header. This is a
list of strings. For parameters of the form ``key=value``, {key} is converted
to lower case but {value} is not. For example, if the message contains the
header ``Content-type: text/html; spam=1; Spam=2; Spam`` then getplist
will return the Python list ``['spam=1', 'spam=2', 'Spam']``.
Message.getparam(name)~
Return the {value} of the first parameter (as returned by getplist) of
the form ``name=value`` for the given {name}. If {value} is surrounded by
quotes of the form ``<...>`` or ``"..."``, these are removed.
Message.getencoding()~
Return the encoding specified in the Content-Transfer-Encoding
message header. If no such header exists, return ``'7bit'``. The encoding is
converted to lower case.
Message.gettype()~
Return the message type (of the form ``type/subtype``) as specified in the
Content-Type header. If no such header exists, return
``'text/plain'``. The type is converted to lower case.
Message.getmaintype()~
Return the main type as specified in the Content-Type header. If
no such header exists, return ``'text'``. The main type is converted to lower
case.
Message.getsubtype()~
Return the subtype as specified in the Content-Type header. If no
such header exists, return ``'plain'``. The subtype is converted to lower case.
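Since the email package is the recommended replacement, the following sketch
shows roughly corresponding calls on an ``email`` message object. It is not an
example of mimetools itself, and the sample headers are assumptions made for
illustration.

```python
import email

raw = (
    "Content-Type: text/HTML; charset=\"utf-8\"\n"
    "Content-Transfer-Encoding: Base64\n"
    "\n"
    "body\n"
)
msg = email.message_from_string(raw)

full_type = msg.get_content_type()        # compare gettype()
main_type = msg.get_content_maintype()    # compare getmaintype()
sub_type = msg.get_content_subtype()      # compare getsubtype()
charset = msg.get_param("charset")        # compare getparam(); quotes removed
# compare getencoding(): fall back to '7bit' and lower-case the value
encoding = msg.get("Content-Transfer-Encoding", "7bit").lower()

# A message without a Content-Type header defaults to text/plain.
plain = email.message_from_string("\nno headers here\n")
default_type = plain.get_content_type()
```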
==============================================================================
*py2stdlib-mimetypes*
mimetypes~
:synopsis: Mapping of filename extensions to MIME types.
The mimetypes (|py2stdlib-mimetypes|) module converts between a filename or URL and the MIME type
associated with the filename extension. Conversions are provided from filename
to MIME type and from MIME type to filename extension; encodings are not
supported for the latter conversion.
The module provides one class and a number of convenience functions. The
functions are the normal interface to this module, but some applications may be
interested in the class as well.
The functions described below provide the primary interface for this module. If
the module has not been initialized, they will call init if they rely on
the information init sets up.
guess_type(filename[, strict])~
Guess the type of a file based on its filename or URL, given by {filename}. The
return value is a tuple ``(type, encoding)`` where {type} is ``None`` if the
type can't be guessed (missing or unknown suffix) or a string of the form
``'type/subtype'``, usable for a MIME content-type header.
{encoding} is ``None`` for no encoding or the name of the program used to encode
(e.g. compress or gzip (|py2stdlib-gzip|)). The encoding is suitable for use
as a Content-Encoding header, {not} as a
Content-Transfer-Encoding header. The mappings are table driven.
Encoding suffixes are case sensitive; type suffixes are first tried case
sensitively, then case insensitively.
Optional {strict} is a flag specifying whether the list of known MIME types
is limited to only the official types registered with IANA
(http://www.iana.org/assignments/media-types/).
When {strict} is true (the default), only the IANA types are supported; when
{strict} is false, some additional non-standard but commonly used MIME types
are also recognized.
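A minimal sketch of the ``(type, encoding)`` return value; the file names are
made up for illustration.

```python
import mimetypes

# A plain type with no encoding.
html_type, html_encoding = mimetypes.guess_type("page.html")

# The encoding suffix (.gz) is reported separately from the type (.tar).
tar_type, tar_encoding = mimetypes.guess_type("archive.tar.gz")

# No suffix at all: neither type nor encoding can be guessed.
unknown = mimetypes.guess_type("README")
```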
guess_all_extensions(type[, strict])~
Guess the extensions for a file based on its MIME type, given by {type}. The
return value is a list of strings giving all possible filename extensions,
including the leading dot (``'.'``). The extensions are not guaranteed to have
been associated with any particular data stream, but would be mapped to the MIME
type {type} by guess_type.
Optional {strict} has the same meaning as with the guess_type function.
guess_extension(type[, strict])~
Guess the extension for a file based on its MIME type, given by {type}. The
return value is a string giving a filename extension, including the leading dot
(``'.'``). The extension is not guaranteed to have been associated with any
particular data stream, but would be mapped to the MIME type {type} by
guess_type. If no extension can be guessed for {type}, ``None`` is
returned.
Optional {strict} has the same meaning as with the guess_type function.
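The reverse lookups can be sketched as below. Which single extension
guess_extension picks may vary between installations, so the sketch only
relies on it being one of the values from guess_all_extensions. The type
``application/x-no-such-type`` is a made-up name assumed to be unknown.

```python
import mimetypes

all_exts = mimetypes.guess_all_extensions("text/html")   # list, leading dots
one_ext = mimetypes.guess_extension("text/html")         # one of all_exts

# An unknown type yields None rather than raising an exception.
no_ext = mimetypes.guess_extension("application/x-no-such-type")
```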
Some additional functions and data items are available for controlling the
behavior of the module.
init([files])~
Initialize the internal data structures. If given, {files} must be a sequence
of file names which should be used to augment the default type map. If omitted,
the file names to use are taken from knownfiles; on Windows, the
current registry settings are loaded. Each file named in {files} or
knownfiles takes precedence over those named before it. Calling
init repeatedly is allowed.
.. versionchanged:: 2.7
Previously, Windows registry settings were ignored.
read_mime_types(filename)~
Load the type map given in the file {filename}, if it exists. The type map is
returned as a dictionary mapping filename extensions, including the leading dot
(``'.'``), to strings of the form ``'type/subtype'``. If the file {filename}
does not exist or cannot be read, ``None`` is returned.
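As a sketch, a small type map file can be written and loaded as follows. The
temporary file, the MIME type ``application/x-hypothetical`` and its
extensions are all assumptions made for illustration.

```python
import mimetypes
import os
import tempfile

# Write a tiny file in mime.types format: a type followed by its extensions.
fd, path = tempfile.mkstemp(suffix=".types")
os.write(fd, b"# type  extensions\napplication/x-hypothetical  hyp hypo\n")
os.close(fd)

type_map = mimetypes.read_mime_types(path)   # dict keyed by '.ext'
os.remove(path)

# The file no longer exists, so None is returned.
missing = mimetypes.read_mime_types(path)
```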
add_type(type, ext[, strict])~
Add a mapping from the mimetype {type} to the extension {ext}. When the
extension is already known, the new type will replace the old one. When the type
is already known the extension will be added to the list of known extensions.
When {strict} is True (the default), the mapping will be added to the official
MIME types, otherwise to the non-standard ones.
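A short sketch of registering a mapping and looking it up in both directions.
The type ``application/x-hypothetical-demo`` and the extension ``.hdemo`` are
invented for the example; note that the call changes the module-wide database.

```python
import mimetypes

mimetypes.add_type("application/x-hypothetical-demo", ".hdemo")

# The new mapping is used by both lookup directions.
guessed_type, guessed_encoding = mimetypes.guess_type("report.hdemo")
guessed_ext = mimetypes.guess_extension("application/x-hypothetical-demo")
```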
inited~
Flag indicating whether or not the global data structures have been initialized.
This is set to true by init.
knownfiles~
List of type map file names commonly installed. These files are typically named
mime.types and are installed in different locations by different
packages.
suffix_map~
Dictionary mapping suffixes to suffixes. This is used to allow recognition of
encoded files for which the encoding and the type are indicated by the same
extension. For example, the .tgz extension is mapped to .tar.gz
to allow the encoding and type to be recognized separately.
encodings_map~
Dictionary mapping filename extensions to encoding types.
types_map~
Dictionary mapping filename extensions to MIME types.
common_types~
Dictionary mapping filename extensions to non-standard, but commonly found MIME
types.
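The way these tables interact can be sketched with the ``.tgz`` suffix: it is
first expanded through suffix_map, after which the encoding and the type are
looked up separately. The file names are illustrative only.

```python
import mimetypes

mimetypes.init()

expanded = mimetypes.suffix_map[".tgz"]      # '.tar.gz'
encoding = mimetypes.encodings_map[".gz"]    # 'gzip'

# guess_type treats the short suffix like its expanded form.
short_guess = mimetypes.guess_type("backup.tgz")
long_guess = mimetypes.guess_type("backup.tar.gz")
```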
The MimeTypes class may be useful for applications which may want more
than one MIME-type database:
MimeTypes([filenames])~
This class represents a MIME-types database. By default, it provides access to
the same database as the rest of this module. The initial database is a copy of
that provided by the module, and may be extended by loading additional
mime.types-style files into the database using the read or
readfp methods. The mapping dictionaries may also be cleared before
loading additional data if the default data is not desired.
The optional {filenames} parameter can be used to cause additional files to be
loaded "on top" of the default database.
.. versionadded:: 2.2
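A sketch of keeping a private database next to the module-wide one. The type
``application/x-hypothetical-private`` and the extension ``hpriv`` are made
up, and the in-memory file stands in for a real mime.types file.

```python
import io
import mimetypes

# A separate database that starts as a copy of the module's tables.
db = mimetypes.MimeTypes()
db.readfp(io.StringIO(u"application/x-hypothetical-private hpriv\n"))

# Only the private database knows the new extension.
private_guess = db.guess_type("notes.hpriv")
global_guess = mimetypes.guess_type("notes.hpriv")
```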
An example usage of the module:: >
>>> import mimetypes
>>> mimetypes.init()
>>> mimetypes.knownfiles
['/etc/mime.types', '/etc/httpd/mime.types', ... ]
>>> mimetypes.suffix_map['.tgz']
'.tar.gz'
>>> mimetypes.encodings_map['.gz']
'gzip'
>>> mimetypes.types_map['.tgz']
'application/x-tar-gz'
<
MimeTypes Objects
-----------------
MimeTypes instances provide an interface which is very similar to that of the
mimetypes (|py2stdlib-mimetypes|) module.
MimeTypes.suffix_map~
Dictionary mapping suffixes to suffixes. This is used to allow recognition of
encoded files for which the encoding and the type are indicated by the same
extension. For example, the .tgz extension is mapped to .tar.gz
to allow the encoding and type to be recognized separately. This is initially a
copy of the global ``suffix_map`` defined in the module.
MimeTypes.encodings_map~
Dictionary mapping filename extensions to encoding types. This is initially a
copy of the global ``encodings_map`` defined in the module.
MimeTypes.types_map~
Dictionary mapping filename extensions to MIME types. This is initially a copy
of the global ``types_map`` defined in the module.
MimeTypes.common_types~
Dictionary mapping filename extensions to non-standard, but commonly found MIME
types. This is initially a copy of the global ``common_types`` defined in the
module.
MimeTypes.guess_extension(type[, strict])~
Similar to the guess_extension function, using the tables stored as part
of the object.
MimeTypes.guess_all_extensions(type[, strict])~
Similar to the guess_all_extensions function, using the tables stored as part
of the object.
MimeTypes.guess_type(url[, strict])~
Similar to the guess_type function, using the tables stored as part of
the object.
MimeTypes.read(path)~
Load MIME information from a file named {path}. This uses readfp to
parse the file.
MimeTypes.readfp(file)~
Load MIME type information from an open file. The file must have the format of
the standard mime.types files.
MimeTypes.read_windows_registry()~
Load MIME type information from the Windows registry. Availability: Windows.
.. versionadded:: 2.7
==============================================================================
*py2stdlib-mimewriter*
MimeWriter~
:synopsis: Write MIME format files.
:deprecated:
2.3~
The email (|py2stdlib-email|) package should be used in preference to the MimeWriter (|py2stdlib-mimewriter|)
module. This module is present only to maintain backward compatibility.
This module defines the class MimeWriter (|py2stdlib-mimewriter|). The MimeWriter (|py2stdlib-mimewriter|)
class implements a basic formatter for creating MIME multi-part files. It
doesn't seek around the output file nor does it use large amounts of buffer
space. You must write the parts out in the order that they should occur in the
final file. MimeWriter (|py2stdlib-mimewriter|) does buffer the headers you add, allowing you
to rearrange their order.
MimeWriter(fp)~
Return a new instance of the MimeWriter (|py2stdlib-mimewriter|) class. The only argument
passed, {fp}, is a file object to be used for writing. Note that a
StringIO (|py2stdlib-stringio|) object could also be used.
MimeWriter Objects
------------------
MimeWriter (|py2stdlib-mimewriter|) instances have the following methods:
MimeWriter.addheader(key, value[, prefix])~
Add a header line to the MIME message. {key} is the name of the header and
{value} is its value. The optional argument {prefix} determines where the
header is inserted: ``0`` means append at the end, ``1`` means insert at the
start. The default is to append.
MimeWriter.flushheaders()~
Causes all headers accumulated so far to be written out (and forgotten). This is
useful if you don't need a body part at all, e.g. for a subpart of type
message/rfc822 that's (mis)used to store some header-like
information.
MimeWriter.startbody(ctype[, plist[, prefix]])~
Returns a file-like object which can be used to write to the body of the
message. The content-type is set to the provided {ctype}, and the optional
parameter {plist} provides additional parameters for the content-type
declaration. {prefix} functions as in addheader except that the default
is to insert at the start.
MimeWriter.startmultipartbody(subtype[, boundary[, plist[, prefix]]])~
Returns a file-like object which can be used to write to the body of the
message. Additionally, this method initializes the multi-part code, where
{subtype} provides the multipart subtype, {boundary} may provide a user-defined
boundary specification, and {plist} provides optional parameters for the
subtype. {prefix} functions as in startbody. Subparts should be created
using nextpart.
MimeWriter.nextpart()~
Returns a new instance of MimeWriter (|py2stdlib-mimewriter|) which represents an individual
part in a multipart message. This may be used to write the part as well as
used for creating recursively complex multipart messages. The message must first
be initialized with startmultipartbody before using nextpart.
MimeWriter.lastpart()~
This is used to designate the last part of a multipart message, and should
{always} be used when writing multipart messages.
==============================================================================
*py2stdlib-mimify*
mimify~
:synopsis: Mimification and unmimification of mail messages.
:deprecated:
2.3~
The email (|py2stdlib-email|) package should be used in preference to the mimify (|py2stdlib-mimify|)
module. This module is present only to maintain backward compatibility.
The mimify (|py2stdlib-mimify|) module defines two functions to convert mail messages to and
from MIME format. The mail message can be either a simple message or a
so-called multipart message. Each part is treated separately. Mimifying (a part
of) a message entails encoding the message as quoted-printable if it contains
any characters that cannot be represented using 7-bit ASCII. Unmimifying (a
part of) a message entails undoing the quoted-printable encoding. Mimify and
unmimify are especially useful when a message has to be edited before being
sent. Typical use would be:: >
unmimify message
edit message
mimify message
send message
<
The module defines the following user-callable functions and user-settable
variables:
mimify(infile, outfile)~
Copy the message in {infile} to {outfile}, converting parts to quoted-printable
and adding MIME mail headers when necessary. {infile} and {outfile} can be file
objects (actually, any object that has a readline (|py2stdlib-readline|) method (for {infile})
or a write method (for {outfile})) or strings naming the files. If
{infile} and {outfile} are both strings, they may have the same value.
unmimify(infile, outfile[, decode_base64])~
Copy the message in {infile} to {outfile}, decoding all quoted-printable parts.
{infile} and {outfile} can be file objects (actually, any object that has a
readline (|py2stdlib-readline|) method (for {infile}) or a write method (for
{outfile})) or strings naming the files. If {infile} and {outfile} are both
strings, they may have the same value. If the {decode_base64} argument is
provided and tests true, any parts that are coded in the base64 encoding are
decoded as well.
mime_decode_header(line)~
Return a decoded version of the encoded header line in {line}. This only
supports the ISO 8859-1 charset (Latin-1).
mime_encode_header(line)~
Return a MIME-encoded version of the header line in {line}.
MAXLEN~
By default, a part will be encoded as quoted-printable when it contains any
non-ASCII characters (characters with the 8th bit set), or if there are any
lines longer than MAXLEN characters (default value 200).
CHARSET~
When not specified in the mail headers, a character set must be filled in. The
string used is stored in CHARSET, and the default value is ISO-8859-1
(also known as Latin1 (latin-one)).
This module can also be used from the command line. Usage is as follows:: >
mimify.py -e [-l length] [infile [outfile]]
mimify.py -d [-b] [infile [outfile]]
<
to encode (mimify) and decode (unmimify) respectively. {infile} defaults to
standard input, {outfile} defaults to standard output. The same file can be
specified for input and output.
If the -l option is given when encoding, any part containing lines longer than
the specified {length} will be encoded.
If the -b option is given when decoding, any base64 parts will be decoded as
well.
.. seealso::
Module quopri (|py2stdlib-quopri|)
Encode and decode MIME quoted-printable files.
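The quoted-printable transformation that mimify and unmimify apply to message
parts can be illustrated on a single byte string with quopri. This is only a
sketch of the encoding itself, not of the header handling done by mimify; the
sample text is an assumption.

```python
import quopri

# Latin-1 bytes containing one character outside 7-bit ASCII (0xE9).
latin1_text = b"caf\xe9 au lait"

encoded = quopri.encodestring(latin1_text)   # the byte becomes '=E9'
decoded = quopri.decodestring(encoded)       # round-trips to the original
```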
==============================================================================
*py2stdlib-miniaeframe*
MiniAEFrame~
:platform: Mac
:synopsis: Support to act as an Open Scripting Architecture (OSA) server ("Apple Events").
The module MiniAEFrame (|py2stdlib-miniaeframe|) provides a framework for an application that can
function as an Open Scripting Architecture (OSA) server, i.e. receive and
process AppleEvents. It can be used in conjunction with FrameWork (|py2stdlib-framework|) or
standalone. As an example, it is used in PythonCGISlave.
The MiniAEFrame (|py2stdlib-miniaeframe|) module defines the following classes:
AEServer()~
A class that handles AppleEvent dispatch. Your application should subclass this
class together with either MiniApplication or
FrameWork.Application. Your __init__ method should call the
__init__ method for both classes.
MiniApplication()~
A class that is more or less compatible with FrameWork.Application but
with less functionality. Its event loop supports the apple menu, command-dot and
AppleEvents; other events are passed on to the Python interpreter and/or Sioux.
Useful if your application wants to use AEServer but does not provide
its own windows, etc.
AEServer Objects
----------------
AEServer.installaehandler(classe, type, callback)~
Installs an AppleEvent handler. {classe} and {type} are the four-character OSA
Class and Type designators; ``'****'`` wildcards are allowed. When a matching
AppleEvent is received the parameters are decoded and your callback is invoked.
AEServer.callback(_object, **kwargs)~
Your callback is called with the OSA Direct Object as first positional
parameter. The other parameters are passed as keyword arguments, with the
4-character designator as name. Three extra keyword parameters are passed:
``_class`` and ``_type`` are the Class and Type designators and ``_attributes``
is a dictionary with the AppleEvent attributes.
The return value of your method is packed with aetools.packevent and
sent as reply.
Note that there are some serious problems with the current design. AppleEvents
which have non-identifier 4-character designators for arguments are not
implementable, and it is not possible to return an error to the originator. This
will be addressed in a future release.
==============================================================================
*py2stdlib-mmap*
mmap~
:synopsis: Interface to memory-mapped files for Unix and Windows.
Memory-mapped file objects behave like both strings and like file objects.
Unlike normal string objects, however, these are mutable. You can use mmap
objects in most places where strings are expected; for example, you can use
the re (|py2stdlib-re|) module to search through a memory-mapped file. Since they're
mutable, you can change a single character by doing ``obj[index] = 'a'``, or
change a substring by assigning to a slice: ``obj[i1:i2] = '...'``. You can
also read and write data starting at the current file position, and
seek through the file to different positions.
A memory-mapped file is created by the mmap (|py2stdlib-mmap|) constructor, which is
different on Unix and on Windows. In either case you must provide a file
descriptor for a file opened for update. If you wish to map an existing Python
file object, use its fileno method to obtain the correct value for the
{fileno} parameter. Otherwise, you can open the file using the
os.open function, which returns a file descriptor directly (the file
still needs to be closed when done).
For both the Unix and Windows versions of the constructor, {access} may be
specified as an optional keyword parameter. {access} accepts one of three
values: ACCESS_READ, ACCESS_WRITE, or ACCESS_COPY
to specify read-only, write-through or copy-on-write memory respectively.
{access} can be used on both Unix and Windows. If {access} is not specified,
Windows mmap returns a write-through mapping. The initial memory values for
all three access types are taken from the specified file. Assignment to an
ACCESS_READ memory map raises a TypeError exception.
Assignment to an ACCESS_WRITE memory map affects both memory and the
underlying file. Assignment to an ACCESS_COPY memory map affects
memory but does not update the underlying file.
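The three access modes can be compared with a short sketch. The temporary
file and the variable names are assumptions made for illustration.

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"Hello Python!\n")
os.close(fd)

with open(path, "r+b") as f:
    # ACCESS_READ: assignment raises TypeError.
    ro = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    try:
        ro[0:5] = b"HELLO"
        read_only_rejected = False
    except TypeError:
        read_only_rejected = True
    ro.close()

    # ACCESS_COPY: the change is visible in memory only.
    cow = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
    cow[0:5] = b"HELLO"
    copy_view = cow[0:5]
    cow.close()

    # ACCESS_WRITE: the change reaches the underlying file.
    wr = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE)
    wr[6:12] = b"PYTHON"
    wr.flush()
    wr.close()

with open(path, "rb") as f:
    on_disk = f.read()
os.remove(path)
```

Only the ACCESS_WRITE change should be found in ``on_disk``; the first five
bytes keep their original value even though the copy-on-write map showed
``HELLO``.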
.. versionchanged:: 2.5
To map anonymous memory, -1 should be passed as the fileno along with the
length.
.. versionchanged:: 2.6
mmap.mmap was formerly a factory function creating mmap objects. Now
mmap.mmap is the class itself.
mmap(fileno, length[, tagname[, access[, offset]]])~
(Windows version) Maps {length} bytes from the file specified by the
file handle {fileno}, and creates a mmap object. If {length} is larger
than the current size of the file, the file is extended to contain {length}
bytes. If {length} is ``0``, the maximum length of the map is the current
size of the file, except that if the file is empty Windows raises an
exception (you cannot create an empty mapping on Windows).
{tagname}, if specified and not ``None``, is a string giving a tag name for
the mapping. Windows allows you to have many different mappings against
the same file. If you specify the name of an existing tag, that tag is
opened, otherwise a new tag of this name is created. If this parameter is
omitted or ``None``, the mapping is created without a name. Avoiding the
use of the tag parameter will assist in keeping your code portable between
Unix and Windows.
{offset} may be specified as a non-negative integer offset. mmap references
will be relative to the offset from the beginning of the file. {offset}
defaults to 0. {offset} must be a multiple of the ALLOCATIONGRANULARITY.
mmap(fileno, length[, flags[, prot[, access[, offset]]]])~
(Unix version) Maps {length} bytes from the file specified by the file
descriptor {fileno}, and returns a mmap object. If {length} is ``0``, the
maximum length of the map will be the current size of the file when
mmap (|py2stdlib-mmap|) is called.
{flags} specifies the nature of the mapping. MAP_PRIVATE creates a
private copy-on-write mapping, so changes to the contents of the mmap
object will be private to this process, and MAP_SHARED creates a
mapping that's shared with all other processes mapping the same areas of
the file. The default value is MAP_SHARED.
{prot}, if specified, gives the desired memory protection; the two most
useful values are PROT_READ and PROT_WRITE, to specify
that the pages may be read or written. {prot} defaults to
PROT_READ \| PROT_WRITE.
{access} may be specified in lieu of {flags} and {prot} as an optional
keyword parameter. It is an error to specify both {flags}, {prot} and
{access}. See the description of {access} above for information on how to
use this parameter.
{offset} may be specified as a non-negative integer offset. mmap references
will be relative to the offset from the beginning of the file. {offset}
defaults to 0. {offset} must be a multiple of the PAGESIZE or
ALLOCATIONGRANULARITY.
This example shows a simple way of using mmap (|py2stdlib-mmap|):: >
import mmap
# write a simple example file
with open("hello.txt", "wb") as f:
    f.write("Hello Python!\n")
with open("hello.txt", "r+b") as f:
    # memory-map the file, size 0 means whole file
    map = mmap.mmap(f.fileno(), 0)
    # read content via standard file methods
    print map.readline() # prints "Hello Python!"
    # read content via slice notation
    print map[:5] # prints "Hello"
    # update content using slice notation;
    # note that new content must have same size
    map[6:] = " world!\n"
    # ... and read again using standard file methods
    map.seek(0)
    print map.readline() # prints "Hello world!"
    # close the map
    map.close()
<
The next example demonstrates how to create an anonymous map and exchange
data between the parent and child processes:: >
import mmap
import os
map = mmap.mmap(-1, 13)
map.write("Hello world!")
pid = os.fork()
if pid == 0: # In a child process
    map.seek(0)
    print map.readline()
    map.close()
<
Memory-mapped file objects support the following methods:
close()~
Close the file. Subsequent calls to other methods of the object will
result in an exception being raised.
find(string[, start[, end]])~
Returns the lowest index in the object where the substring {string} is
found, such that {string} is contained in the range [{start}, {end}].
Optional arguments {start} and {end} are interpreted as in slice notation.
Returns ``-1`` on failure.
flush([offset, size])~
Flushes changes made to the in-memory copy of a file back to disk. Without
use of this call there is no guarantee that changes are written back before
the object is destroyed. If {offset} and {size} are specified, only
changes to the given range of bytes will be flushed to disk; otherwise, the
whole extent of the mapping is flushed.
(Windows version) A nonzero value returned indicates success; zero
indicates failure.
(Unix version) A zero value is returned to indicate success. An
exception is raised when the call failed.
move(dest, src, count)~
Copy the {count} bytes starting at offset {src} to the destination index
{dest}. If the mmap was created with ACCESS_READ, then calls to
move will throw a TypeError exception.
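A minimal sketch of move on an anonymous map; the contents are arbitrary
sample data.

```python
import mmap

mm = mmap.mmap(-1, 10)      # anonymous map, 10 bytes
mm.write(b"abcdefghij")

# Copy the 5 bytes starting at offset 5 over the bytes starting at offset 0.
mm.move(0, 5, 5)

contents = mm[:]
mm.close()
```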
read(num)~
Return a string containing up to {num} bytes starting from the current
file position; the file position is updated to point after the bytes that
were returned.
read_byte()~
Returns a string of length 1 containing the character at the current file
position, and advances the file position by 1.
readline()~
Returns a single line, starting at the current file position and up to the
next newline.
resize(newsize)~
Resizes the map and the underlying file, if any. If the mmap was created
with ACCESS_READ or ACCESS_COPY, resizing the map will
raise a TypeError exception.
rfind(string[, start[, end]])~
Returns the highest index in the object where the substring {string} is
found, such that {string} is contained in the range [{start}, {end}].
Optional arguments {start} and {end} are interpreted as in slice notation.
Returns ``-1`` on failure.
seek(pos[, whence])~
Set the file's current position. {whence} argument is optional and
defaults to ``os.SEEK_SET`` or ``0`` (absolute file positioning); other
values are ``os.SEEK_CUR`` or ``1`` (seek relative to the current
position) and ``os.SEEK_END`` or ``2`` (seek relative to the file's end).
size()~
Return the length of the file, which can be larger than the size of the
memory-mapped area.
tell()~
Returns the current position of the file pointer.
write(string)~
Write the bytes in {string} into memory at the current position of the
file pointer; the file position is updated to point after the bytes that
were written. If the mmap was created with ACCESS_READ, then
writing to it will raise a TypeError exception.
write_byte(byte)~
Write the single-character string {byte} into memory at the current
position of the file pointer; the file position is advanced by ``1``. If
the mmap was created with ACCESS_READ, then writing to it will
raise a TypeError exception.
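As a rough illustration of how several of these methods combine, the sketch
below works on an anonymous map. The map size and the sample text are arbitrary
choices made for this example. An explicit start of 0 is passed to find and
rfind so that the search does not depend on the current file position.

```python
import mmap
import os

# Anonymous 32-byte map (size chosen arbitrarily for this sketch).
m = mmap.mmap(-1, 32)
m.write(b"spam and eggs and spam")

# find() reports the lowest matching index, rfind() the highest,
# and both report -1 when the substring is absent.
first = m.find(b"spam", 0)
last = m.rfind(b"spam", 0)
missing = m.find(b"bacon", 0)

# seek() with os.SEEK_SET positions absolutely; read() then advances
# the file position past the bytes it returned.
m.seek(9, os.SEEK_SET)
word = m.read(4)
position = m.tell()

# move(dest, src, count) copies count bytes within the map.
m.move(0, 9, 4)
m.seek(0)
head = m.read(4)

m.close()
```

Byte literals are used so that the same code runs on Python 2.7 and on later
versions.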
==============================================================================
*py2stdlib-modulefinder*
modulefinder~
:synopsis: Find modules used by a script.
.. versionadded:: 2.3
This module provides a ModuleFinder class that can be used to determine
the set of modules imported by a script. ``modulefinder.py`` can also be run as
a script, giving the filename of a Python script as its argument, after which a
report of the imported modules will be printed.
AddPackagePath(pkg_name, path)~
Record that the package named {pkg_name} can be found in the specified {path}.
ReplacePackage(oldname, newname)~
Allows specifying that the module named {oldname} is in fact the package named
{newname}. The most common usage would be to handle how the _xmlplus
package replaces the xml package.
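A minimal sketch of both functions follows. The ``_xmlplus`` mapping is the
case described above; the package name ``mypkg`` and its directory are
hypothetical and exist only for illustration. Both calls record module-wide
state, so they are normally made before a ModuleFinder is run.

```python
import modulefinder

# Treat the module named _xmlplus as the package xml (the case described above).
modulefinder.ReplacePackage("_xmlplus", "xml")

# Record an extra directory for a package; "mypkg" and the path are
# hypothetical names used only in this sketch.
modulefinder.AddPackagePath("mypkg", "/opt/extra/mypkg")
```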
ModuleFinder([path=None, debug=0, excludes=[], replace_paths=[]])~
This class provides run_script and report methods to determine
the set of modules imported by a script. {path} can be a list of directories to
search for modules; if not specified, ``sys.path`` is used. {debug} sets the
debugging level; higher values make the class print debugging messages about
what it's doing. {excludes} is a list of module names to exclude from the
analysis. {replace_paths} is a list of ``(oldpath, newpath)`` tuples that will
be replaced in module paths.
report()~
Print a report to standard output that lists the modules imported by the
script and their paths, as well as modules that are missing or seem to be
missing.
run_script(pathname)~
Analyze the contents of the {pathname} file, which must contain Python
code.
modules~
A dictionary mapping module names to modules. See the section
"Example usage of ModuleFinder" below.
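The following sketch shows the {excludes} argument together with the modules
dictionary. It writes a small throwaway script to a temporary file, so the
script contents, the excluded module and the deliberately nonexistent module
name are all assumptions made for this example.

```python
import os
import tempfile
from modulefinder import ModuleFinder

# Throwaway script to analyze; its contents are illustrative only.
source = "import json\nimport not_a_real_module_xyz\n"
fd, script_path = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "w") as handle:
    handle.write(source)

try:
    # json is left out of the analysis through the excludes list.
    finder = ModuleFinder(excludes=["json"])
    finder.run_script(script_path)
finally:
    os.remove(script_path)

found = sorted(finder.modules.keys())        # names of analyzed modules
missing = sorted(finder.badmodules.keys())   # names that could not be imported
```

The analyzed script itself is recorded under the name ``__main__``.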
Example usage of ModuleFinder
--------------------------------------
The script that is going to get analyzed later on (bacon.py):: >
import re, itertools
try:
    import baconhameggs
except ImportError:
    pass
try:
    import guido.python.ham
except ImportError:
    pass
<
The script that will output the report of bacon.py:: >
from modulefinder import ModuleFinder
finder = ModuleFinder()
finder.run_script('bacon.py')
print 'Loaded modules:'
for name, mod in finder.modules.iteritems():
    print '%s: ' % name,
    print ','.join(mod.globalnames.keys()[:3])
print '-'*50
print 'Modules not imported:'
print '\n'.join(finder.badmodules.iterkeys())
<
Sample output (may vary depending on the architecture):: >
Loaded modules:
_types:
copy_reg: _inverted_registry,_slotnames,__all__
sre_compile: isstring,_sre,_optimize_unicode
_sre:
sre_constants: REPEAT_ONE,makedict,AT_END_LINE
sys:
re: __module__,finditer,_expand
itertools:
__main__: re,itertools,baconhameggs
sre_parse: __getslice__,_PATTERNENDERS,SRE_FLAG_UNICODE
array:
types: __module__,IntType,TypeType
Modules not imported:
guido.python.ham
baconhameggs
==============================================================================
*py2stdlib-msilib*
msilib~
:platform: Windows
:synopsis: Creation of Microsoft Installer files, and CAB files.
.. index:: single: msi
.. versionadded:: 2.5
The msilib (|py2stdlib-msilib|) module supports the creation of Microsoft Installer (``.msi``) files.
Because these files often contain an embedded "cabinet" file (``.cab``), it also
exposes an API to create CAB files. Support for reading ``.cab`` files is
currently not implemented; read support for the ``.msi`` database is possible.
This package aims to provide complete access to all tables in an ``.msi`` file;
it is therefore a fairly low-level API. Two primary applications of this
package are the distutils (|py2stdlib-distutils|) command ``bdist_msi`` and the creation of
the Python installer package itself (although that currently uses a different
version of ``msilib``).
The package contents can be roughly split into four parts: low-level CAB
routines, low-level MSI routines, higher-level MSI routines, and standard table
structures.
FCICreate(cabname, files)~
Create a new CAB file named {cabname}. {files} must be a list of tuples, each
containing the name of the file on disk, and the name of the file inside the CAB
file.
The files are added to the CAB file in the order they appear in the list. All
files are added into a single CAB file, using the MSZIP compression algorithm.
Callbacks to Python for the various steps of MSI creation are currently not
exposed.
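The shape of the {files} argument can be sketched as follows. The helper names
``cab_entries`` and ``write_cab`` and the file paths are hypothetical. Because
msilib is only available on Windows, the import is placed inside the function
that needs it, and only the portable part runs at module level.

```python
import os

def cab_entries(paths):
    """Pair each on-disk path with the name it should have inside the CAB."""
    return [(path, os.path.basename(path)) for path in paths]

def write_cab(cabname, paths):
    """Create cabname from paths; works only where msilib is available."""
    import msilib  # Windows-only module, imported lazily
    msilib.FCICreate(cabname, cab_entries(paths))

# Hypothetical input files; only the tuple list is built here.
entries = cab_entries([os.path.join("build", "spam.py"),
                       os.path.join("build", "eggs.py")])
```

On Windows, ``write_cab("example.cab", paths)`` would then add the files in
list order, provided that they exist on disk.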
UuidCreate()~
Return the string representation of a new unique identifier. This wraps the
Windows API functions UuidCreate and UuidToString.
OpenDatabase(path, persist)~
Return a new database object by calling MsiOpenDatabase. {path} is the file
name of the MSI file; {persist} can be one of the constants
``MSIDBOPEN_CREATEDIRECT``, ``MSIDBOPEN_CREATE``, ``MSIDBOPEN_DIRECT``,
``MSIDBOPEN_READONLY``, or ``MSIDBOPEN_TRANSACT``, and may include the flag
``MSIDBOPEN_PATCHFILE``. See the Microsoft documentation for the meaning of
these flags; depending on the flags, an existing database is opened, or a new
one created.
CreateRecord(count)~
Return a new record object by calling MsiCreateRecord. {count} is the
number of fields of the record.
init_database(name, schema, ProductName, ProductCode, ProductVersion, Manufacturer)~
Create and return a new database {name}, initialize it with {schema}, and set
the properties {ProductName}, {ProductCode}, {ProductVersion}, and
{Manufacturer}.
{schema} must be a module object containing ``tables`` and
``_Validation_records`` attributes; typically, msilib.schema should be
used.
The database will contain just the schema and the validation records when this
function returns.
add_data(database, table, records)~
Add all {records} to the table named {table} in {database}.
The {table} argument must be one of the predefined tables in the MSI schema,
e.g. ``'Feature'``, ``'File'``, ``'Component'``, ``'Dialog'``, ``'Control'``,
etc.
{records} should be a list of tuples, each one containing all fields of a
record according to the schema of the table. For optional fields,
``None`` can be passed.
Field values can be int or long numbers, strings, or instances of the Binary
class.
Binary(filename)~
Represents entries in the Binary table; inserting such an object using
add_data reads the file named {filename} into the table.
add_tables(database, module)~
Add all table content from {module} to {database}. {module} must contain an
attribute {tables} listing all tables for which content should be added, and one
attribute per table that has the actual content.
This is typically used to install the sequence tables.
add_stream(database, name, path)~
Add the file {path} into the ``_Stream`` table of {database}, with the stream
name {name}.
gen_uuid()~
Return a new UUID, in the format that MSI typically requires (i.e. in curly
braces, and with all hexdigits in upper-case).
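The format described above can be reproduced on any platform with the standard
uuid module. This is a stand-in for illustration only, not the msilib
implementation, and the function name ``msi_style_uuid`` is hypothetical.

```python
import re
import uuid

def msi_style_uuid():
    # Stand-in built on the portable uuid module, not on msilib itself:
    # curly braces around the value, hex digits in upper case.
    return "{" + str(uuid.uuid4()).upper() + "}"

value = msi_style_uuid()

# Check the shape: {8-4-4-4-12} upper-case hex digits in curly braces.
pattern = re.compile(r"^\{[0-9A-F]{8}(-[0-9A-F]{4}){3}-[0-9A-F]{12}\}$")
looks_valid = bool(pattern.match(value))
```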
.. seealso::
`FCICreateFile <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/devnotes/winprog/fcicreate.asp>`_
`UuidCreate <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/rpc/rpc/uuidcreate.asp>`_
`UuidToString <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/rpc/rpc/uuidtostring.asp>`_
Database Objects
----------------
Database.OpenView(sql)~
Return a view object, by calling MsiDatabaseOpenView. {sql} is the SQL
statement to execute.
Database.Commit()~
Commit the changes pending in the current transaction, by calling
MsiDatabaseCommit.
Database.GetSummaryInformation(count)~
Return a new summary information object, by calling
MsiGetSummaryInformation. {count} is the maximum number of updated
values.
.. seealso::
`MsiDatabaseOpenView <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msidatabaseopenview.asp>`_
`MsiDatabaseCommit <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msidatabasecommit.asp>`_
`MsiGetSummaryInformation <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msigetsummaryinformation.asp>`_
View Objects
------------
View.Execute(params)~
Execute the SQL query of the view, through MsiViewExecute. If
{params} is not ``None``, it is a record describing actual values of the
parameter tokens in the query.
View.GetColumnInfo(kind)~
Return a record describing the columns of the view, through calling
MsiViewGetColumnInfo. {kind} can be either ``MSICOLINFO_NAMES`` or
``MSICOLINFO_TYPES``.
View.Fetch()~
Return a result record of the query, through calling MsiViewFetch.
View.Modify(kind, data)~
Modify the view, by calling MsiViewModify. {kind} can be one of
``MSIMODIFY_SEEK``, ``MSIMODIFY_REFRESH``, ``MSIMODIFY_INSERT``,
``MSIMODIFY_UPDATE``, ``MSIMODIFY_ASSIGN``, ``MSIMODIFY_REPLACE``,
``MSIMODIFY_MERGE``, ``MSIMODIFY_DELETE``, ``MSIMODIFY_INSERT_TEMPORARY``,
``MSIMODIFY_VALIDATE``, ``MSIMODIFY_VALIDATE_NEW``,
``MSIMODIFY_VALIDATE_FIELD``, or ``MSIMODIFY_VALIDATE_DELETE``.
{data} must be a record describing the new data.
View.Close()~
Close the view, through MsiViewClose.
.. seealso::
`MsiViewExecute <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msiviewexecute.asp>`_
`MsiViewGetColumnInfo <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msiviewgetcolumninfo.asp>`_
`MsiViewFetch <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msiviewfetch.asp>`_
`MsiViewModify <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msiviewmodify.asp>`_
`MsiViewClose <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msiviewclose.asp>`_
Summary Information Objects
---------------------------
SummaryInformation.GetProperty(field)~
Return a property of the summary, through MsiSummaryInfoGetProperty.
{field} is the name of the property, and can be one of the constants
``PID_CODEPAGE``, ``PID_TITLE``, ``PID_SUBJECT``, ``PID_AUTHOR``,
``PID_KEYWORDS``, ``PID_COMMENTS``, ``PID_TEMPLATE``, ``PID_LASTAUTHOR``,
``PID_REVNUMBER``, ``PID_LASTPRINTED``, ``PID_CREATE_DTM``,
``PID_LASTSAVE_DTM``, ``PID_PAGECOUNT``, ``PID_WORDCOUNT``, ``PID_CHARCOUNT``,
``PID_APPNAME``, or ``PID_SECURITY``.
SummaryInformation.GetPropertyCount()~
Return the number of summary properties, through
MsiSummaryInfoGetPropertyCount.
SummaryInformation.SetProperty(field, value)~
Set a property through MsiSummaryInfoSetProperty. {field} can have the
same values as in GetProperty, {value} is the new value of the property.
Possible value types are integer and string.
SummaryInformation.Persist()~
Write the modified properties to the summary information stream, using
MsiSummaryInfoPersist.
.. seealso::
`MsiSummaryInfoGetProperty <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msisummaryinfogetproperty.asp>`_
`MsiSummaryInfoGetPropertyCount <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msisummaryinfogetpropertycount.asp>`_
`MsiSummaryInfoSetProperty <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msisummaryinfosetproperty.asp>`_
`MsiSummaryInfoPersist <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msisummaryinfopersist.asp>`_
Record Objects
--------------
Record.GetFieldCount()~
Return the number of fields of the record, through
MsiRecordGetFieldCount.
Record.GetInteger(field)~
Return the value of {field} as an integer where possible. {field} must
be an integer.
Record.GetString(field)~
Return the value of {field} as a string where possible. {field} must
be an integer.
Record.SetString(field, value)~
Set {field} to {value} through MsiRecordSetString. {field} must be an
integer; {value} a string.
Record.SetStream(field, value)~
Set {field} to the contents of the file named {value}, through
MsiRecordSetStream. {field} must be an integer; {value} a string.
Record.SetInteger(field, value)~
Set {field} to {value} through MsiRecordSetInteger. Both {field} and
{value} must be an integer.
Record.ClearData()~
Set all fields of the record to 0, through MsiRecordClearData.
.. seealso::
`MsiRecordGetFieldCount <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msirecordgetfieldcount.asp>`_
`MsiRecordSetString <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msirecordsetstring.asp>`_
`MsiRecordSetStream <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msirecordsetstream.asp>`_
`MsiRecordSetInteger <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msirecordsetinteger.asp>`_
`MsiRecordClear <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msirecordclear.asp>`_
Errors
------
All wrappers around MSI functions raise MsiError; the string inside the
exception will contain more detail.
CAB Objects
-----------
CAB(name)~
The class CAB represents a CAB file. During MSI construction, files
will be added simultaneously to the ``Files`` table, and to a CAB file. Then,
when all files have been added, the CAB file can be written, then added to the
MSI file.
{name} is the name of the CAB file in the MSI file.
append(full, file, logical)~
Add the file with the pathname {full} to the CAB file, under the name
{logical}. If there is already a file named {logical}, a new file name is
created.
Return the index of the file in the CAB file, and the new name of the file
inside the CAB file.
commit(database)~
Generate a CAB file, add it as a stream to the MSI file, put it into the
``Media`` table, and remove the generated file from the disk.
Directory Objects
-----------------
Directory(database, cab, basedir, physical, logical, default, component, [componentflags])~
Create a new directory in the Directory table. There is a current component at
each point in time for the directory, which is either explicitly created through
start_component, or implicitly when files are added for the first time.
Files are added into the current component, and into the cab file. To create a
directory, a base directory object needs to be specified (can be ``None``), the
path to the physical directory, and a logical directory name. {default}
specifies the DefaultDir slot in the directory table. {componentflags} specifies
the default flags that new components get.
start_component([component[, feature[, flags[, keyfile[, uuid]]]]])~
Add an entry to the Component table, and make this component the current
component for this directory. If no component name is given, the directory
name is used. If no {feature} is given, the current feature is used. If no
{flags} are given, the directory's default flags are used. If no {keyfile}
is given, the KeyPath is left null in the Component table.
add_file(file[, src[, version[, language]]])~
Add a file to the current component of the directory, starting a new one
if there is no current component. By default, the file name in the source
and the file table will be identical. If the {src} file is specified, it
is interpreted relative to the current directory. Optionally, a {version}
and a {language} can be specified for the entry in the File table.
glob(pattern[, exclude])~
Add a list of files to the current component as specified in the glob
pattern. Individual files can be excluded in the {exclude} list.
remove_pyc()~
Remove ``.pyc``/``.pyo`` files on uninstall.
.. seealso::
`Directory Table <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/directory_table.asp>`_
`File Table <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/file_table.asp>`_
`Component Table <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/component_table.asp>`_
`FeatureComponents Table <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/featurecomponents_table.asp>`_
Features
--------
Feature(database, id, title, desc, display[, level=1[, parent[, directory[, attributes=0]]]])~
Add a new record to the ``Feature`` table, using the values {id}, {parent.id},
{title}, {desc}, {display}, {level}, {directory}, and {attributes}. The
resulting feature object can be passed to the start_component method of
Directory.
set_current()~
Make this feature the current feature of msilib (|py2stdlib-msilib|). New components are
automatically added to the default feature, unless a feature is explicitly
specified.
.. seealso::
`Feature Table <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/feature_table.asp>`_
GUI classes
-----------
msilib (|py2stdlib-msilib|) provides several classes that wrap the GUI tables in an MSI
database. However, no standard user interface is provided; use bdist_msi
to create MSI files with a user interface for installing Python packages.
Control(dlg, name)~
Base class of the dialog controls. {dlg} is the dialog object the control
belongs to, and {name} is the control's name.
event(event, argument[, condition=1[, ordering]])~
Make an entry into the ``ControlEvent`` table for this control.
mapping(event, attribute)~
Make an entry into the ``EventMapping`` table for this control.
condition(action, condition)~
Make an entry into the ``ControlCondition`` table for this control.
RadioButtonGroup(dlg, name, property)~
Create a radio button control named {name}. {property} is the installer property
that gets set when a radio button is selected.
add(name, x, y, width, height, text [, value])~
Add a radio button named {name} to the group, at the coordinates {x}, {y},
{width}, {height}, and with the label {text}. If {value} is omitted, it
defaults to {name}.
Dialog(db, name, x, y, w, h, attr, title, first, default, cancel)~
Return a new Dialog object. An entry in the ``Dialog`` table is made,
with the specified coordinates, dialog attributes, title, name of the first,
default, and cancel controls.
control(name, type, x, y, width, height, attributes, property, text, control_next, help)~
Return a new Control object. An entry in the ``Control`` table is
made with the specified parameters.
This is a generic method; for specific types, specialized methods are
provided.
text(name, x, y, width, height, attributes, text)~
Add and return a ``Text`` control.
bitmap(name, x, y, width, height, text)~
Add and return a ``Bitmap`` control.
line(name, x, y, width, height)~
Add and return a ``Line`` control.
pushbutton(name, x, y, width, height, attributes, text, next_control)~
Add and return a ``PushButton`` control.
radiogroup(name, x, y, width, height, attributes, property, text, next_control)~
Add and return a ``RadioButtonGroup`` control.
checkbox(name, x, y, width, height, attributes, property, text, next_control)~
Add and return a ``CheckBox`` control.
.. seealso::
`Dialog Table <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/dialog_table.asp>`_
`Control Table <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/control_table.asp>`_
`Control Types <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/controls.asp>`_
`ControlCondition Table <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/controlcondition_table.asp>`_
`ControlEvent Table <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/controlevent_table.asp>`_
`EventMapping Table <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/eventmapping_table.asp>`_
`RadioButton Table <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/radiobutton_table.asp>`_
Precomputed tables
------------------
msilib (|py2stdlib-msilib|) provides a few subpackages that contain only schema and table
definitions. Currently, these definitions are based on MSI version 2.0.
schema~
This is the standard MSI schema for MSI 2.0, with the {tables} variable
providing a list of table definitions, and {_Validation_records} providing the
data for MSI validation.
sequence~
This module contains table contents for the standard sequence tables:
{AdminExecuteSequence}, {AdminUISequence}, {AdvtExecuteSequence},
{InstallExecuteSequence}, and {InstallUISequence}.
text~
This module contains definitions for the UIText and ActionText tables, for the
standard installer actions.
==============================================================================
*py2stdlib-msvcrt*
msvcrt~
:platform: Windows
:synopsis: Miscellaneous useful routines from the MS VC++ runtime.
These functions provide access to some useful capabilities on Windows platforms.
Some higher-level modules use these functions to build the Windows
implementations of their services. For example, the getpass (|py2stdlib-getpass|) module uses
this in the implementation of the getpass (|py2stdlib-getpass|) function.
Further documentation on these functions can be found in the Platform API
documentation.
The module implements both the normal and wide char variants of the console I/O
API. The normal API deals only with ASCII characters and is of limited use
for internationalized applications. The wide char API should be used wherever
possible.
File Operations
---------------
locking(fd, mode, nbytes)~
Lock part of a file based on file descriptor {fd} from the C runtime. Raises
IOError on failure. The locked region of the file extends from the
current file position for {nbytes} bytes, and may continue beyond the end of the
file. {mode} must be one of the LK_\* constants listed below. Multiple
regions in a file may be locked at the same time, but may not overlap. Adjacent
regions are not merged; they must be unlocked individually.
LK_LOCK~
LK_RLCK
Locks the specified bytes. If the bytes cannot be locked, the program
tries again after 1 second. If, after 10 attempts, the bytes cannot
be locked, IOError is raised.
LK_NBLCK~
LK_NBRLCK
Locks the specified bytes. If the bytes cannot be locked, IOError is
raised.
LK_UNLCK~
Unlocks the specified bytes, which must have been previously locked.
setmode(fd, flags)~
Set the line-end translation mode for the file descriptor {fd}. To set it to
text mode, {flags} should be os.O_TEXT; for binary, it should be
os.O_BINARY.
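A common use is switching a standard stream to binary mode before writing raw
bytes. The sketch below guards the call so that the same script also runs on
other platforms; the helper name ``make_binary`` is hypothetical.

```python
import os
import sys

def make_binary(stream):
    """Switch stream to binary mode on Windows; do nothing elsewhere."""
    if sys.platform != "win32":
        return False
    import msvcrt  # only importable on Windows
    msvcrt.setmode(stream.fileno(), os.O_BINARY)
    return True

# True on Windows, False on platforms without line-end translation.
changed = make_binary(sys.stdout)
```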
open_osfhandle(handle, flags)~
Create a C runtime file descriptor from the file handle {handle}. The {flags}
parameter should be a bitwise OR of os.O_APPEND, os.O_RDONLY,
and os.O_TEXT. The returned file descriptor may be used as a parameter
to os.fdopen to create a file object.
get_osfhandle(fd)~
Return the file handle for the file descriptor {fd}. Raises IOError if
{fd} is not recognized.
Console I/O
-----------
kbhit()~
Return true if a keypress is waiting to be read.
getch()~
Read a keypress and return the resulting character. Nothing is echoed to the
console. This call will block if a keypress is not already available, but will
not wait for Enter to be pressed. If the pressed key was a special
function key, this will return ``'\000'`` or ``'\xe0'``; the next call will
return the keycode. The Control-C keypress cannot be read with this
function.
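The two-step protocol for special keys can be sketched as below. The helper
``read_key`` is hypothetical and takes the getch function as a parameter, which
lets the logic be exercised with a scripted stand-in on platforms where msvcrt
is not available.

```python
# Prefix values that announce a special function key (see above).
SPECIAL_PREFIXES = (b"\x00", b"\xe0")

def read_key(getch):
    """Return (is_special, code) for one keypress read through getch."""
    first = getch()
    if first in SPECIAL_PREFIXES:
        # The keycode of a special key arrives with the next call.
        return True, getch()
    return False, first

def wait_for_key():
    """Blocking read from the real console; Windows only."""
    import msvcrt
    return read_key(msvcrt.getch)

# Scripted stand-in for getch: one special key, then a plain letter.
scripted = iter([b"\xe0", b"H", b"a"])
fake_getch = lambda: next(scripted)
special_key = read_key(fake_getch)
plain_key = read_key(fake_getch)
```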
getwch()~
Wide char variant of getch, returning a Unicode value.
.. versionadded:: 2.6
getche()~
Similar to getch, but the keypress will be echoed if it represents a
printable character.
getwche()~
Wide char variant of getche, returning a Unicode value.
.. versionadded:: 2.6
putch(char)~
Print the character {char} to the console without buffering.
putwch(unicode_char)~
Wide char variant of putch, accepting a Unicode value.
.. versionadded:: 2.6
ungetch(char)~
Cause the character {char} to be "pushed back" into the console buffer; it will
be the next character read by getch or getche.
ungetwch(unicode_char)~
Wide char variant of ungetch, accepting a Unicode value.
.. versionadded:: 2.6
Other Functions
---------------
heapmin()~
Force the malloc heap to clean itself up and return unused blocks to
the operating system. On failure, this raises IOError.
==============================================================================
*py2stdlib-multifile*
multifile~
:synopsis: Support for reading files which contain distinct parts, such as some MIME data.
:deprecated:
.. deprecated:: 2.5
The email (|py2stdlib-email|) package should be used in preference to the multifile (|py2stdlib-multifile|)
module. This module is present only to maintain backward compatibility.
The MultiFile object enables you to treat sections of a text file as
file-like input objects, with ``''`` being returned by readline when a
given delimiter pattern is encountered. The defaults of this class are designed
to make it useful for parsing MIME multipart messages, but by subclassing it and
overriding methods it can be easily adapted for more general use.
MultiFile(fp[, seekable])~
Create a multi-file. You must instantiate this class with an input object
argument for the MultiFile instance to get lines from, such as a file
object returned by open.
MultiFile only ever looks at the input object's readline,
seek and tell methods, and the latter two are only needed if you
want random access to the individual MIME parts. To use MultiFile on a
non-seekable stream object, set the optional {seekable} argument to false; this
will prevent using the input object's seek and tell methods.
It will be useful to know that in MultiFile's view of the world, text
is composed of three kinds of lines: data, section-dividers, and end-markers.
MultiFile is designed to support parsing of messages that may have multiple
nested message parts, each with its own pattern for section-divider and
end-marker lines.
.. seealso::
Module email (|py2stdlib-email|)
Comprehensive email handling package; supersedes the multifile (|py2stdlib-multifile|) module.
MultiFile Objects
-----------------
A MultiFile instance has the following methods:
MultiFile.readline()~
Read a line. If the line is data (not a section-divider or end-marker or real
EOF) return it. If the line matches the most-recently-stacked boundary, return
``''`` and set ``self.last`` to 1 or 0 according to whether the match is or is not an
end-marker. If the line matches any other stacked boundary, raise an error. On
encountering end-of-file on the underlying stream object, the method raises
Error unless all boundaries have been popped.
MultiFile.readlines()~
Return all lines remaining in this part as a list of strings.
MultiFile.read()~
Read all lines, up to the next section. Return them as a single (multiline)
string. Note that this doesn't take a size argument!
MultiFile.seek(pos[, whence])~
Seek. Seek indices are relative to the start of the current section. The {pos}
and {whence} arguments are interpreted as for a file seek.
MultiFile.tell()~
Return the file position relative to the start of the current section.
MultiFile.next()~
Skip lines to the next section (that is, read lines until a section-divider or
end-marker has been consumed). Return true if there is such a section, false if
an end-marker is seen. Re-enable the most-recently-pushed boundary.
MultiFile.is_data(str)~
Return true if {str} is data and false if it might be a section boundary. As
written, it tests for a prefix other than ``'--'`` at start of line (which
all MIME boundaries have) but it is declared so it can be overridden in derived
classes.
Note that this test is intended as a fast guard for the real boundary
tests; if it always returns false it will merely slow processing, not cause it
to fail.
MultiFile.push(str)~
Push a boundary string. When a decorated version of this boundary is found as
an input line, it will be interpreted as a section-divider or end-marker
(depending on the decoration, see RFC 2045). All subsequent reads will
return the empty string to indicate end-of-file, until a call to pop
removes the boundary or a next call re-enables it.
It is possible to push more than one boundary. Encountering the
most-recently-pushed boundary will return EOF; encountering any other
boundary will raise an error.
MultiFile.pop()~
Pop a section boundary. This boundary will no longer be interpreted as EOF.
MultiFile.section_divider(str)~
Turn a boundary into a section-divider line. By default, this method
prepends ``'--'`` (which MIME section boundaries have) but it is declared so
it can be overridden in derived classes. This method need not append LF or
CR-LF, as comparison with the result ignores trailing whitespace.
MultiFile.end_marker(str)~
Turn a boundary string into an end-marker line. By default, this method
prepends ``'--'`` and appends ``'--'`` (like a MIME-multipart end-of-message
marker) but it is declared so it can be overridden in derived classes. This
method need not append LF or CR-LF, as comparison with the result ignores
trailing whitespace.
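The default decorations described for section_divider and end_marker can be
mirrored in a few standalone functions. This is a re-implementation sketch for
illustration, not the module's own code; the ``classify`` helper and the
boundary ``XYZ`` are hypothetical.

```python
def section_divider(boundary):
    # Default decoration described above: '--' in front of the boundary.
    return "--" + boundary

def end_marker(boundary):
    # Default end marker: '--' before and after the boundary.
    return "--" + boundary + "--"

def classify(line, boundary):
    """Label a line as 'section-divider', 'end-marker' or 'data'."""
    stripped = line.rstrip()  # trailing whitespace is ignored when comparing
    if stripped == end_marker(boundary):
        return "end-marker"
    if stripped == section_divider(boundary):
        return "section-divider"
    return "data"

kinds = [classify(line, "XYZ")
         for line in ("--XYZ\r\n", "--XYZ--\n", "hello\n")]
```

A subclass that needs a different decoration would override the two methods on
MultiFile in the same spirit.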
Finally, MultiFile instances have two public instance variables:
MultiFile.level~
Nesting depth of the current part.
MultiFile.last~
True if the last end-of-file was for an end-of-message marker.
MultiFile Example
--------------------------
:: >
import mimetools
import multifile
import StringIO
def extract_mime_part_matching(stream, mimetype):
    """Return the first element in a multipart MIME message on stream
    matching mimetype."""
    msg = mimetools.Message(stream)
    msgtype = msg.gettype()
    params = msg.getplist()
    data = StringIO.StringIO()
    if msgtype[:10] == "multipart/":
        file = multifile.MultiFile(stream)
        file.push(msg.getparam("boundary"))
        while file.next():
            submsg = mimetools.Message(file)
            try:
                data = StringIO.StringIO()
                mimetools.decode(file, data, submsg.getencoding())
            except ValueError:
                continue
            if submsg.gettype() == mimetype:
                break
        file.pop()
    return data.getvalue()
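For comparison, a similar extraction can be sketched with the email package,
which the deprecation note above recommends. The function name
``extract_part`` and the sample message are made up for this example, and the
sketch takes the message as a string rather than a stream.

```python
import email

def extract_part(text, mimetype):
    """Return the decoded payload of the first part matching mimetype."""
    msg = email.message_from_string(text)
    for part in msg.walk():
        if part.get_content_type() == mimetype:
            return part.get_payload(decode=True)
    return None

# Small hand-written multipart message used only for illustration.
raw = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "--XYZ\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>hi</p>\n"
    "--XYZ--\n"
)

html = extract_part(raw, "text/html")
absent = extract_part(raw, "image/png")
```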
==============================================================================
*py2stdlib-multiprocessing*
multiprocessing~
:synopsis: Process-based "threading" interface.
.. versionadded:: 2.6
Introduction
------------
multiprocessing (|py2stdlib-multiprocessing|) is a package that supports spawning processes using an
API similar to the threading (|py2stdlib-threading|) module. The multiprocessing (|py2stdlib-multiprocessing|) package
offers both local and remote concurrency, effectively side-stepping the
Global Interpreter Lock by using subprocesses instead of threads. Due
to this, the multiprocessing (|py2stdlib-multiprocessing|) module allows the programmer to fully
leverage multiple processors on a given machine. It runs on both Unix and
Windows.
.. warning::
Some of this package's functionality requires a functioning shared semaphore
implementation on the host operating system. Without one, the
multiprocessing.synchronize module will be disabled, and attempts to
import it will result in an ImportError. See
issue 3770 for additional information.
.. note::
Functionality within this package requires that the ``__main__`` module be
importable by the children. This is covered in multiprocessing-programming;
however, it is worth pointing out here. This means that some examples, such
as the multiprocessing.Pool examples, will not work in the
interactive interpreter. For example:: >
>>> from multiprocessing import Pool
>>> p = Pool(5)
>>> def f(x):
...     return x*x
...
>>> p.map(f, [1,2,3])
Process PoolWorker-1:
Process PoolWorker-2:
Process PoolWorker-3:
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
AttributeError: 'module' object has no attribute 'f'
AttributeError: 'module' object has no attribute 'f'
AttributeError: 'module' object has no attribute 'f'
<
(If you try this it will actually output three full tracebacks
interleaved in a semi-random fashion, and then you may have to
stop the master process somehow.)
The Process class
~~~~~~~~~~~~~~~~~~~~~~~~~~
In multiprocessing (|py2stdlib-multiprocessing|), processes are spawned by creating a Process
object and then calling its Process.start method. Process
follows the API of threading.Thread. A trivial example of a
multiprocess program is :: >
    from multiprocessing import Process
    def f(name):
        print 'hello', name
    if __name__ == '__main__':
        p = Process(target=f, args=('bob',))
        p.start()
        p.join()
<
To show the individual process IDs involved, here is an expanded example:: >
    from multiprocessing import Process
    import os
    def info(title):
        print title
        print 'module name:', __name__
        print 'parent process:', os.getppid()
        print 'process id:', os.getpid()
    def f(name):
        info('function f')
        print 'hello', name
    if __name__ == '__main__':
        info('main line')
        p = Process(target=f, args=('bob',))
        p.start()
        p.join()
<
For an explanation of why (on Windows) the ``if __name__ == '__main__'`` part is
necessary, see multiprocessing-programming.
Exchanging objects between processes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
multiprocessing (|py2stdlib-multiprocessing|) supports two types of communication channel between
processes:
{Queues}
The Queue (|py2stdlib-queue|) class is a near clone of Queue.Queue. For
example:: >
    from multiprocessing import Process, Queue
    def f(q):
        q.put([42, None, 'hello'])
    if __name__ == '__main__':
        q = Queue()
        p = Process(target=f, args=(q,))
        p.start()
        print q.get()    # prints "[42, None, 'hello']"
        p.join()
<
Queues are thread and process safe.
{Pipes}
The Pipe function returns a pair of connection objects connected by a
pipe which by default is duplex (two-way). For example:: >
    from multiprocessing import Process, Pipe
    def f(conn):
        conn.send([42, None, 'hello'])
        conn.close()
    if __name__ == '__main__':
        parent_conn, child_conn = Pipe()
        p = Process(target=f, args=(child_conn,))
        p.start()
        print parent_conn.recv()   # prints "[42, None, 'hello']"
        p.join()
<
The two connection objects returned by Pipe represent the two ends of
the pipe. Each connection object has Connection.send and
Connection.recv methods (among others). Note that data in a pipe
may become corrupted if two processes (or threads) try to read from or write
to the {same} end of the pipe at the same time. Of course there is no risk
of corruption from processes using different ends of the pipe at the same
time.
Synchronization between processes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
multiprocessing (|py2stdlib-multiprocessing|) contains equivalents of all the synchronization
primitives from threading (|py2stdlib-threading|). For instance one can use a lock to ensure
that only one process prints to standard output at a time:: >
    from multiprocessing import Process, Lock
    def f(l, i):
        l.acquire()
        print 'hello world', i
        l.release()
    if __name__ == '__main__':
        lock = Lock()
        for num in range(10):
            Process(target=f, args=(lock, num)).start()
<
Without the lock, output from the different processes is liable to get all
mixed up.
Sharing state between processes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As mentioned above, when doing concurrent programming it is usually best to
avoid using shared state as far as possible. This is particularly true when
using multiple processes.
However, if you really do need to use some shared data then
multiprocessing (|py2stdlib-multiprocessing|) provides a couple of ways of doing so.
{Shared memory}
Data can be stored in a shared memory map using Value or
Array. For example, the following code :: >
    from multiprocessing import Process, Value, Array
    def f(n, a):
        n.value = 3.1415927
        for i in range(len(a)):
            a[i] = -a[i]
    if __name__ == '__main__':
        num = Value('d', 0.0)
        arr = Array('i', range(10))
        p = Process(target=f, args=(num, arr))
        p.start()
        p.join()
        print num.value
        print arr[:]
<
will print ::
3.1415927
[0, -1, -2, -3, -4, -5, -6, -7, -8, -9]
The ``'d'`` and ``'i'`` arguments used when creating ``num`` and ``arr`` are
typecodes of the kind used by the array (|py2stdlib-array|) module: ``'d'`` indicates a
double precision float and ``'i'`` indicates a signed integer. These shared
objects will be process and thread safe.
For more flexibility in using shared memory one can use the
multiprocessing.sharedctypes (|py2stdlib-multiprocessing.sharedctypes|) module which supports the creation of
arbitrary ctypes objects allocated from shared memory.
{Server process}
A manager object returned by Manager controls a server process which
holds Python objects and allows other processes to manipulate them using
proxies.
A manager returned by Manager will support types list,
dict, Namespace, Lock, RLock,
Semaphore, BoundedSemaphore, Condition,
Event, Queue (|py2stdlib-queue|), Value and Array. For
example, :: >
    from multiprocessing import Process, Manager
    def f(d, l):
        d[1] = '1'
        d['2'] = 2
        d[0.25] = None
        l.reverse()
    if __name__ == '__main__':
        manager = Manager()
        d = manager.dict()
        l = manager.list(range(10))
        p = Process(target=f, args=(d, l))
        p.start()
        p.join()
        print d
        print l
<
will print ::
{0.25: None, 1: '1', '2': 2}
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
Server process managers are more flexible than using shared memory objects
because they can be made to support arbitrary object types. Also, a single
manager can be shared by processes on different computers over a network.
They are, however, slower than using shared memory.
Using a pool of workers
~~~~~~~~~~~~~~~~~~~~~~~
The multiprocessing.pool.Pool class represents a pool of worker
processes. It has methods which allow tasks to be offloaded to the worker
processes in a few different ways.
For example:: >
    from multiprocessing import Pool
    def f(x):
        return x*x
    if __name__ == '__main__':
        pool = Pool(processes=4)              # start 4 worker processes
        result = pool.apply_async(f, [10])    # evaluate "f(10)" asynchronously
        print result.get(timeout=1)           # prints "100" unless your computer is very slow
        print pool.map(f, range(10))          # prints "[0, 1, 4,..., 81]"
<
Reference
---------
The multiprocessing (|py2stdlib-multiprocessing|) package mostly replicates the API of the
threading (|py2stdlib-threading|) module.
Process and exceptions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Process([group[, target[, name[, args[, kwargs]]]]])~
Process objects represent activity that is run in a separate process. The
Process class has equivalents of all the methods of
threading.Thread.
The constructor should always be called with keyword arguments. {group}
should always be ``None``; it exists solely for compatibility with
threading.Thread. {target} is the callable object to be invoked by
the run() method. It defaults to ``None``, meaning nothing is
called. {name} is the process name. By default, a unique name is constructed
of the form 'Process-N1:N2:...:Nk' where N1, N2, ..., Nk is a sequence of
integers whose length is determined by the {generation} of the process.
{args} is the argument
tuple for the target invocation. {kwargs} is a dictionary of keyword
arguments for the target invocation. By default, no arguments are passed to
{target}.
If a subclass overrides the constructor, it must make sure it invokes the
base class constructor (Process.__init__) before doing anything else
to the process.
run()~
Method representing the process's activity.
You may override this method in a subclass. The standard run
method invokes the callable object passed to the object's constructor as
the target argument, if any, with sequential and keyword arguments taken
from the {args} and {kwargs} arguments, respectively.
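The constructor rule and the run() override described above can be sketched
with a small subclass. This is only an illustration: SquareWorker and its
``number`` and ``results`` attributes are hypothetical names, not part of the
package, and the code is written so that it should also run on Python 3.

```python
from multiprocessing import Process, Queue


class SquareWorker(Process):
    """Hypothetical Process subclass that squares a single number."""

    def __init__(self, number, results):
        # Invoke the base class constructor before doing anything else.
        Process.__init__(self)
        self.number = number
        self.results = results

    def run(self):
        # Runs in the child process once start() has been called.
        self.results.put(self.number * self.number)


if __name__ == '__main__':
    results = Queue()
    worker = SquareWorker(7, results)
    worker.start()
    print(results.get())    # fetch the result before joining the child
    worker.join()
```

Calling run() directly, instead of start(), executes the same code in the
calling process, which can be handy when testing the subclass.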
start()~
Start the process's activity.
This must be called at most once per process object. It arranges for the
object's run method to be invoked in a separate process.
join([timeout])~
Block the calling thread until the process whose join method is
called terminates or until the optional timeout occurs.
If {timeout} is ``None`` then there is no timeout.
A process can be joined many times.
A process cannot join itself because this would cause a deadlock. It is
an error to attempt to join a process before it has been started.
name~
The process's name.
The name is a string used for identification purposes only. It has no
semantics. Multiple processes may be given the same name. The initial
name is set by the constructor.
is_alive()~
Return whether the process is alive.
Roughly, a process object is alive from the moment the start
method returns until the child process terminates.
daemon~
The process's daemon flag, a Boolean value. This must be set before
start is called.
The initial value is inherited from the creating process.
When a process exits, it attempts to terminate all of its daemonic child
processes.
Note that a daemonic process is not allowed to create child processes.
Otherwise a daemonic process would leave its children orphaned if it gets
terminated when its parent process exits. Additionally, these are {not}
Unix daemons or services; they are normal processes that will be
terminated (and not joined) if non-daemonic processes have exited.
In addition to the threading.Thread API, Process objects
also support the following attributes and methods:
pid~
Return the process ID. Before the process is spawned, this will be
``None``.
exitcode~
The child's exit code. This will be ``None`` if the process has not yet
terminated. A negative value {-N} indicates that the child was terminated
by signal {N}.
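A short sketch of how pid and exitcode change over the life of a process.
The helper ``exit_with`` is a hypothetical name chosen for this illustration;
the code is written so that it should run on both Python 2.7 and Python 3.

```python
import sys
from multiprocessing import Process


def exit_with(code):
    # Leave the child process with the given exit status.
    sys.exit(code)


if __name__ == '__main__':
    p = Process(target=exit_with, args=(3,))
    print(p.pid)         # None: the process has not been spawned yet
    print(p.exitcode)    # None: the process has not terminated yet
    p.start()
    p.join()
    print(p.exitcode)    # the status passed to sys.exit() in the child
```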
authkey~
The process's authentication key (a byte string).
When multiprocessing (|py2stdlib-multiprocessing|) is initialized the main process is assigned a
random string using os.urandom.
When a Process object is created, it will inherit the
authentication key of its parent process, although this may be changed by
setting authkey to another byte string.
See multiprocessing-auth-keys.
terminate()~
Terminate the process. On Unix this is done using the ``SIGTERM`` signal;
on Windows TerminateProcess is used. Note that exit handlers and
finally clauses, etc., will not be executed.
Note that descendant processes of the process will {not} be terminated --
they will simply become orphaned.
.. warning:: >
If this method is used when the associated process is using a pipe or
queue then the pipe or queue is liable to become corrupted and may
become unusable by other processes. Similarly, if the process has
acquired a lock or semaphore etc. then terminating it is liable to
cause other processes to deadlock.
<
Note that the start, join and is_alive methods and the exitcode
attribute should only be used by the process that created the process
object.
Example usage of some of the methods of Process:
.. doctest:: >
>>> import multiprocessing, time, signal
>>> p = multiprocessing.Process(target=time.sleep, args=(1000,))
>>> print p, p.is_alive()
<Process(Process-1, initial)> False
>>> p.start()
>>> print p, p.is_alive()
<Process(Process-1, started)> True
>>> p.terminate()
>>> time.sleep(0.1)
>>> print p, p.is_alive()
<Process(Process-1, stopped[SIGTERM])> False
>>> p.exitcode == -signal.SIGTERM
True
<
BufferTooShort~
Exception raised by Connection.recv_bytes_into() when the supplied
buffer object is too small for the message read.
If ``e`` is an instance of BufferTooShort then ``e.args[0]`` will give
the message as a byte string.
Pipes and Queues
~~~~~~~~~~~~~~~~
When using multiple processes, one generally uses message passing for
communication between processes and avoids having to use any synchronization
primitives like locks.
For passing messages one can use Pipe (for a connection between two
processes) or a queue (which allows multiple producers and consumers).
The Queue (|py2stdlib-queue|) and JoinableQueue types are multi-producer,
multi-consumer FIFO queues modelled on the Queue.Queue class in the
standard library. They differ in that Queue (|py2stdlib-queue|) lacks the
Queue.Queue.task_done and Queue.Queue.join methods introduced
into Python 2.5's Queue.Queue class.
If you use JoinableQueue then you {must} call
JoinableQueue.task_done for each task removed from the queue or else the
semaphore used to count the number of unfinished tasks may eventually overflow
raising an exception.
Note that one can also create a shared queue by using a manager object -- see
multiprocessing-managers.
.. note::
multiprocessing (|py2stdlib-multiprocessing|) uses the usual Queue.Empty and
Queue.Full exceptions to signal a timeout. They are not available in
the multiprocessing (|py2stdlib-multiprocessing|) namespace so you need to import them from
Queue (|py2stdlib-queue|).
.. warning::
If a process is killed using Process.terminate or os.kill
while it is trying to use a Queue (|py2stdlib-queue|), then the data in the queue is
likely to become corrupted. This may cause any other process to get an
exception when it tries to use the queue later on.
.. warning::
As mentioned above, if a child process has put items on a queue (and it has
not used JoinableQueue.cancel_join_thread), then that process will
not terminate until all buffered items have been flushed to the pipe.
This means that if you try joining that process you may get a deadlock unless
you are sure that all items which have been put on the queue have been
consumed. Similarly, if the child process is non-daemonic then the parent
process may hang on exit when it tries to join all its non-daemonic children.
Note that a queue created using a manager does not have this issue. See
multiprocessing-programming.
For an example of the usage of queues for interprocess communication see
multiprocessing-examples.
Pipe([duplex])~
Returns a pair ``(conn1, conn2)`` of Connection objects representing
the ends of a pipe.
If {duplex} is ``True`` (the default) then the pipe is bidirectional. If
{duplex} is ``False`` then the pipe is unidirectional: ``conn1`` can only be
used for receiving messages and ``conn2`` can only be used for sending
messages.
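As a minimal sketch of the unidirectional case, the following uses both ends
of a one-way pipe inside a single process. The variable names are chosen for
this illustration only, and catching both IOError and OSError is an
assumption made so that the same code runs on Python 2.7 and Python 3.

```python
from multiprocessing import Pipe

# With duplex=False the first connection can only receive and the
# second can only send.
receiver, sender = Pipe(duplex=False)

sender.send({'answer': 42})
message = receiver.recv()
print(message)

# Using an end in the wrong direction is refused.
try:
    receiver.send('wrong way')
    refused = False
except (IOError, OSError):
    refused = True
print(refused)

sender.close()
receiver.close()
```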
Queue([maxsize])~
Returns a process shared queue implemented using a pipe and a few
locks/semaphores. When a process first puts an item on the queue a feeder
thread is started which transfers objects from a buffer into the pipe.
The usual Queue.Empty and Queue.Full exceptions from the
standard library's Queue (|py2stdlib-queue|) module are raised to signal timeouts.
Queue (|py2stdlib-queue|) implements all the methods of Queue.Queue except for
Queue.Queue.task_done and Queue.Queue.join.
qsize()~
Return the approximate size of the queue. Because of
multithreading/multiprocessing semantics, this number is not reliable.
Note that this may raise NotImplementedError on Unix platforms like
Mac OS X where ``sem_getvalue()`` is not implemented.
empty()~
Return ``True`` if the queue is empty, ``False`` otherwise. Because of
multithreading/multiprocessing semantics, this is not reliable.
full()~
Return ``True`` if the queue is full, ``False`` otherwise. Because of
multithreading/multiprocessing semantics, this is not reliable.
put(item[, block[, timeout]])~
Put item into the queue. If the optional argument {block} is ``True``
(the default) and {timeout} is ``None`` (the default), block if necessary until
a free slot is available. If {timeout} is a positive number, it blocks at
most {timeout} seconds and raises the Queue.Full exception if no
free slot was available within that time. Otherwise ({block} is
``False``), put an item on the queue if a free slot is immediately
available, else raise the Queue.Full exception ({timeout} is
ignored in that case).
put_nowait(item)~
Equivalent to ``put(item, False)``.
get([block[, timeout]])~
Remove and return an item from the queue. If optional args {block} is
``True`` (the default) and {timeout} is ``None`` (the default), block if
necessary until an item is available. If {timeout} is a positive number,
it blocks at most {timeout} seconds and raises the Queue.Empty
exception if no item was available within that time. Otherwise (block is
``False``), return an item if one is immediately available, else raise the
Queue.Empty exception ({timeout} is ignored in that case).
get_nowait()~
Equivalent to ``get(False)``.
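The blocking rules of put() and get() can be tried out inside a single
process. This is a sketch only: the fallback import of ``queue`` is an
assumption added so the code also runs on Python 3, where the Queue module
was renamed.

```python
try:
    from Queue import Empty, Full     # Python 2 module name
except ImportError:
    from queue import Empty, Full     # same exceptions on Python 3
from multiprocessing import Queue

q = Queue(1)                # room for a single item
q.put('first')

try:
    q.put_nowait('second')  # no free slot, so this does not block
    overflowed = False
except Full:
    overflowed = True

item = q.get(timeout=5)     # waits at most 5 seconds for an item

try:
    q.get(True, 0.1)        # the queue is empty again
    drained = False
except Empty:
    drained = True

print(overflowed)
print(item)
print(drained)
```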
multiprocessing.Queue has a few additional methods not found in
Queue.Queue. These methods are usually unnecessary for most
code:
close()~
Indicate that no more data will be put on this queue by the current
process. The background thread will quit once it has flushed all buffered
data to the pipe. This is called automatically when the queue is garbage
collected.
join_thread()~
Join the background thread. This can only be used after close has
been called. It blocks until the background thread exits, ensuring that
all data in the buffer has been flushed to the pipe.
By default if a process is not the creator of the queue then on exit it
will attempt to join the queue's background thread. The process can call
cancel_join_thread to make join_thread do nothing.
cancel_join_thread()~
Prevent join_thread from blocking. In particular, this prevents
the background thread from being joined automatically when the process
exits -- see join_thread.
JoinableQueue([maxsize])~
JoinableQueue, a Queue (|py2stdlib-queue|) subclass, is a queue which
additionally has task_done and join methods.
task_done()~
Indicate that a formerly enqueued task is complete. Used by queue consumer
threads. For each Queue.get used to fetch a task, a subsequent
call to task_done tells the queue that the processing on the task
is complete.
If a Queue.join is currently blocking, it will resume when all
items have been processed (meaning that a task_done call was
received for every item that had been Queue.put into the queue).
Raises a ValueError if called more times than there were items
placed in the queue.
join()~
Block until all items in the queue have been gotten and processed.
The count of unfinished tasks goes up whenever an item is added to the
queue. The count goes down whenever a consumer thread calls
task_done to indicate that the item was retrieved and all work on
it is complete. When the count of unfinished tasks drops to zero,
Queue.join unblocks.
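The interplay of task_done and join can be sketched as follows. The
``consume`` function and the use of ``None`` as a stop marker are conventions
invented for this illustration, not part of the package; the code is written
so that it should run on both Python 2.7 and Python 3.

```python
from multiprocessing import JoinableQueue, Process


def consume(tasks):
    # Fetch tasks until the (hypothetical) None stop marker arrives.
    while True:
        item = tasks.get()
        try:
            if item is None:
                break
            print('processing %r' % (item,))
        finally:
            # Exactly one task_done() call for every get().
            tasks.task_done()


if __name__ == '__main__':
    tasks = JoinableQueue()
    worker = Process(target=consume, args=(tasks,))
    worker.start()
    for n in range(5):
        tasks.put(n)
    tasks.put(None)
    tasks.join()     # returns once every item has been marked as done
    worker.join()
```

Placing task_done() in a ``finally`` clause keeps the count of unfinished
tasks correct even if the work on an item raises an exception.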
Miscellaneous
~~~~~~~~~~~~~
active_children()~
Return list of all live children of the current process.
Calling this has the side effect of "joining" any processes which have
already finished.
cpu_count()~
Return the number of CPUs in the system. May raise
NotImplementedError.
current_process()~
Return the Process object corresponding to the current process.
An analogue of threading.current_thread.
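A brief sketch of the three functions above used together. Falling back to a
single worker when cpu_count() raises NotImplementedError is an assumption
made for this illustration, not documented behaviour.

```python
import multiprocessing

try:
    workers = multiprocessing.cpu_count()
except NotImplementedError:
    workers = 1    # assumed fallback when the count cannot be determined

me = multiprocessing.current_process()
children = multiprocessing.active_children()

print('%s could start up to %d workers' % (me.name, workers))
print('live children: %d' % len(children))
```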
freeze_support()~
Add support for when a program which uses multiprocessing (|py2stdlib-multiprocessing|) has been
frozen to produce a Windows executable. (Has been tested with {py2exe},
{PyInstaller} and {cx_Freeze}.)
One needs to call this function straight after the ``if __name__ ==
'__main__'`` line of the main module. For example:: >
    from multiprocessing import Process, freeze_support
    def f():
        print 'hello world!'
    if __name__ == '__main__':
        freeze_support()
        Process(target=f).start()
<
If the ``freeze_support()`` line is omitted then trying to run the frozen
executable will raise RuntimeError.
If the module is being run normally by the Python interpreter then
freeze_support has no effect.
set_executable()~
Sets the path of the Python interpreter to use when starting a child process.
(By default sys.executable is used). Embedders will probably need to
do something like :: >
    set_executable(os.path.join(sys.exec_prefix, 'pythonw.exe'))
<
before they can create child processes. (Windows only)
.. note::
multiprocessing (|py2stdlib-multiprocessing|) contains no analogues of
threading.active_count, threading.enumerate,
threading.settrace, threading.setprofile,
threading.Timer, or threading.local.
Connection Objects
~~~~~~~~~~~~~~~~~~
Connection objects allow the sending and receiving of picklable objects or
strings. They can be thought of as message oriented connected sockets.
Connection objects are usually created using Pipe -- see also
multiprocessing-listeners-clients.
Connection~
send(obj)~
Send an object to the other end of the connection which should be read
using recv.
The object must be picklable. Very large pickles (approximately 32 MB+,
though it depends on the OS) may raise a ValueError exception.
recv()~
Return an object sent from the other end of the connection using
send. Raises EOFError if there is nothing left to receive
and the other end was closed.
fileno()~
Returns the file descriptor or handle used by the connection.
close()~
Close the connection.
This is called automatically when the connection is garbage collected.
poll([timeout])~
Return whether there is any data available to be read.
If {timeout} is not specified then it will return immediately. If
{timeout} is a number then this specifies the maximum time in seconds to
block. If {timeout} is ``None`` then an infinite timeout is used.
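The three forms of poll() can be compared in a single process. This is a
minimal sketch with variable names chosen for illustration; it is written so
that it should run on both Python 2.7 and Python 3.

```python
from multiprocessing import Pipe

a, b = Pipe()

nothing_yet = b.poll()     # no timeout given: returns immediately
timed_out = b.poll(0.1)    # waits up to 0.1 seconds for data

a.send('ping')
ready = b.poll(1.0)        # data is waiting, so this returns early
reply = b.recv()

print(nothing_yet)
print(timed_out)
print(ready)
print(reply)
```

Passing ``None`` explicitly, as in ``b.poll(None)``, would instead block
until data arrives.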
send_bytes(buffer[, offset[, size]])~
Send byte data from an object supporting the buffer interface as a
complete message.
If {offset} is given then data is read from that position in {buffer}. If
{size} is given then that many bytes will be read from buffer. Very large
buffers (approximately 32 MB+, though it depends on the OS) may raise a
ValueError exception.
recv_bytes([maxlength])~
Return a complete message of byte data sent from the other end of the
connection as a string. Raises EOFError if there is nothing left
to receive and the other end has closed.
If {maxlength} is specified and the message is longer than {maxlength}
then IOError is raised and the connection will no longer be
readable.
recv_bytes_into(buffer[, offset])~
Read into {buffer} a complete message of byte data sent from the other end
of the connection and return the number of bytes in the message. Raises
EOFError if there is nothing left to receive and the other end was
closed.
{buffer} must be an object satisfying the writable buffer interface. If
{offset} is given then the message will be written into the buffer from
that position. Offset must be a non-negative integer less than the
length of {buffer} (in bytes).
If the buffer is too short then a BufferTooShort exception is
raised and the complete message is available as ``e.args[0]`` where ``e``
is the exception instance.
For example:
.. doctest::
>>> from multiprocessing import Pipe
>>> a, b = Pipe()
>>> a.send([1, 'hello', None])
>>> b.recv()
[1, 'hello', None]
>>> b.send_bytes('thank you')
>>> a.recv_bytes()
'thank you'
>>> import array
>>> arr1 = array.array('i', range(5))
>>> arr2 = array.array('i', [0] * 10)
>>> a.send_bytes(arr1)
>>> count = b.recv_bytes_into(arr2)
>>> assert count == len(arr1) * arr1.itemsize
>>> arr2
array('i', [0, 1, 2, 3, 4, 0, 0, 0, 0, 0])
.. warning::
The Connection.recv method automatically unpickles the data it
receives, which can be a security risk unless you can trust the process
which sent the message.
Therefore, unless the connection object was produced using Pipe you
should only use the Connection.recv and Connection.send
methods after performing some sort of authentication. See
multiprocessing-auth-keys.
.. warning::
If a process is killed while it is trying to read or write to a pipe then
the data in the pipe is likely to become corrupted, because it may become
impossible to be sure where the message boundaries lie.
Synchronization primitives
~~~~~~~~~~~~~~~~~~~~~~~~~~
Generally synchronization primitives are not as necessary in a multiprocess
program as they are in a multithreaded program. See the documentation for
the threading (|py2stdlib-threading|) module.
Note that one can also create synchronization primitives by using a manager
object -- see multiprocessing-managers.
BoundedSemaphore([value])~
A bounded semaphore object: a clone of threading.BoundedSemaphore.
(On Mac OS X, this is indistinguishable from Semaphore because
``sem_getvalue()`` is not implemented on that platform).
Condition([lock])~
A condition variable: a clone of threading.Condition.
If {lock} is specified then it should be a Lock or RLock
object from multiprocessing (|py2stdlib-multiprocessing|).
Event()~
A clone of threading.Event.
Its wait() method returns the state of the internal flag on exit, so it
will always return ``True`` except if a timeout is given and the operation
times out.
.. versionchanged:: 2.7
Previously, the method always returned ``None``.
Lock()~
A non-recursive lock object: a clone of threading.Lock.
RLock()~
A recursive lock object: a clone of threading.RLock.
Semaphore([value])~
A semaphore object: a clone of threading.Semaphore.
.. note::
The acquire method of BoundedSemaphore, Lock,
RLock and Semaphore has a timeout parameter not supported
by the equivalents in threading (|py2stdlib-threading|). The signature is
``acquire(block=True, timeout=None)`` with keyword parameters being
acceptable. If {block} is ``True`` and {timeout} is not ``None`` then it
specifies a timeout in seconds. If {block} is ``False`` then {timeout} is
ignored.
On Mac OS X, ``sem_timedwait`` is unsupported, so calling ``acquire()`` with
a timeout will emulate that function's behavior using a sleeping loop.
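The timeout behaviour of acquire() can be observed with a single lock held by
the current process. This is a sketch only; the variable names are chosen for
illustration, and the arguments are passed positionally as (block, timeout).

```python
from multiprocessing import Lock

lock = Lock()
lock.acquire()                       # the lock is now held

# A non-recursive lock cannot be acquired a second time, so this
# waits for about 0.2 seconds and then gives up.
got_it = lock.acquire(True, 0.2)

# With block set to False the call returns at once.
non_blocking = lock.acquire(False)

lock.release()
reacquired = lock.acquire(True, 0.2)  # succeeds now that the lock is free
lock.release()

print(got_it)
print(non_blocking)
print(reacquired)
```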
.. note::
If the SIGINT signal generated by Ctrl-C arrives while the main thread is
blocked by a call to BoundedSemaphore.acquire, Lock.acquire,
RLock.acquire, Semaphore.acquire, Condition.acquire
or Condition.wait then the call will be immediately interrupted and
KeyboardInterrupt will be raised.
This differs from the behaviour of threading (|py2stdlib-threading|) where SIGINT will be
ignored while the equivalent blocking calls are in progress.
Shared ctypes (|py2stdlib-ctypes|) Objects
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It is possible to create shared objects using shared memory which can be
inherited by child processes.
Value(typecode_or_type, *args[, lock])~
Return a ctypes (|py2stdlib-ctypes|) object allocated from shared memory. By default the
return value is actually a synchronized wrapper for the object.
{typecode_or_type} determines the type of the returned object: it is either a
ctypes type or a one character typecode of the kind used by the array (|py2stdlib-array|)
module. {*args} is passed on to the constructor for the type.
If {lock} is ``True`` (the default) then a new lock object is created to
synchronize access to the value. If {lock} is a Lock or
RLock object then that will be used to synchronize access to the
value. If {lock} is ``False`` then access to the returned object will not be
automatically protected by a lock, so it will not necessarily be
"process-safe".
Note that {lock} is a keyword-only argument.
Array(typecode_or_type, size_or_initializer, *, lock=True)~
Return a ctypes array allocated from shared memory. By default the return
value is actually a synchronized wrapper for the array.
{typecode_or_type} determines the type of the elements of the returned array:
it is either a ctypes type or a one character typecode of the kind used by
the array (|py2stdlib-array|) module. If {size_or_initializer} is an integer, then it
determines the length of the array, and the array will be initially zeroed.
Otherwise, {size_or_initializer} is a sequence which is used to initialize
the array and whose length determines the length of the array.
If {lock} is ``True`` (the default) then a new lock object is created to
synchronize access to the value. If {lock} is a Lock or
RLock object then that will be used to synchronize access to the
value. If {lock} is ``False`` then access to the returned object will not be
automatically protected by a lock, so it will not necessarily be
"process-safe".
Note that {lock} is a keyword-only argument.
Note that an array of ctypes.c_char has {value} and {raw}
attributes which allow one to use it to store and retrieve strings.
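A minimal sketch of the {value} and {raw} attributes on a shared character
array. The buffer size of 16 is arbitrary, and byte string literals are used
so that the same code should run on Python 2.7 and Python 3.

```python
from multiprocessing import Array

buf = Array('c', 16)     # 16 bytes, initially zeroed
buf.value = b'hello'     # store a byte string at the start of the array

text = buf.value         # contents up to the first zero byte
raw = buf.raw            # all 16 bytes, including the zero padding

print(text)
print(len(raw))
```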
The multiprocessing.sharedctypes (|py2stdlib-multiprocessing.sharedctypes|) module
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
==============================================================================
*py2stdlib-multiprocessing.sharedctypes*
multiprocessing.sharedctypes~
:synopsis: Allocate ctypes objects from shared memory.
The multiprocessing.sharedctypes (|py2stdlib-multiprocessing.sharedctypes|) module provides functions for allocating
ctypes (|py2stdlib-ctypes|) objects from shared memory which can be inherited by child
processes.
.. note::
Although it is possible to store a pointer in shared memory remember that
this will refer to a location in the address space of a specific process.
However, the pointer is quite likely to be invalid in the context of a second
process and trying to dereference the pointer from the second process may
cause a crash.
RawArray(typecode_or_type, size_or_initializer)~
Return a ctypes array allocated from shared memory.
{typecode_or_type} determines the type of the elements of the returned array:
it is either a ctypes type or a one character typecode of the kind used by
the array (|py2stdlib-array|) module. If {size_or_initializer} is an integer then it
determines the length of the array, and the array will be initially zeroed.
Otherwise {size_or_initializer} is a sequence which is used to initialize the
array and whose length determines the length of the array.
Note that setting and getting an element is potentially non-atomic -- use
Array instead to make sure that access is automatically synchronized
using a lock.
RawValue(typecode_or_type, *args)~
Return a ctypes object allocated from shared memory.
{typecode_or_type} determines the type of the returned object: it is either a
ctypes type or a one character typecode of the kind used by the array (|py2stdlib-array|)
module. {*args} is passed on to the constructor for the type.
Note that setting and getting the value is potentially non-atomic -- use
Value instead to make sure that access is automatically synchronized
using a lock.
Note that an array of ctypes.c_char has ``value`` and ``raw``
attributes which allow one to use it to store and retrieve strings -- see
documentation for ctypes (|py2stdlib-ctypes|).
Array(typecode_or_type, size_or_initializer, *args[, lock])~
The same as RawArray except that depending on the value of {lock} a
process-safe synchronization wrapper may be returned instead of a raw ctypes
array.
If {lock} is ``True`` (the default) then a new lock object is created to
synchronize access to the value. If {lock} is a Lock or
RLock object then that will be used to synchronize access to the
value. If {lock} is ``False`` then access to the returned object will not be
automatically protected by a lock, so it will not necessarily be
"process-safe".
Note that {lock} is a keyword-only argument.
Value(typecode_or_type, *args[, lock])~
The same as RawValue except that depending on the value of {lock} a
process-safe synchronization wrapper may be returned instead of a raw ctypes
object.
If {lock} is ``True`` (the default) then a new lock object is created to
synchronize access to the value. If {lock} is a Lock or
RLock object then that will be used to synchronize access to the
value. If {lock} is ``False`` then access to the returned object will not be
automatically protected by a lock, so it will not necessarily be
"process-safe".
Note that {lock} is a keyword-only argument.
copy(obj)~
Return a ctypes object allocated from shared memory which is a copy of the
ctypes object {obj}.
synchronized(obj[, lock])~
Return a process-safe wrapper object for a ctypes object which uses {lock} to
synchronize access. If {lock} is ``None`` (the default) then a
multiprocessing.RLock object is created automatically.
A synchronized wrapper will have two methods in addition to those of the
object it wraps: get_obj returns the wrapped object and
get_lock returns the lock object used for synchronization.
Note that accessing the ctypes object through the wrapper can be a lot slower
than accessing the raw ctypes object.
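The locking options described above can be tried out in a single process. The following is only a sketch: the variable names are made up for illustration, and it assumes the platform provides the shared semaphores that multiprocessing locks rely on.

```python
from ctypes import c_double
from multiprocessing import RLock
from multiprocessing.sharedctypes import RawValue, Value, copy, synchronized

# lock=True (the default): the ctypes int is wrapped together with a new lock.
counter = Value('i', 7)
with counter.get_lock():            # make the read-modify-write step atomic
    counter.value += 1

# lock=False: a raw ctypes object comes back, without the wrapper methods.
unlocked = Value(c_double, 1.5, lock=False)

# synchronized() adds the wrapper afterwards, here around an explicit RLock.
raw = RawValue(c_double, 2.4)
lock = RLock()
wrapped = synchronized(raw, lock)
with wrapped.get_lock():
    wrapped.value *= 2

# copy() allocates a separate shared object; later changes to raw do not reach it.
snapshot = copy(raw)
raw.value = 0.0
```

An RLock is used for the explicit lock because the wrapper's ``value`` property acquires the lock itself; holding a plain Lock around that access would block.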
The table below compares the syntax for creating shared ctypes objects from
shared memory with the normal ctypes syntax. (In the table ``MyStruct`` is some
subclass of ctypes.Structure.)
==================== ========================== ===========================
ctypes               sharedctypes using type    sharedctypes using typecode
==================== ========================== ===========================
c_double(2.4)        RawValue(c_double, 2.4)    RawValue('d', 2.4)
MyStruct(4, 6)       RawValue(MyStruct, 4, 6)
(c_short * 7)()      RawArray(c_short, 7)       RawArray('h', 7)
(c_int * 3)(9, 2, 8) RawArray(c_int, (9, 2, 8)) RawArray('i', (9, 2, 8))
==================== ========================== ===========================
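The rows of the table can be checked directly. In this sketch the field names of ``MyStruct`` are an assumption made for illustration; the table only says that it is some subclass of ctypes.Structure.

```python
from ctypes import Structure, c_double, c_int, c_short
from multiprocessing.sharedctypes import RawArray, RawValue

class MyStruct(Structure):
    # Hypothetical two-field layout, chosen only to match MyStruct(4, 6).
    _fields_ = [('a', c_int), ('b', c_int)]

by_type = RawValue(c_double, 2.4)       # ctypes type
by_code = RawValue('d', 2.4)            # equivalent array-style typecode
struct = RawValue(MyStruct, 4, 6)       # structures have no typecode form
zeros = RawArray(c_short, 7)            # an integer size gives a zeroed array
filled = RawArray('i', (9, 2, 8))       # a sequence initializes the array
```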
Below is an example where a number of ctypes objects are modified by a child
process:: >
from multiprocessing import Process, Lock
from multiprocessing.sharedctypes import Value, Array
from ctypes import Structure, c_double

class Point(Structure):
    _fields_ = [('x', c_double), ('y', c_double)]

def modify(n, x, s, A):
    n.value **= 2
    x.value **= 2
    s.value = s.value.upper()
    for a in A:
        a.x **= 2
        a.y **= 2

if __name__ == '__main__':
    lock = Lock()
    n = Value('i', 7)
    x = Value(c_double, 1.0/3.0, lock=False)
    s = Array('c', 'hello world', lock=lock)
    A = Array(Point, [(1.875,-6.25), (-5.75,2.0), (2.375,9.5)], lock=lock)
    p = Process(target=modify, args=(n, x, s, A))
    p.start()
    p.join()
    print n.value
    print x.value
    print s.value
    print [(a.x, a.y) for a in A]
<
The results printed are :: >
49
0.1111111111111111
HELLO WORLD
[(3.515625, 39.0625), (33.0625, 4.0), (5.640625, 90.25)]
<
Managers
~~~~~~~~
Managers provide a way to create data which can be shared between different
processes. A manager object controls a server process which manages *shared
objects*. Other processes can access the shared objects by using proxies.
multiprocessing.Manager()~
Returns a started multiprocessing.managers.SyncManager object which
can be used for sharing objects between processes. The returned manager
object corresponds to a spawned child process and has methods which will
create shared objects and return corresponding proxies.
==============================================================================
*py2stdlib-multiprocessing.managers*
multiprocessing.managers~
:synopsis: Share data between processes with shared objects.
Manager processes will be shut down as soon as they are garbage collected or
their parent process exits. The manager classes are defined in the
multiprocessing.managers (|py2stdlib-multiprocessing.managers|) module:
BaseManager([address[, authkey]])~
Create a BaseManager object.
Once created one should call start or ``get_server().serve_forever()`` to ensure
that the manager object refers to a started manager process.
{address} is the address on which the manager process listens for new
connections. If {address} is ``None`` then an arbitrary one is chosen.
{authkey} is the authentication key which will be used to check the validity
of incoming connections to the server process. If {authkey} is ``None`` then
``current_process().authkey`` is used. Otherwise {authkey} is used and it
must be a string.
start([initializer[, initargs]])~
Start a subprocess to start the manager. If {initializer} is not ``None``
then the subprocess will call ``initializer(*initargs)`` when it starts.
get_server()~
Returns a Server object which represents the actual server under
the control of the Manager. The Server object supports the
serve_forever method:: >
>>> from multiprocessing.managers import BaseManager
>>> manager = BaseManager(address=('', 50000), authkey='abc')
>>> server = manager.get_server()
>>> server.serve_forever()
<
Server additionally has an address attribute.
connect()~
Connect a local manager object to a remote manager process:: >
>>> from multiprocessing.managers import BaseManager
>>> m = BaseManager(address=('127.0.0.1', 5000), authkey='abc')
>>> m.connect()
<
shutdown()~
Stop the process used by the manager. This is only available if
start has been used to start the server process.
This can be called multiple times.
register(typeid[, callable[, proxytype[, exposed[, method_to_typeid[, create_method]]]]])~
A classmethod which can be used for registering a type or callable with
the manager class.
{typeid} is a "type identifier" which is used to identify a particular
type of shared object. This must be a string.
{callable} is a callable used for creating objects for this type
identifier. If a manager instance will be created using the
from_address classmethod or if the {create_method} argument is
``False`` then this can be left as ``None``.
{proxytype} is a subclass of BaseProxy which is used to create
proxies for shared objects with this {typeid}. If ``None`` then a proxy
class is created automatically.
{exposed} is used to specify a sequence of method names which proxies for
this typeid should be allowed to access using
BaseProxy._callmethod. (If {exposed} is ``None`` then
proxytype._exposed_ is used instead if it exists.) In the case
where no exposed list is specified, all "public methods" of the shared
object will be accessible. (Here a "public method" means any attribute
which has a __call__ method and whose name does not begin with
``'_'``.)
{method_to_typeid} is a mapping used to specify the return type of those
exposed methods which should return a proxy. It maps method names to
typeid strings. (If {method_to_typeid} is ``None`` then
proxytype._method_to_typeid_ is used instead if it exists.) If a
method's name is not a key of this mapping or if the mapping is ``None``
then the object returned by the method will be copied by value.
{create_method} determines whether a method should be created with name
{typeid} which can be used to tell the server process to create a new
shared object and return a proxy for it. By default it is ``True``.
BaseManager instances also have one read-only property:
address~
The address used by the manager.
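The effect of the {exposed} argument to register can be seen in a small sketch. The ``Counter`` class, the ``CounterManager`` name and the key ``b'abc'`` are invented for illustration, and the server is run in a daemon thread of the same process only to keep the example short; normally start would launch a separate manager process.

```python
import threading
from multiprocessing.managers import BaseManager

class Counter(object):
    def __init__(self):
        self.total = 0
    def increment(self):
        self.total += 1
    def get(self):
        return self.total
    def reset(self):                # public, but deliberately not exposed
        self.total = 0

class CounterManager(BaseManager):
    pass

# Only 'increment' and 'get' may be called through proxies.
CounterManager.register('Counter', callable=Counter,
                        exposed=('increment', 'get'))

# Port 0 asks the OS for a free port; server.address reports the real one.
server = CounterManager(address=('127.0.0.1', 0), authkey=b'abc').get_server()
thread = threading.Thread(target=server.serve_forever)
thread.daemon = True
thread.start()

client = CounterManager(address=server.address, authkey=b'abc')
client.connect()
counter = client.Counter()          # proxy for a Counter living in the server
counter.increment()
counter.increment()
total = counter.get()
```

Because no {proxytype} is given, the proxy class is generated automatically and only carries the exposed methods, so ``counter.reset`` does not exist on the proxy.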
SyncManager~
A subclass of BaseManager which can be used for the synchronization
of processes. Objects of this type are returned by
multiprocessing.Manager.
It also supports creation of shared lists and dictionaries.
BoundedSemaphore([value])~
Create a shared threading.BoundedSemaphore object and return a
proxy for it.
Condition([lock])~
Create a shared threading.Condition object and return a proxy for
it.
If {lock} is supplied then it should be a proxy for a
threading.Lock or threading.RLock object.
Event()~
Create a shared threading.Event object and return a proxy for it.
Lock()~
Create a shared threading.Lock object and return a proxy for it.
Namespace()~
Create a shared Namespace object and return a proxy for it.
Queue([maxsize])~
Create a shared Queue.Queue object and return a proxy for it.
RLock()~
Create a shared threading.RLock object and return a proxy for it.
Semaphore([value])~
Create a shared threading.Semaphore object and return a proxy for
it.
Array(typecode, sequence)~
Create an array and return a proxy for it.
Value(typecode, value)~
Create an object with a writable ``value`` attribute and return a proxy
for it.
dict()~
dict(mapping)
dict(sequence)
Create a shared ``dict`` object and return a proxy for it.
list()~
list(sequence)
Create a shared ``list`` object and return a proxy for it.
Namespace objects
>>>>>>>>>>>>>>>>>
A namespace object has no public methods, but does have writable attributes.
Its representation shows the values of its attributes.
However, when using a proxy for a namespace object, an attribute beginning with
``'_'`` will be an attribute of the proxy and not an attribute of the referent: >
>>> manager = multiprocessing.Manager()
>>> Global = manager.Namespace()
>>> Global.x = 10
>>> Global.y = 'hello'
>>> Global._z = 12.3 # this is an attribute of the proxy
>>> print Global
Namespace(x=10, y='hello')
<
Customized managers
>>>>>>>>>>>>>>>>>>>
To create one's own manager, one creates a subclass of BaseManager and
uses the BaseManager.register classmethod to register new types or
callables with the manager class. For example:: >
from multiprocessing.managers import BaseManager

class MathsClass(object):
    def add(self, x, y):
        return x + y
    def mul(self, x, y):
        return x * y

class MyManager(BaseManager):
    pass

MyManager.register('Maths', MathsClass)

if __name__ == '__main__':
    manager = MyManager()
    manager.start()
    maths = manager.Maths()
    print maths.add(4, 3) # prints 7
    print maths.mul(7, 8) # prints 56
<
Using a remote manager
>>>>>>>>>>>>>>>>>>>>>>
It is possible to run a manager server on one machine and have clients use it
from other machines (assuming that the firewalls involved allow it).
Running the following commands creates a server for a single shared queue which
remote clients can access:: >
>>> from multiprocessing.managers import BaseManager
>>> import Queue
>>> queue = Queue.Queue()
>>> class QueueManager(BaseManager): pass
>>> QueueManager.register('get_queue', callable=lambda:queue)
>>> m = QueueManager(address=('', 50000), authkey='abracadabra')
>>> s = m.get_server()
>>> s.serve_forever()
<
One client can access the server as follows:: >
>>> from multiprocessing.managers import BaseManager
>>> class QueueManager(BaseManager): pass
>>> QueueManager.register('get_queue')
>>> m = QueueManager(address=('foo.bar.org', 50000), authkey='abracadabra')
>>> m.connect()
>>> queue = m.get_queue()
>>> queue.put('hello')
<
Another client can also use it:: >
>>> from multiprocessing.managers import BaseManager
>>> class QueueManager(BaseManager): pass
>>> QueueManager.register('get_queue')
>>> m = QueueManager(address=('foo.bar.org', 50000), authkey='abracadabra')
>>> m.connect()
>>> queue = m.get_queue()
>>> queue.get()
'hello'
<
Local processes can also access that queue, using the code from above on the
client to access it remotely:: >
>>> from multiprocessing import Process, Queue
>>> from multiprocessing.managers import BaseManager
>>> class Worker(Process):
...     def __init__(self, q):
...         self.q = q
...         super(Worker, self).__init__()
...     def run(self):
...         self.q.put('local hello')
...
>>> queue = Queue()
>>> w = Worker(queue)
>>> w.start()
>>> class QueueManager(BaseManager): pass
...
>>> QueueManager.register('get_queue', callable=lambda: queue)
>>> m = QueueManager(address=('', 50000), authkey='abracadabra')
>>> s = m.get_server()
>>> s.serve_forever()
<
Proxy Objects
~~~~~~~~~~~~~
A proxy is an object which {refers} to a shared object which lives (presumably)
in a different process. The shared object is said to be the {referent} of the
proxy. Multiple proxy objects may have the same referent.
A proxy object has methods which invoke corresponding methods of its referent
(although not every method of the referent will necessarily be available through
the proxy). A proxy can usually be used in most of the same ways that its
referent can: >
>>> from multiprocessing import Manager
>>> manager = Manager()
>>> l = manager.list([i*i for i in range(10)])
>>> print l
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> print repr(l)
<ListProxy object, typeid 'list' at 0x...>
>>> l[4]
16
>>> l[2:5]
[4, 9, 16]
<
Notice that applying str to a proxy will return the representation of
the referent, whereas applying repr (|py2stdlib-repr|) will return the representation of
the proxy.
An important feature of proxy objects is that they are picklable so they can be
passed between processes. Note, however, that if a proxy is sent to the
corresponding manager's process then unpickling it will produce the referent
itself. This means, for example, that one shared object can contain a second: >
>>> a = manager.list()
>>> b = manager.list()
>>> a.append(b) # referent of a now contains referent of b
>>> print a, b
[[]] []
>>> b.append('hello')
>>> print a, b
[['hello']] ['hello']
<
.. note::
The proxy types in multiprocessing (|py2stdlib-multiprocessing|) do nothing to support comparisons
by value. So, for instance, we have: >
>>> manager.list([1,2,3]) == [1,2,3]
False
<
One should just use a copy of the referent instead when making comparisons.
BaseProxy~
Proxy objects are instances of subclasses of BaseProxy.
_callmethod(methodname[, args[, kwds]])~
Call and return the result of a method of the proxy's referent.
If ``proxy`` is a proxy whose referent is ``obj`` then the expression :: >
proxy._callmethod(methodname, args, kwds)
<
will evaluate the expression :: >
getattr(obj, methodname)(*args, **kwds)
<
in the manager's process.
The returned value will be a copy of the result of the call or a proxy to
a new shared object -- see documentation for the {method_to_typeid}
argument of BaseManager.register.
If an exception is raised by the call, then it is re-raised by
_callmethod. If some other exception is raised in the manager's
process then this is converted into a RemoteError exception and is
raised by _callmethod.
Note in particular that an exception will be raised if {methodname} has
not been {exposed}.
An example of the usage of _callmethod: >
>>> l = manager.list(range(10))
>>> l._callmethod('__len__')
10
>>> l._callmethod('__getslice__', (2, 7)) # equiv to `l[2:7]`
[2, 3, 4, 5, 6]
>>> l._callmethod('__getitem__', (20,)) # equiv to `l[20]`
Traceback (most recent call last):
...
IndexError: list index out of range
<
_getvalue()~
Return a copy of the referent.
If the referent is unpicklable then this will raise an exception.
__repr__~
Return a representation of the proxy object.
__str__~
Return the representation of the referent.
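The advice about comparing through a copy of the referent can be sketched with _getvalue. As an assumption made to keep the example in one process, the SyncManager server below runs in a daemon thread and the client attaches to it with connect; the key ``b'abc'`` is arbitrary.

```python
import threading
from multiprocessing.managers import SyncManager

# Serve a SyncManager from a background thread (port 0: the OS picks a port).
server = SyncManager(address=('127.0.0.1', 0), authkey=b'abc').get_server()
thread = threading.Thread(target=server.serve_forever)
thread.daemon = True
thread.start()

manager = SyncManager(address=server.address, authkey=b'abc')
manager.connect()

shared = manager.list([1, 2, 3])
same_by_proxy = (shared == [1, 2, 3])              # proxy is not compared by value
same_by_copy = (shared._getvalue() == [1, 2, 3])   # compare a copy of the referent
length = shared._callmethod('__len__')             # what len(shared) does underneath
shared._callmethod('append', (4,))                 # positional args go in a tuple
```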
Cleanup
>>>>>>>
A proxy object uses a weakref callback so that when it gets garbage collected it
deregisters itself from the manager which owns its referent.
A shared object gets deleted from the manager process when there are no longer
any proxies referring to it.
Process Pools
~~~~~~~~~~~~~
==============================================================================
*py2stdlib-multiprocessing.pool*
multiprocessing.pool~
:synopsis: Create pools of processes.
One can create a pool of processes which will carry out tasks submitted to it
with the Pool class.
multiprocessing.Pool([processes[, initializer[, initargs[, maxtasksperchild]]]])~
A process pool object which controls a pool of worker processes to which jobs
can be submitted. It supports asynchronous results with timeouts and
callbacks and has a parallel map implementation.
{processes} is the number of worker processes to use. If {processes} is
``None`` then the number returned by cpu_count is used. If
{initializer} is not ``None`` then each worker process will call
``initializer(*initargs)`` when it starts.
{maxtasksperchild} is the number of tasks a worker process can complete
before it will exit and be replaced with a fresh worker process, to enable
unused resources to be freed. The default {maxtasksperchild} is ``None``, which
means worker processes will live as long as the pool.
.. note:: >
Worker processes within a Pool typically live for the complete
duration of the Pool's work queue. A frequent pattern found in other
systems (such as Apache, mod_wsgi, etc) to free resources held by
workers is to allow a worker within a pool to complete only a set
amount of work before exiting, being cleaned up and a new
process spawned to replace the old one. The {maxtasksperchild}
argument to the Pool exposes this ability to the end user.
<
apply(func[, args[, kwds]])~
Equivalent of the apply built-in function. It blocks until the
result is ready, so apply_async is better suited
for performing work in parallel. Additionally, the passed-in
function is only executed in one of the workers of the pool.
apply_async(func[, args[, kwds[, callback]]])~
A variant of the apply method which returns a result object.
If {callback} is specified then it should be a callable which accepts a
single argument. When the result becomes ready {callback} is applied to
it (unless the call failed). {callback} should complete immediately since
otherwise the thread which handles the results will get blocked.
map(func, iterable[, chunksize])~
A parallel equivalent of the map built-in function (it supports only
one {iterable} argument though). It blocks till the result is ready.
This method chops the iterable into a number of chunks which it submits to
the process pool as separate tasks. The (approximate) size of these
chunks can be specified by setting {chunksize} to a positive integer.
map_async(func, iterable[, chunksize[, callback]])~
A variant of the map method which returns a result object.
If {callback} is specified then it should be a callable which accepts a
single argument. When the result becomes ready {callback} is applied to
it (unless the call failed). {callback} should complete immediately since
otherwise the thread which handles the results will get blocked.
imap(func, iterable[, chunksize])~
An equivalent of itertools.imap.
The {chunksize} argument is the same as the one used by the map
method. For very long iterables using a large value for {chunksize} can
make the job complete {much} faster than using the default value of
``1``.
Also if {chunksize} is ``1`` then the next method of the iterator
returned by the imap method has an optional {timeout} parameter:
``next(timeout)`` will raise multiprocessing.TimeoutError if the
result cannot be returned within {timeout} seconds.
imap_unordered(func, iterable[, chunksize])~
The same as imap except that the ordering of the results from the
returned iterator should be considered arbitrary. (Only when there is
only one worker process is the order guaranteed to be "correct".)
close()~
Prevents any more tasks from being submitted to the pool. Once all the
tasks have been completed the worker processes will exit.
terminate()~
Stops the worker processes immediately without completing outstanding
work. When the pool object is garbage collected terminate will be
called immediately.
join()~
Wait for the worker processes to exit. One must call close or
terminate before using join.
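The three mapping methods can be compared side by side. To keep this sketch runnable without a ``__main__`` guard it uses the thread-backed Pool from multiprocessing.dummy, which offers the same methods; that substitution, the ``square`` function and the chunk size of 5 are all choices made for illustration.

```python
from multiprocessing.dummy import Pool   # thread-backed stand-in for Pool

def square(x):
    return x * x

pool = Pool(4)
expected = [square(x) for x in range(20)]

# map blocks until every result is ready and keeps the input order.
mapped = pool.map(square, range(20), 5)

# imap returns an iterator; results still arrive in input order.
lazy = list(pool.imap(square, range(20), chunksize=5))

# imap_unordered may yield in any order, so sort before comparing.
loose = sorted(pool.imap_unordered(square, range(20), chunksize=5))

pool.close()    # no more tasks will be submitted
pool.join()     # wait for the workers to exit
```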
AsyncResult~
The class of the result returned by Pool.apply_async and
Pool.map_async.
get([timeout])~
Return the result when it arrives. If {timeout} is not ``None`` and the
result does not arrive within {timeout} seconds then
multiprocessing.TimeoutError is raised. If the remote call raised
an exception then that exception will be reraised by get.
wait([timeout])~
Wait until the result is available or until {timeout} seconds pass.
ready()~
Return whether the call has completed.
successful()~
Return whether the call completed without raising an exception. Will
raise AssertionError if the result is not ready.
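The AsyncResult methods can be exercised as follows. This is a sketch under the same assumption as before: the thread-backed Pool from multiprocessing.dummy stands in for a process pool, and the timeouts are arbitrary values picked for the example.

```python
import time
from multiprocessing import TimeoutError
from multiprocessing.dummy import Pool   # thread-backed stand-in for Pool

pool = Pool(2)

# A successful call: the callback receives the result once it is ready.
seen = []
ok = pool.apply_async(pow, (2, 10), callback=seen.append)
ok.wait(5)

# A failing call: get() re-raises the exception raised by the task.
bad = pool.apply_async(int, ('not a number',))
bad.wait(5)
try:
    bad.get()
    reraised = False
except ValueError:
    reraised = True

# A slow call: get() gives up after the timeout.
slow = pool.apply_async(time.sleep, (0.5,))
try:
    slow.get(timeout=0.01)
    timed_out = False
except TimeoutError:
    timed_out = True

pool.close()
pool.join()
```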
The following example demonstrates the use of a pool:: >
from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)              # start 4 worker processes
    result = pool.apply_async(f, (10,))   # evaluate "f(10)" asynchronously
    print result.get(timeout=1)           # prints "100" unless your computer is very slow
    print pool.map(f, range(10))          # prints "[0, 1, 4,..., 81]"
    it = pool.imap(f, range(10))
    print it.next()                       # prints "0"
    print it.next()                       # prints "1"
    print it.next(timeout=1)              # prints "4" unless your computer is very slow
    import time
    result = pool.apply_async(time.sleep, (10,))
    print result.get(timeout=1)           # raises TimeoutError
<
Listeners and Clients
==============================================================================
*py2stdlib-multiprocessing.connection*
multiprocessing.connection~
:synopsis: API for dealing with sockets.
Usually message passing between processes is done using queues or by using
Connection objects returned by Pipe.
However, the multiprocessing.connection (|py2stdlib-multiprocessing.connection|) module allows some extra
flexibility. It basically gives a high level message oriented API for dealing
with sockets or Windows named pipes, and also has support for *digest
authentication* using the hmac (|py2stdlib-hmac|) module.
deliver_challenge(connection, authkey)~
Send a randomly generated message to the other end of the connection and wait
for a reply.
If the reply matches the digest of the message using {authkey} as the key
then a welcome message is sent to the other end of the connection. Otherwise
AuthenticationError is raised.
answer_challenge(connection, authkey)~
Receive a message, calculate the digest of the message using {authkey} as the
key, and then send the digest back.
If a welcome message is not received, then AuthenticationError is
raised.
Client(address[, family[, authenticate[, authkey]]])~
Attempt to set up a connection to the listener which is using address
{address}, returning a multiprocessing.Connection.
The type of the connection is determined by {family} argument, but this can
generally be omitted since it can usually be inferred from the format of
{address}. (See multiprocessing-address-formats)
If {authenticate} is ``True`` or {authkey} is a string then digest
authentication is used. The key used for authentication will be either
{authkey} or ``current_process().authkey`` if {authkey} is ``None``.
If authentication fails then AuthenticationError is raised. See
multiprocessing-auth-keys.
Listener([address[, family[, backlog[, authenticate[, authkey]]]]])~
A wrapper for a bound socket or Windows named pipe which is 'listening' for
connections.
{address} is the address to be used by the bound socket or named pipe of the
listener object.
.. note:: >
If an address of '0.0.0.0' is used, the address will not be a connectable
end point on Windows. If you require a connectable end-point,
you should use '127.0.0.1'.
<
{family} is the type of socket (or named pipe) to use. This can be one of
the strings ``'AF_INET'`` (for a TCP socket), ``'AF_UNIX'`` (for a Unix
domain socket) or ``'AF_PIPE'`` (for a Windows named pipe). Of these only
the first is guaranteed to be available. If {family} is ``None`` then the
family is inferred from the format of {address}. If {address} is also
``None`` then a default is chosen. This default is the family which is
assumed to be the fastest available. See
multiprocessing-address-formats. Note that if {family} is
``'AF_UNIX'`` and address is ``None`` then the socket will be created in a
private temporary directory created using tempfile.mkstemp.
If the listener object uses a socket then {backlog} (1 by default) is passed
to the listen method of the socket once it has been bound.
If {authenticate} is ``True`` (``False`` by default) or {authkey} is not
``None`` then digest authentication is used.
If {authkey} is a string then it will be used as the authentication key;
otherwise it must be ``None``.
If {authkey} is ``None`` and {authenticate} is ``True`` then
``current_process().authkey`` is used as the authentication key. If
{authkey} is ``None`` and {authenticate} is ``False`` then no
authentication is done. If authentication fails then
AuthenticationError is raised. See multiprocessing-auth-keys.
accept()~
Accept a connection on the bound socket or named pipe of the listener
object and return a Connection object. If authentication is
attempted and fails, then AuthenticationError is raised.
close()~
Close the bound socket or named pipe of the listener object. This is
called automatically when the listener is garbage collected. However it
is advisable to call it explicitly.
Listener objects have the following read-only properties:
address~
The address which is being used by the Listener object.
last_accepted~
The address from which the last accepted connection came. If this is
unavailable then it is ``None``.
The module defines the following exception:
AuthenticationError~
Exception raised when there is an authentication error.
Examples
The following server code creates a listener which uses ``'secret password'`` as
an authentication key. It then waits for a connection and sends some data to
the client:: >
from multiprocessing.connection import Listener
from array import array
address = ('localhost', 6000) # family is deduced to be 'AF_INET'
listener = Listener(address, authkey='secret password')
conn = listener.accept()
print 'connection accepted from', listener.last_accepted
conn.send([2.25, None, 'junk', float])
conn.send_bytes('hello')
conn.send_bytes(array('i', [42, 1729]))
conn.close()
listener.close()
<
The following code connects to the server and receives some data from the
server:: >
from multiprocessing.connection import Client
from array import array
address = ('localhost', 6000)
conn = Client(address, authkey='secret password')
print conn.recv() # => [2.25, None, 'junk', float]
print conn.recv_bytes() # => 'hello'
arr = array('i', [0, 0, 0, 0, 0])
print conn.recv_bytes_into(arr) # => 8
print arr # => array('i', [42, 1729, 0, 0, 0])
conn.close()
<
Address Formats
>>>>>>>>>>>>>>>
* An ``'AF_INET'`` address is a tuple of the form ``(hostname, port)`` where
{hostname} is a string and {port} is an integer.
* An ``'AF_UNIX'`` address is a string representing a filename on the
filesystem.
* An ``'AF_PIPE'`` address is a string of the form
r'\\.\pipe\{PipeName}'. To use Client to connect to a named
pipe on a remote computer called {ServerName} one should use an address of the
form r'\\{ServerName}\pipe\{PipeName}' instead.
Note that any string beginning with two backslashes is assumed by default to be
an ``'AF_PIPE'`` address rather than an ``'AF_UNIX'`` address.
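The inference rules above can be written out as a few lines of code. ``infer_family`` is a hypothetical helper defined here for illustration; it is not presented as part of the multiprocessing API, and the sample addresses are made up.

```python
def infer_family(address):
    """Hypothetical helper mirroring the address rules listed above."""
    if isinstance(address, tuple):
        return 'AF_INET'                 # (hostname, port)
    if isinstance(address, str) and address.startswith('\\\\'):
        return 'AF_PIPE'                 # begins with two backslashes
    if isinstance(address, str):
        return 'AF_UNIX'                 # any other string is a filename
    raise ValueError('unrecognised address: %r' % (address,))

inet = infer_family(('localhost', 6000))
pipe = infer_family(r'\\.\pipe\demo')
unix = infer_family('/tmp/demo.sock')
```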
Authentication keys
~~~~~~~~~~~~~~~~~~~
When one uses Connection.recv, the data received is automatically
unpickled. Unfortunately unpickling data from an untrusted source is a security
risk. Therefore Listener and Client use the hmac (|py2stdlib-hmac|) module
to provide digest authentication.
An authentication key is a string which can be thought of as a password: once a
connection is established both ends will demand proof that the other knows the
authentication key. (Demonstrating that both ends are using the same key does
{not} involve sending the key over the connection.)
If authentication is requested but no authentication key is specified then the
return value of ``current_process().authkey`` is used (see
multiprocessing.Process). This value will be automatically inherited by
any multiprocessing.Process object that the current process creates.
This means that (by default) all processes of a multi-process program will share
a single authentication key which can be used when setting up connections
between themselves.
Suitable authentication keys can also be generated by using os.urandom.
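A key generated with os.urandom can be tried out with a listener and two clients, one of which presents the wrong key. This is a sketch only: the listener runs in a thread of the same process, and the 32-byte key length, the loopback address and the helper name ``serve`` are assumptions made for the example.

```python
import os
import threading
from multiprocessing import AuthenticationError
from multiprocessing.connection import Client, Listener

key = os.urandom(32)                                  # random shared secret
listener = Listener(('127.0.0.1', 0), authkey=key)    # port 0: OS picks a port
outcomes = []

def serve(attempts):
    # Accept a fixed number of connection attempts, recording each outcome.
    for _ in range(attempts):
        try:
            conn = listener.accept()
        except AuthenticationError:
            outcomes.append('rejected')
        else:
            conn.send('welcome')
            conn.close()
            outcomes.append('accepted')

server = threading.Thread(target=serve, args=(2,))
server.start()

# First client knows the key and gets through.
good = Client(listener.address, authkey=key)
greeting = good.recv()
good.close()

# Second client guesses a different key and is refused on both ends.
try:
    Client(listener.address, authkey=os.urandom(32))
    client_rejected = False
except AuthenticationError:
    client_rejected = True

server.join()
listener.close()
```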
Logging
~~~~~~~
Some support for logging is available. Note, however, that the logging (|py2stdlib-logging|)
package does not use process shared locks so it is possible (depending on the
handler type) for messages from different processes to get mixed up.
get_logger()~
Returns the logger used by multiprocessing (|py2stdlib-multiprocessing|). If necessary, a new one
will be created.
When first created the logger has level logging.NOTSET and no
default handler. Messages sent to this logger will not by default propagate
to the root logger.
Note that on Windows child processes will only inherit the level of the
parent process's logger -- any other customization of the logger will not be
inherited.
log_to_stderr()~
This function performs a call to get_logger but in addition to
returning the logger created by get_logger, it adds a handler which sends
output to sys.stderr using format
``'[%(levelname)s/%(processName)s] %(message)s'``.
Below is an example session with logging turned on:: >
>>> import multiprocessing, logging
>>> logger = multiprocessing.log_to_stderr()
>>> logger.setLevel(logging.INFO)
>>> logger.warning('doomed')
[WARNING/MainProcess] doomed
>>> m = multiprocessing.Manager()
[INFO/SyncManager-...] child process calling self.run()
[INFO/SyncManager-...] created temp directory /.../pymp-...
[INFO/SyncManager-...] manager serving at '/.../listener-...'
>>> del m
[INFO/MainProcess] sending shutdown message to manager
[INFO/SyncManager-...] manager exiting with exitcode 0
<
In addition to having these two logging functions, the multiprocessing module
also exposes two additional logging level attributes. These are SUBWARNING
and SUBDEBUG. The table below illustrates where these fit in the
normal level hierarchy.
+----------------+----------------+
| Level | Numeric value |
+================+================+
| ``SUBWARNING`` | 25 |
+----------------+----------------+
| ``SUBDEBUG`` | 5 |
+----------------+----------------+
For a full table of logging levels, see the logging (|py2stdlib-logging|) module.
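To see where the two extra levels sit relative to the standard ones, their numeric values can be sorted together. The dictionary below is just an illustration; it uses the SUBDEBUG and SUBWARNING attributes named above alongside the standard logging constants.

```python
import logging
import multiprocessing

levels = {
    'SUBDEBUG': multiprocessing.SUBDEBUG,
    'DEBUG': logging.DEBUG,
    'INFO': logging.INFO,
    'SUBWARNING': multiprocessing.SUBWARNING,
    'WARNING': logging.WARNING,
}
ordered = sorted(levels, key=levels.get)    # lowest severity first

# With the threshold at SUBWARNING, INFO is filtered out but WARNING passes.
logger = multiprocessing.get_logger()
logger.setLevel(multiprocessing.SUBWARNING)
info_enabled = logger.isEnabledFor(logging.INFO)
warning_enabled = logger.isEnabledFor(logging.WARNING)
```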
These additional logging levels are used primarily for certain debug messages
within the multiprocessing module. Below is the same example as above, except
with SUBDEBUG enabled:: >
>>> import multiprocessing, logging
>>> logger = multiprocessing.log_to_stderr()
>>> logger.setLevel(multiprocessing.SUBDEBUG)
>>> logger.warning('doomed')
[WARNING/MainProcess] doomed
>>> m = multiprocessing.Manager()
[INFO/SyncManager-...] child process calling self.run()
[INFO/SyncManager-...] created temp directory /.../pymp-...
[INFO/SyncManager-...] manager serving at '/.../pymp-djGBXN/listener-...'
>>> del m
[SUBDEBUG/MainProcess] finalizer calling ...
[INFO/MainProcess] sending shutdown message to manager
[DEBUG/SyncManager-...] manager received shutdown message
[SUBDEBUG/SyncManager-...] calling <Finalize object, callback=unlink, ...
[SUBDEBUG/SyncManager-...] finalizer calling <built-in function unlink> ...
[SUBDEBUG/SyncManager-...] calling <Finalize object, dead>
[SUBDEBUG/SyncManager-...] finalizer calling <function rmtree at 0x5aa730> ...
[INFO/SyncManager-...] manager exiting with exitcode 0
<
The multiprocessing.dummy (|py2stdlib-multiprocessing.dummy|) module
==============================================================================
*py2stdlib-multiprocessing.dummy*
multiprocessing.dummy~
:synopsis: Dumb wrapper around threading.
multiprocessing.dummy (|py2stdlib-multiprocessing.dummy|) replicates the API of multiprocessing (|py2stdlib-multiprocessing|) but is
no more than a wrapper around the threading (|py2stdlib-threading|) module.
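A short sketch shows what the wrapper means in practice: the Process class exported by multiprocessing.dummy is a thread, and its Pool runs tasks in threads of the current process. The ``work`` function and the sample inputs are made up for illustration.

```python
import threading
from multiprocessing.dummy import Pool, Process, Queue

results = Queue()

def work(n):
    results.put(n * 2)

# Same calling convention as multiprocessing.Process, but backed by a thread.
worker = Process(target=work, args=(21,))
worker.start()
worker.join()
doubled = results.get()

# The Pool API is available too, served by worker threads.
pool = Pool(2)
lengths = pool.map(len, ['a', 'bb', 'ccc'])
pool.close()
pool.join()
```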
Programming guidelines
----------------------
There are certain guidelines and idioms which should be adhered to when using
multiprocessing (|py2stdlib-multiprocessing|).
All platforms
~~~~~~~~~~~~~
Avoid shared state
As far as possible one should try to avoid shifting large amounts of data
between processes.
It is probably best to stick to using queues or pipes for communication
between processes rather than using the lower level synchronization
primitives from the threading (|py2stdlib-threading|) module.
Picklability
Ensure that the arguments to the methods of proxies are picklable.
Thread safety of proxies
Do not use a proxy object from more than one thread unless you protect it
with a lock.
(There is never a problem with different processes using the {same} proxy.)
Joining zombie processes
On Unix when a process finishes but has not been joined it becomes a zombie.
There should never be very many because each time a new process starts (or
active_children is called) all completed processes which have not
yet been joined will be joined. Also calling a finished process's
Process.is_alive will join the process. Even so it is probably good
practice to explicitly join all the processes that you start.
Better to inherit than pickle/unpickle
On Windows many types from multiprocessing (|py2stdlib-multiprocessing|) need to be picklable so
that child processes can use them. However, one should generally avoid
sending shared objects to other processes using pipes or queues. Instead
you should arrange the program so that a process which needs access to a
shared resource created elsewhere can inherit it from an ancestor process.
Avoid terminating processes
Using the Process.terminate method to stop a process is liable to
cause any shared resources (such as locks, semaphores, pipes and queues)
currently being used by the process to become broken or unavailable to other
processes.
Therefore it is probably best to only consider using
Process.terminate on processes which never use any shared resources.
Joining processes that use queues
Bear in mind that a process that has put items in a queue will wait before
terminating until all the buffered items are fed by the "feeder" thread to
the underlying pipe. (The child process can call the
Queue.cancel_join_thread method of the queue to avoid this behaviour.)
This means that whenever you use a queue you need to make sure that all
items which have been put on the queue will eventually be removed before the
process is joined. Otherwise you cannot be sure that processes which have
put items on the queue will terminate. Remember also that non-daemonic
processes will be joined automatically.
An example which will deadlock is the following:: >
    from multiprocessing import Process, Queue

    def f(q):
        q.put('X' * 1000000)

    if __name__ == '__main__':
        queue = Queue()
        p = Process(target=f, args=(queue,))
        p.start()
        p.join()                    # this deadlocks
        obj = queue.get()
<
A fix here would be to swap the last two lines round (or simply remove the
``p.join()`` line).
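The safe ordering can be sketched as a small helper that removes every item
before joining. The ``mp`` parameter is an assumption made for this example:
it lets the same code run against ``multiprocessing.dummy``, which offers the
same API on top of threads.

```python
import multiprocessing
import multiprocessing.dummy  # thread-backed module with the same API


def producer(q, n_items):
    # put() only buffers the item; a feeder thread writes it to the pipe later.
    for _ in range(n_items):
        q.put('X' * 1000)


def drain_then_join(n_items, mp=multiprocessing):
    # mp is a hypothetical parameter: any module offering Queue and Process.
    q = mp.Queue()
    p = mp.Process(target=producer, args=(q, n_items))
    p.start()
    # Remove every item first, so the producer has nothing left to flush ...
    received = [q.get() for _ in range(n_items)]
    # ... and only then join it.
    p.join()
    return len(received)
```

With real processes, call ``drain_then_join(100)`` from code protected by
``if __name__ == '__main__':``. Passing ``mp=multiprocessing.dummy`` runs the
same logic with threads.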
Explicitly pass resources to child processes
On Unix a child process can make use of a shared resource created in a
parent process using a global resource. However, it is better to pass the
object as an argument to the constructor for the child process.
Apart from making the code (potentially) compatible with Windows this also
ensures that as long as the child process is still alive the object will not
be garbage collected in the parent process. This might be important if some
resource is freed when the object is garbage collected in the parent
process.
So for instance :: >
    from multiprocessing import Process, Lock

    def f():
        # ... do something using "lock" ...
        pass

    if __name__ == '__main__':
        lock = Lock()
        for i in range(10):
            Process(target=f).start()
<
should be rewritten as:: >
    from multiprocessing import Process, Lock

    def f(l):
        # ... do something using "l" ...
        pass

    if __name__ == '__main__':
        lock = Lock()
        for i in range(10):
            Process(target=f, args=(lock,)).start()
<
Beware replacing sys.stdin with a "file like object"
multiprocessing (|py2stdlib-multiprocessing|) originally unconditionally called:: >
    os.close(sys.stdin.fileno())
<
in the multiprocessing.Process._bootstrap method --- this resulted
in issues with processes-in-processes. This has been changed to:: >
    sys.stdin.close()
    sys.stdin = open(os.devnull)
<
This solves the fundamental issue of processes colliding with each other
resulting in a bad file descriptor error, but introduces a potential danger
to applications which replace sys.stdin with a "file-like object"
with output buffering. This danger is that if multiple processes call
close() on this file-like object, it could result in the same
data being flushed to the object multiple times, resulting in corruption.
If you write a file-like object and implement your own caching, you can
make it fork-safe by storing the pid whenever you append to the cache,
and discarding the cache when the pid changes. For example:: >
    @property
    def cache(self):
        pid = os.getpid()
        if pid != self._pid:
            self._pid = pid
            self._cache = []
        return self._cache
<
For more information, see issues 5155, 5313 and 5331.
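A fuller sketch of the same idea follows. ``ForkSafeBuffer`` and its ``sink``
argument are hypothetical names chosen for this example. The class only
illustrates the pid check and is not a complete file-like object.

```python
import os


class ForkSafeBuffer(object):
    # Hypothetical write-buffering object whose cache belongs to one process.

    def __init__(self, sink):
        self._sink = sink            # any object with a write() method
        self._pid = os.getpid()
        self._cache = []

    @property
    def cache(self):
        pid = os.getpid()
        if pid != self._pid:
            # Running in a forked child: drop data inherited from the parent.
            self._pid = pid
            self._cache = []
        return self._cache

    def write(self, data):
        self.cache.append(data)

    def flush(self):
        pending = self.cache
        self._sink.write(''.join(pending))
        del pending[:]
```

Because every access goes through the ``cache`` property, data buffered by the
parent is written at most once, by the parent itself.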
Windows
~~~~~~~
Since Windows lacks os.fork it has a few extra restrictions:
More picklability
Ensure that all arguments to Process.__init__ are picklable. This
means, in particular, that bound or unbound methods cannot be used directly
as the ``target`` argument on Windows --- just define a function and use
that instead.
Also, if you subclass Process then make sure that instances will be
picklable when the Process.start method is called.
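One way to catch such problems early is to try pickling the arguments before
creating the process. The helper name ``is_picklable`` is invented for this
sketch.

```python
import pickle
import threading


def is_picklable(obj):
    # Try to pickle obj; any failure means it could not be sent to a child
    # process on a platform that lacks os.fork.
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


plain_args = (1, 'text', [1.5, 2.5])       # plain data pickles fine
bad_args = (lambda x: x, threading.Lock())  # neither of these can be pickled
```

Here ``is_picklable(plain_args)`` is true and ``is_picklable(bad_args)`` is
false.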
Global variables
Bear in mind that if code run in a child process tries to access a global
variable, then the value it sees (if any) may not be the same as the value
in the parent process at the time that Process.start was called.
However, global variables which are just module level constants cause no
problems.
Safe importing of main module
Make sure that the main module can be safely imported by a new Python
interpreter without causing unintended side effects (such as starting a new
process).
For example, under Windows running the following module would fail with a
RuntimeError:: >
    from multiprocessing import Process

    def foo():
        print 'hello'

    p = Process(target=foo)
    p.start()
<
Instead one should protect the "entry point" of the program by using ``if
__name__ == '__main__':`` as follows:: >
    from multiprocessing import Process, freeze_support

    def foo():
        print 'hello'

    if __name__ == '__main__':
        freeze_support()
        p = Process(target=foo)
        p.start()
<
(The ``freeze_support()`` line can be omitted if the program will be run
normally instead of frozen.)
This allows the newly spawned Python interpreter to safely import the module
and then run the module's ``foo()`` function.
Similar restrictions apply if a pool or manager is created in the main
module.
Examples
--------
The example sources are not reproduced in this file; each entry below names the
file that the original documentation includes.
Demonstration of how to create and use customized managers and proxies
(source file: ../includes/mp_newtype.py).
Using Pool (source file: ../includes/mp_pool.py).
Synchronization types like locks, conditions and queues (source file:
../includes/mp_synchronize.py).
An example showing how to use queues to feed tasks to a collection of worker
processes and collect the results (source file: ../includes/mp_workers.py).
An example of how a pool of worker processes can each run a
SimpleHTTPServer.HttpServer instance while sharing a single listening
socket (source file: ../includes/mp_webserver.py).
Some simple benchmarks comparing multiprocessing (|py2stdlib-multiprocessing|) with threading (|py2stdlib-threading|)
(source file: ../includes/mp_benchmarks.py).
==============================================================================
*py2stdlib-mutex*
mutex~
:synopsis: Lock and queue for mutual exclusion.
Deprecated since version 2.6: The mutex (|py2stdlib-mutex|) module has been removed in Python 3.0.
The mutex (|py2stdlib-mutex|) module defines a class that allows mutual-exclusion via
acquiring and releasing locks. It does not require (or imply)
threading (|py2stdlib-threading|) or multi-tasking, though it could be useful for those
purposes.
The mutex (|py2stdlib-mutex|) module defines the following class:
mutex()~
Create a new (unlocked) mutex.
A mutex has two pieces of state --- a "locked" bit and a queue. When the mutex
is not locked, the queue is empty. Otherwise, the queue contains zero or more
``(function, argument)`` pairs representing functions (or methods) waiting to
acquire the lock. When the mutex is unlocked while the queue is not empty, the
first queue entry is removed and its ``function(argument)`` pair called,
implying it now has the lock.
Of course, no multi-threading is implied -- hence the funny interface for
lock, where a function is called once the lock is acquired.
Mutex Objects
-------------
mutex (|py2stdlib-mutex|) objects have the following methods:
mutex.test()~
Check whether the mutex is locked.
mutex.testandset()~
"Atomic" test-and-set: grab the lock if it is not set and return ``True``;
otherwise, return ``False``.
mutex.lock(function, argument)~
Execute ``function(argument)``, unless the mutex is locked. In the case it is
locked, place the function and argument on the queue. See unlock for
explanation of when ``function(argument)`` is executed in that case.
mutex.unlock()~
Unlock the mutex if the queue is empty, otherwise execute the first element in the
queue.
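The behaviour described above can be sketched with a small stand-in class.
``SketchMutex`` is written for this example from the description of the four
methods; it is not the source of the mutex module.

```python
from collections import deque


class SketchMutex(object):
    # Illustrative stand-in that follows the method descriptions above.

    def __init__(self):
        self.locked = False
        self.queue = deque()

    def test(self):
        return self.locked

    def testandset(self):
        if self.locked:
            return False
        self.locked = True
        return True

    def lock(self, function, argument):
        if self.testandset():
            function(argument)
        else:
            self.queue.append((function, argument))

    def unlock(self):
        if self.queue:
            function, argument = self.queue.popleft()
            function(argument)      # the lock passes to the queued caller
        else:
            self.locked = False


log = []
m = SketchMutex()
m.lock(log.append, 'first')     # lock is free: runs at once
m.lock(log.append, 'second')    # lock is held: the call is queued
m.unlock()                      # runs the queued call; still locked
m.unlock()                      # queue is empty: the mutex becomes unlocked
```

After the first ``unlock()`` the queued call has run and the mutex is still
locked; only the second ``unlock()`` releases it.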
==============================================================================
*py2stdlib-macerrors*
macerrors~
:platform: Mac
:synopsis: Constant definitions for many Mac OS error codes.
macerrors (|py2stdlib-macerrors|) contains constant definitions for many Mac OS error codes.
Deprecated since version 2.6.
==============================================================================
*py2stdlib-macresource*
macresource~
:platform: Mac
:synopsis: Locate script resources.
macresource (|py2stdlib-macresource|) helps scripts find their resources, such as dialogs and
menus, without requiring special-case code for when the script is run under
MacPython, as a MacPython applet or under OSX Python.
Deprecated since version 2.6.
==============================================================================
*py2stdlib-netrc*
netrc~
:synopsis: Loading of .netrc files.
New in version 1.5.2.
The netrc (|py2stdlib-netrc|) class parses and encapsulates the netrc file format used by
the Unix ftp program and other FTP clients.
netrc([file])~
A netrc (|py2stdlib-netrc|) instance or subclass instance encapsulates data from a netrc
file. The initialization argument, if present, specifies the file to parse. If
no argument is given, the file .netrc in the user's home directory will
be read. Parse errors will raise NetrcParseError with diagnostic
information including the file name, line number, and terminating token.
NetrcParseError~
Exception raised by the netrc (|py2stdlib-netrc|) class when syntactical errors are
encountered in source text. Instances of this exception provide three
interesting attributes: msg is a textual explanation of the error,
filename is the name of the source file, and lineno gives the
line number on which the error was found.
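A minimal sketch of catching the exception and reading its attributes. The
helper names ``parse_text`` and ``describe_error`` and the sample host are
assumptions made for this example.

```python
import netrc
import os
import tempfile


def parse_text(text):
    # netrc.netrc() takes a file name, so write the text to a temporary file.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, 'w') as f:
            f.write(text)
        return netrc.netrc(path)
    finally:
        os.remove(path)


def describe_error(text):
    # Return (msg, lineno) for text that fails to parse, else None.
    try:
        parse_text(text)
    except netrc.NetrcParseError as err:
        return err.msg, err.lineno
    return None


# "bogus" is not a keyword that may follow a machine name.
problem = describe_error('machine ftp.example.com bogus alice\n')
```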
netrc Objects
-------------
A netrc (|py2stdlib-netrc|) instance has the following methods:
netrc.authenticators(host)~
Return a 3-tuple ``(login, account, password)`` of authenticators for {host}.
If the netrc file did not contain an entry for the given host, return the tuple
associated with the 'default' entry. If neither matching host nor default entry
is available, return ``None``.
netrc.__repr__()~
Dump the class data as a string in the format of a netrc file. (This discards
comments and may reorder the entries.)
Instances of netrc (|py2stdlib-netrc|) have public instance variables:
netrc.hosts~
Dictionary mapping host names to ``(login, account, password)`` tuples. The
'default' entry, if any, is represented as a pseudo-host by that name.
netrc.macros~
Dictionary mapping macro names to string lists.
Note:
Passwords are limited to a subset of the ASCII character set. Versions of
this module prior to 2.3 were extremely limited. Starting with 2.3, all
ASCII punctuation is allowed in passwords. However, note that whitespace and
non-printable characters are not allowed in passwords. This is a limitation
of the way the .netrc file is parsed and may be removed in the future.
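The following sketch parses a small netrc file and looks up credentials. The
host names, logins and the helper ``load_netrc_text`` are invented for the
example.

```python
import netrc
import os
import tempfile

SAMPLE = (
    "machine ftp.example.com login alice password s3cret\n"
    "default login anonymous password guest\n"
)


def load_netrc_text(text):
    # netrc.netrc() takes a file name, so write the text to a temporary file.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, 'w') as f:
            f.write(text)
        return netrc.netrc(path)
    finally:
        os.remove(path)


info = load_netrc_text(SAMPLE)

# A host with its own entry.
login, account, password = info.authenticators('ftp.example.com')

# A host without an entry falls back to the 'default' entry.
fallback = info.authenticators('unknown.example.org')

# Without a 'default' entry, an unknown host yields None.
no_default = load_netrc_text("machine h.example.com login u password p\n")
missing = no_default.authenticators('unknown.example.org')
```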
==============================================================================
*py2stdlib-new*
new~
:synopsis: Interface to the creation of runtime implementation objects.
Deprecated since version 2.6: The new (|py2stdlib-new|) module has been removed in Python 3.0. Use the
types (|py2stdlib-types|) module's classes instead.
The new (|py2stdlib-new|) module allows an interface to the interpreter object creation
functions. This is for use primarily in marshal-type functions, when a new
object needs to be created "magically" and not by using the regular creation
functions. This module provides a low-level interface to the interpreter, so
care must be exercised when using this module. It is possible to supply
nonsensical arguments which crash the interpreter when the object is used.
The new (|py2stdlib-new|) module defines the following functions:
instance(class[, dict])~
This function creates an instance of {class} with dictionary {dict} without
calling the __init__ constructor. If {dict} is omitted or ``None``, a
new, empty dictionary is created for the new instance. Note that there are no
guarantees that the object will be in a consistent state.
instancemethod(function, instance, class)~
This function will return a method object, bound to {instance}, or unbound if
{instance} is ``None``. {function} must be callable.
function(code, globals[, name[, argdefs[, closure]]])~
Returns a (Python) function with the given code and globals. If {name} is given,
it must be a string or ``None``. If it is a string, the function will have the
given name, otherwise the function name will be taken from ``code.co_name``. If
{argdefs} is given, it must be a tuple and will be used to determine the default
values of parameters. If {closure} is given, it must be ``None`` or a tuple of
cell objects containing objects to bind to the names in ``code.co_freevars``.
code(argcount, nlocals, stacksize, flags, codestring, constants, names, varnames, filename, name, firstlineno, lnotab)~
This function is an interface to the PyCode_New C function.
module(name[, doc])~
This function returns a new module object with name {name}. {name} must be a
string. The optional {doc} argument can have any type.
classobj(name, baseclasses, dict)~
This function returns a new class object, with name {name}, derived from
{baseclasses} (which should be a tuple of classes) and with namespace {dict}.
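Since the deprecation note points to the types module, here is a hedged sketch
of creating a module, a function and a class at run time without the new
module. The names ``template``, ``scratch``, ``add_ten`` and ``Point`` are
invented for the example, and the pairing of each call with a function from
the new module is this example's reading of the note above.

```python
import types


def template(x, y=1):
    return x + y


# types.ModuleType(name[, doc]) creates a module object, like module().
scratch = types.ModuleType('scratch', 'A module created at run time.')

# types.FunctionType(code, globals[, name[, argdefs[, closure]]]) builds a
# function from a code object, like function(); here the name and the
# default value of y are replaced.
add_ten = types.FunctionType(template.__code__, globals(), 'add_ten', (10,))
scratch.add_ten = add_ten

# The built-in type(name, bases, dict) creates a new-style class at run time.
Point = type('Point', (object,), {'dimensions': 2})
```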
==============================================================================
*py2stdlib-nis*
nis~
:platform: Unix
:synopsis: Interface to Sun's NIS (Yellow Pages) library.
The nis (|py2stdlib-nis|) module gives a thin wrapper around the NIS library, useful for
central administration of several hosts.
Because NIS exists only on Unix systems, this module is only available for Unix.
The nis (|py2stdlib-nis|) module defines the following functions:
match(key, mapname[, domain=default_domain])~
Return the match for {key} in map {mapname}, or raise an error
(nis.error) if there is none. Both should be strings; {key} is 8-bit
clean. The return value is an arbitrary array of bytes (it may contain ``NULL``
and other joys).
{mapname} is first checked to see whether it is an alias for another name.
Changed in version 2.5: The {domain} argument allows overriding the NIS domain
used for the lookup. If unspecified, the lookup is in the default NIS domain.
cat(mapname[, domain=default_domain])~
Return a dictionary mapping {key} to {value} such that ``match(key,
mapname)==value``. Note that both keys and values of the dictionary are
arbitrary arrays of bytes.
{mapname} is first checked to see whether it is an alias for another name.
Changed in version 2.5: The {domain} argument allows overriding the NIS domain
used for the lookup. If unspecified, the lookup is in the default NIS domain.
maps([domain=default_domain])~
Return a list of all valid maps.
Changed in version 2.5: The {domain} argument allows overriding the NIS domain
used for the lookup. If unspecified, the lookup is in the default NIS domain.
get_default_domain()~
Return the system default NIS domain.
New in version 2.5.
The nis (|py2stdlib-nis|) module defines the following exception:
error~
An error raised when a NIS function returns an error code.
==============================================================================
*py2stdlib-nntplib*
nntplib~
:synopsis: NNTP protocol client (requires sockets).
This module defines the class NNTP which implements the client side of
the NNTP protocol. It can be used to implement a news reader or poster, or
automated news processors. For more information on NNTP (Network News Transfer
Protocol), see Internet RFC 977.
Here are two small examples of how it can be used. To list some statistics
about a newsgroup and print the subjects of the last 10 articles:: >
    >>> from nntplib import NNTP
    >>> s = NNTP('news.cwi.nl')
    >>> resp, count, first, last, name = s.group('comp.lang.python')
    >>> print 'Group', name, 'has', count, 'articles, range', first, 'to', last
    Group comp.lang.python has 59 articles, range 3742 to 3803
    >>> resp, subs = s.xhdr('subject', first + '-' + last)
    >>> for id, sub in subs[-10:]: print id, sub
    ...
    3792 Re: Removing elements from a list while iterating...
    3793 Re: Who likes Info files?
    3794 Emacs and doc strings
    3795 a few questions about the Mac implementation
    3796 Re: executable python scripts
    3797 Re: executable python scripts
    3798 Re: a few questions about the Mac implementation
    3799 Re: PROPOSAL: A Generic Python Object Interface for Python C Modules
    3802 Re: executable python scripts
    3803 Re: POSIX wait and SIGCHLD
    >>> s.quit()
    '205 news.cwi.nl closing connection. Goodbye.'
<
To post an article from a file (this assumes that the article has valid
headers):: >
    >>> from nntplib import NNTP
    >>> s = NNTP('news.cwi.nl')
    >>> f = open('/tmp/article')
    >>> s.post(f)
    '240 Article posted successfully.'
    >>> s.quit()
    '205 news.cwi.nl closing connection. Goodbye.'
<
The module itself defines the following items:
NNTP(host[, port [, user[, password [, readermode] [, usenetrc]]]])~
Return a new instance of the NNTP class, representing a connection
to the NNTP server running on host {host}, listening at port {port}. The
default {port} is 119. If the optional {user} and {password} are provided,
or if suitable credentials are present in ~/.netrc and the optional
flag {usenetrc} is true (the default), the ``AUTHINFO USER`` and ``AUTHINFO
PASS`` commands are used to identify and authenticate the user to the server.
If the optional flag {readermode} is true, then a ``mode reader`` command is
sent before authentication is performed. Reader mode is sometimes necessary
if you are connecting to an NNTP server on the local machine and intend to
call reader-specific commands, such as ``group``. If you get unexpected
NNTPPermanentErrors, you might need to set {readermode}.
{readermode} defaults to ``None``. {usenetrc} defaults to ``True``.
Changed in version 2.4: {usenetrc} argument added.
NNTPError~
Derived from the standard exception Exception, this is the base class for
all exceptions raised by the nntplib (|py2stdlib-nntplib|) module.
NNTPReplyError~
Exception raised when an unexpected reply is received from the server. For
backwards compatibility, the exception ``error_reply`` is equivalent to this
class.
NNTPTemporaryError~
Exception raised when an error code in the range 400--499 is received. For
backwards compatibility, the exception ``error_temp`` is equivalent to this
class.
NNTPPermanentError~
Exception raised when an error code in the range 500--599 is received. For
backwards compatibility, the exception ``error_perm`` is equivalent to this
class.
NNTPProtocolError~
Exception raised when a reply is received from the server that does not begin
with a digit in the range 1--5. For backwards compatibility, the exception
``error_proto`` is equivalent to this class.
NNTPDataError~
Exception raised when there is some error in the response data. For backwards
compatibility, the exception ``error_data`` is equivalent to this class.
NNTP Objects
------------
NNTP instances have the following methods. The {response} that is returned as
the first item in the return tuple of almost all methods is the server's
response: a string beginning with a three-digit code. If the server's response
indicates an error, the method raises one of the above exceptions.
NNTP.getwelcome()~
Return the welcome message sent by the server in reply to the initial
connection. (This message sometimes contains disclaimers or help information
that may be relevant to the user.)
NNTP.set_debuglevel(level)~
Set the instance's debugging level. This controls the amount of debugging
output printed. The default, ``0``, produces no debugging output. A value of
``1`` produces a moderate amount of debugging output, generally a single line
per request or response. A value of ``2`` or higher produces the maximum amount
of debugging output, logging each line sent and received on the connection
(including message text).
NNTP.newgroups(date, time, [file])~
Send a ``NEWGROUPS`` command. The {date} argument should be a string of the
form ``'yymmdd'`` indicating the date, and {time} should be a string of the form
``'hhmmss'`` indicating the time. Return a pair ``(response, groups)`` where
{groups} is a list of group names that are new since the given date and time. If
the {file} parameter is supplied, then the output of the ``NEWGROUPS`` command
is stored in a file. If {file} is a string, then the method will open a file
object with that name, write to it then close it. If {file} is a file object,
then it will start calling write on it to store the lines of the command
output. If {file} is supplied, then the returned {list} is an empty list.
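The two string arguments can be built with ``time.strftime``. The helper name
``newgroups_args`` is invented for this sketch, and which time zone the server
expects is left to the caller.

```python
import time


def newgroups_args(st):
    # st is a time.struct_time; return the ('yymmdd', 'hhmmss') pair.
    return time.strftime('%y%m%d', st), time.strftime('%H%M%S', st)


# A fixed moment, parsed from text so that the example is reproducible.
moment = time.strptime('2010-09-01 12:30:05', '%Y-%m-%d %H:%M:%S')
date, clock = newgroups_args(moment)
```

With an open connection ``s``, the pair could then be passed on as
``s.newgroups(date, clock)``.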
NNTP.newnews(group, date, time, [file])~
Send a ``NEWNEWS`` command. Here, {group} is a group name or ``'*'``, and
{date} and {time} have the same meaning as for newgroups. Return a pair
``(response, articles)`` where {articles} is a list of message ids. If the
{file} parameter is supplied, then the output of the ``NEWNEWS`` command is
stored in a file. If {file} is a string, then the method will open a file
object with that name, write to it then close it. If {file} is a file object,
then it will start calling write on it to store the lines of the command
output. If {file} is supplied, then the returned {list} is an empty list.
NNTP.list([file])~
Send a ``LIST`` command. Return a pair ``(response, list)`` where {list} is a
list of tuples. Each tuple has the form ``(group, last, first, flag)``, where
{group} is a group name, {last} and {first} are the last and first article
numbers (as strings), and {flag} is ``'y'`` if posting is allowed, ``'n'`` if
not, and ``'m'`` if the newsgroup is moderated. (Note the ordering: {last},
{first}.) If the {file} parameter is supplied, then the output of the ``LIST``
command is stored in a file. If {file} is a string, then the method will open
a file object with that name, write to it then close it. If {file} is a file
object, then it will start calling write on it to store the lines of the
command output. If {file} is supplied, then the returned {list} is an empty
list.
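Because the article numbers arrive as strings and {last} comes before {first},
a small helper can make the tuples easier to use. ``group_spans`` is a
hypothetical name, and the sample tuples are made up. The span it computes is
only an upper bound on the number of articles, since numbers in the range may
be unused.

```python
def group_spans(groups):
    # groups: list of (group, last, first, flag) tuples as returned by list().
    spans = {}
    for group, last, first, flag in groups:
        # Convert the numeric strings; an empty group has last < first.
        spans[group] = max(int(last) - int(first) + 1, 0)
    return spans


sample = [
    ('comp.lang.python', '3803', '3742', 'y'),
    ('misc.test', '0', '1', 'n'),
]
spans = group_spans(sample)
```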
NNTP.descriptions(grouppattern)~
Send a ``LIST NEWSGROUPS`` command, where {grouppattern} is a wildmat string as
specified in RFC 2980 (it's essentially the same as DOS or UNIX shell wildcard
strings). Return a pair ``(response, list)``, where {list} is a list of tuples
containing ``(name, title)``.
New in version 2.4.
NNTP.description(group)~
Get a description for a single group {group}. If more than one group matches
(if {group} is a real wildmat string), return the first match. If no group
matches, return an empty string.
This elides the response code from the server. If the response code is needed,
use descriptions.
New in version 2.4.
NNTP.group(name)~
Send a ``GROUP`` command, where {name} is the group name. Return a tuple
``(response, count, first, last, name)`` where {count} is the (estimated) number
of articles in the group, {first} is the first article number in the group,
{last} is the last article number in the group, and {name} is the group name.
The numbers are returned as strings.
NNTP.help([file])~
Send a ``HELP`` command. Return a pair ``(response, list)`` where {list} is a
list of help strings. If the {file} parameter is supplied, then the output of
the ``HELP`` command is stored in a file. If {file} is a string, then the
method will open a file object with that name, write to it then close it. If
{file} is a file object, then it will start calling write on it to store
the lines of the command output. If {file} is supplied, then the returned {list}
is an empty list.
NNTP.stat(id)~
Send a ``STAT`` command, where {id} is the message id (enclosed in ``'<'`` and
``'>'``) or an article number (as a string). Return a triple ``(response,
number, id)`` where {number} is the article number (as a string) and {id} is the
message id (enclosed in ``'<'`` and ``'>'``).
NNTP.next()~
Send a ``NEXT`` command. Return as for stat.
NNTP.last()~
Send a ``LAST`` command. Return as for stat.
NNTP.head(id)~
Send a ``HEAD`` command, where {id} has the same meaning as for stat.
Return a tuple ``(response, number, id, list)`` where the first three are the
same as for stat, and {list} is a list of the article's headers (an
uninterpreted list of lines, without trailing newlines).
NNTP.body(id,[file])~
Send a ``BODY`` command, where {id} has the same meaning as for stat.
If the {file} parameter is supplied, then the body is stored in a file. If
{file} is a string, then the method will open a file object with that name,
write to it then close it. If {file} is a file object, then it will start
calling write on it to store the lines of the body. Return as for
head. If {file} is supplied, then the returned {list} is an empty list.
NNTP.article(id)~
Send an ``ARTICLE`` command, where {id} has the same meaning as for
stat. Return as for head.
NNTP.slave()~
Send a ``SLAVE`` command. Return the server's {response}.
NNTP.xhdr(header, string, [file])~
Send an ``XHDR`` command. This command is not defined in the RFC but is a
common extension. The {header} argument is a header keyword, e.g.
``'subject'``. The {string} argument should have the form ``'first-last'``
where {first} and {last} are the first and last article numbers to search.
Return a pair ``(response, list)``, where {list} is a list of pairs ``(id,
text)``, where {id} is an article number (as a string) and {text} is the text of
the requested header for that article. If the {file} parameter is supplied, then
the output of the ``XHDR`` command is stored in a file. If {file} is a string,
then the method will open a file object with that name, write to it then close
it. If {file} is a file object, then it will start calling write on it
to store the lines of the command output. If {file} is supplied, then the
returned {list} is an empty list.
NNTP.post(file)~
Post an article using the ``POST`` command. The {file} argument is an open file
object which is read until EOF using its readline method. It should be
a well-formed news article, including the required headers. The post
method automatically escapes lines beginning with ``.``.
NNTP.ihave(id, file)~
Send an ``IHAVE`` command. {id} is a message id (enclosed in ``'<'`` and
``'>'``). If the response is not an error, treat {file} exactly as for the
post method.
NNTP.date()~
Return a triple ``(response, date, time)``, containing the current date and time
in a form suitable for the newnews and newgroups methods. This
is an optional NNTP extension, and may not be supported by all servers.
NNTP.xgtitle(name, [file])~
Process an ``XGTITLE`` command, returning a pair ``(response, list)``, where
{list} is a list of tuples containing ``(name, title)``. If the {file} parameter
is supplied, then the output of the ``XGTITLE`` command is stored in a file.
If {file} is a string, then the method will open a file object with that name,
write to it then close it. If {file} is a file object, then it will start
calling write on it to store the lines of the command output. If {file}
is supplied, then the returned {list} is an empty list. This is an optional NNTP
extension, and may not be supported by all servers.
RFC 2980 says "It is suggested that this extension be deprecated". Use
descriptions or description instead.
NNTP.xover(start, end, [file])~
Return a pair ``(resp, list)``. {list} is a list of tuples, one for each
article in the range delimited by the {start} and {end} article numbers. Each
tuple is of the form ``(article number, subject, poster, date, id, references,
size, lines)``. If the {file} parameter is supplied, then the output of the
``XOVER`` command is stored in a file. If {file} is a string, then the method
will open a file object with that name, write to it then close it. If {file}
is a file object, then it will start calling write on it to store the
lines of the command output. If {file} is supplied, then the returned {list} is
an empty list. This is an optional NNTP extension, and may not be supported by
all servers.
NNTP.xpath(id)~
Return a pair ``(resp, path)``, where {path} is the directory path to the
article with message ID {id}. This is an optional NNTP extension, and may not
be supported by all servers.
NNTP.quit()~
Send a ``QUIT`` command and close the connection. Once this method has been
called, no other methods of the NNTP object should be called.
==============================================================================
*py2stdlib-numbers*
numbers~
:synopsis: Numeric abstract base classes (Complex, Real, Integral, etc.).
New in version 2.6.
The numbers (|py2stdlib-numbers|) module (PEP 3141) defines a hierarchy of numeric abstract
base classes which progressively define more operations. None of the types
defined in this module can be instantiated.
Number~
The root of the numeric hierarchy. If you just want to check if an argument
{x} is a number, without caring what kind, use ``isinstance(x, Number)``.
The numeric tower
-----------------
Complex~
Subclasses of this type describe complex numbers and include the operations
that work on the built-in complex type. These are: conversions to
complex and bool, .real, .imag, ``+``,
``-``, ``*``, ``/``, abs, conjugate, ``==``, and ``!=``. All
except ``-`` and ``!=`` are abstract.
real~
Abstract. Retrieves the real component of this number.
imag~
Abstract. Retrieves the imaginary component of this number.
conjugate()~
Abstract. Returns the complex conjugate. For example, ``(1+3j).conjugate()
== (1-3j)``.
Real~
To Complex, Real adds the operations that work on real
numbers.
In short, those are: a conversion to float, trunc,
round, math.floor, math.ceil, divmod, ``//``,
``%``, ``<``, ``<=``, ``>``, and ``>=``.
Real also provides defaults for complex, Complex.real,
Complex.imag, and Complex.conjugate.
Rational~
Subtypes Real and adds
Rational.numerator and Rational.denominator properties, which
should be in lowest terms. With these, it provides a default for
float.
numerator~
Abstract.
denominator~
Abstract.
Integral~
Subtypes Rational and adds a conversion to int.
Provides defaults for float, Rational.numerator, and
Rational.denominator, and bit-string operations: ``<<``,
``>>``, ``&``, ``^``, ``|``, ``~``.
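A short sketch of how the built-in and standard library number types sit in
this tower. The helper ``tower_level`` is invented for the example; it reports
the most specific of the ABCs above that a value belongs to.

```python
from decimal import Decimal
from fractions import Fraction
from numbers import Complex, Integral, Number, Rational, Real


def tower_level(x):
    # Test from the most specific ABC to the most general one.
    for abc in (Integral, Rational, Real, Complex, Number):
        if isinstance(x, abc):
            return abc.__name__
    return None


levels = [tower_level(v) for v in (3, Fraction(1, 2), 0.5, 1j, Decimal('1.5'))]
```

Note that ``Decimal`` is registered as a ``Number`` only, so it matches none
of the more specific ABCs.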
Notes for type implementors
---------------------------
Implementors should be careful to make equal numbers equal and hash
them to the same values. This may be subtle if there are two different
extensions of the real numbers. For example, fractions.Fraction
implements hash as follows:: >
    def __hash__(self):
        if self.denominator == 1:
            # Get integers right.
            return hash(self.numerator)
        # Expensive check, but definitely correct.
        if self == float(self):
            return hash(float(self))
        else:
            # Use tuple's hash to avoid a high collision rate on
            # simple fractions.
            return hash((self.numerator, self.denominator))
<
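The effect of this rule can be checked directly: values that compare equal
across numeric types hash alike, so they are interchangeable as dictionary
keys. The sample values below are chosen to be exactly representable as
floats.

```python
from fractions import Fraction

pairs = [
    (Fraction(3, 1), 3),
    (Fraction(1, 2), 0.5),
    (Fraction(-7, 4), -1.75),
]
for frac, other in pairs:
    # Equal numbers must hash to the same value.
    assert frac == other and hash(frac) == hash(other)

# A float key can therefore be looked up with the equal Fraction.
lookup = {0.5: 'half'}
value = lookup[Fraction(1, 2)]
```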
Adding More Numeric ABCs
There are, of course, more possible ABCs for numbers, and this would
be a poor hierarchy if it precluded the possibility of adding
those. You can add ``MyFoo`` between Complex and
Real with:: >
    class MyFoo(Complex): ...
    MyFoo.register(Real)
<
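A runnable sketch of the same registration, with checks showing where the new
ABC ends up in the hierarchy. ``MyFoo`` is left empty here; a real ABC would
declare its own operations.

```python
from numbers import Complex, Real


class MyFoo(Complex):
    # Abstract placeholder; it is never instantiated in this example.
    pass


# Make every Real (and so every float and int) count as a MyFoo.
MyFoo.register(Real)

checks = (
    issubclass(MyFoo, Complex),   # MyFoo sits below Complex ...
    issubclass(Real, MyFoo),      # ... and above Real
    isinstance(1.5, MyFoo),       # floats are Reals, hence MyFoos
    isinstance(1j, MyFoo),        # complex numbers are not
)
```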
Implementing the arithmetic operations
We want to implement the arithmetic operations so that mixed-mode
operations either call an implementation whose author knew about the
types of both arguments, or convert both to the nearest built-in type
and do the operation there. For subtypes of Integral, this
means that __add__ and __radd__ should be defined as:: >
    class MyIntegral(Integral):

        def __add__(self, other):
            if isinstance(other, MyIntegral):
                return do_my_adding_stuff(self, other)
            elif isinstance(other, OtherTypeIKnowAbout):
                return do_my_other_adding_stuff(self, other)
            else:
                return NotImplemented

        def __radd__(self, other):
            if isinstance(other, MyIntegral):
                return do_my_adding_stuff(other, self)
            elif isinstance(other, OtherTypeIKnowAbout):
                return do_my_other_adding_stuff(other, self)
            elif isinstance(other, Integral):
                return int(other) + int(self)
            elif isinstance(other, Real):
                return float(other) + float(self)
            elif isinstance(other, Complex):
                return complex(other) + complex(self)
            else:
                return NotImplemented
<
There are 5 different cases for a mixed-type operation on subclasses
of Complex. I'll refer to all of the above code that doesn't
refer to ``MyIntegral`` and ``OtherTypeIKnowAbout`` as
"boilerplate". ``a`` will be an instance of ``A``, which is a subtype
of Complex (``a : A <: Complex``), and ``b : B <:
Complex``. I'll consider ``a + b``:
1. If ``A`` defines an __add__ which accepts ``b``, all is
well.
2. If ``A`` falls back to the boilerplate code, and it were to
return a value from __add__, we'd miss the possibility
that ``B`` defines a more intelligent __radd__, so the
boilerplate should return NotImplemented from
__add__. (Or ``A`` may not implement __add__ at
all.)
3. Then ``B``'s __radd__ gets a chance. If it accepts
``a``, all is well.
4. If it falls back to the boilerplate, there are no more possible
methods to try, so this is where the default implementation
should live.
5. If ``B <: A``, Python tries ``B.__radd__`` before
``A.__add__``. This is ok, because it was implemented with
knowledge of ``A``, so it can handle those instances before
delegating to Complex.
If ``A <: Complex`` and ``B <: Real`` without sharing any other knowledge,
then the appropriate shared operation is the one involving the built
in complex, and both __radd__ s land there, so ``a+b
== b+a``.
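Cases 2 and 3 can be observed with two deliberately tiny classes. This is only
a sketch of the dispatch order: ``Left`` and ``Right`` are hypothetical names,
and neither class is a real numeric type:

```python
class Left(object):
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Left):
            return Left(self.value + other.value)
        # Unknown type: decline, so that the right operand gets a chance.
        return NotImplemented


class Right(object):
    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        if isinstance(other, Left):
            return Right(other.value + self.value)
        return NotImplemented


# Left.__add__ declines, so Python falls through to Right.__radd__.
total = Left(1) + Right(2)

# When both sides decline (str has no __radd__), Python raises TypeError.
try:
    Left(1) + "text"
    unsupported_raises = False
except TypeError:
    unsupported_raises = True
```

Had ``Left.__add__`` returned a value or raised an exception itself instead of
returning NotImplemented, ``Right.__radd__`` would never have been consulted.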
Because most of the operations on any given type will be very similar,
it can be useful to define a helper function which generates the
forward and reverse instances of any given operator. For example,
fractions.Fraction uses:: >
def _operator_fallbacks(monomorphic_operator, fallback_operator):
    def forward(a, b):
        if isinstance(b, (int, long, Fraction)):
            return monomorphic_operator(a, b)
        elif isinstance(b, float):
            return fallback_operator(float(a), b)
        elif isinstance(b, complex):
            return fallback_operator(complex(a), b)
        else:
            return NotImplemented
    forward.__name__ = '__' + fallback_operator.__name__ + '__'
    forward.__doc__ = monomorphic_operator.__doc__
    def reverse(b, a):
        if isinstance(a, Rational):
            # Includes ints.
            return monomorphic_operator(a, b)
        elif isinstance(a, numbers.Real):
            return fallback_operator(float(a), float(b))
        elif isinstance(a, numbers.Complex):
            return fallback_operator(complex(a), complex(b))
        else:
            return NotImplemented
    reverse.__name__ = '__r' + fallback_operator.__name__ + '__'
    reverse.__doc__ = monomorphic_operator.__doc__
    return forward, reverse
def _add(a, b):
    """a + b"""
    return Fraction(a.numerator * b.denominator +
                    b.numerator * a.denominator,
                    a.denominator * b.denominator)
__add__, __radd__ = _operator_fallbacks(_add, operator.add)
# ...
==============================================================================
*py2stdlib-nav*
Nav~
:platform: Mac
:synopsis: Interface to Navigation Services.
:deprecated:
A low-level interface to Navigation Services.
2.6~
PixMapWrapper (|py2stdlib-pixmapwrapper|) --- Wrapper for PixMap objects
---------------------------------------------------
==============================================================================
*py2stdlib-operator*
operator~
:synopsis: Functions corresponding to the standard operators.
The operator (|py2stdlib-operator|) module exports a set of functions implemented in C
corresponding to the intrinsic operators of Python. For example,
``operator.add(x, y)`` is equivalent to the expression ``x+y``. The function
names are those used for special class methods; variants without leading and
trailing ``__`` are also provided for convenience.
The functions fall into categories that perform object comparisons, logical
operations, mathematical operations, sequence operations, and abstract type
tests.
The object comparison functions are useful for all objects, and are named after
the rich comparison operators they support:
lt(a, b)~
le(a, b)
eq(a, b)
ne(a, b)
ge(a, b)
gt(a, b)
__lt__(a, b)
__le__(a, b)
__eq__(a, b)
__ne__(a, b)
__ge__(a, b)
__gt__(a, b)
Perform "rich comparisons" between {a} and {b}. Specifically, ``lt(a, b)`` is
equivalent to ``a < b``, ``le(a, b)`` is equivalent to ``a <= b``, ``eq(a,
b)`` is equivalent to ``a == b``, ``ne(a, b)`` is equivalent to ``a != b``,
``gt(a, b)`` is equivalent to ``a > b`` and ``ge(a, b)`` is equivalent to ``a
>= b``. Note that unlike the built-in cmp, these functions can
return any value, which may or may not be interpretable as a Boolean value.
See comparisons for more information about rich comparisons.
.. versionadded:: 2.2
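A short sketch of the comparison functions in use; ``compare_all`` is a
hypothetical helper introduced only for this illustration:

```python
import operator

smaller = operator.lt(1, 2)          # 1 < 2
equal = operator.eq("a", "a")        # "a" == "a"
different = operator.ne([1], [2])    # [1] != [2]


def compare_all(op, pairs):
    """Apply the comparison function op to every (a, b) pair."""
    return [op(a, b) for a, b in pairs]


results = compare_all(operator.ge, [(3, 2), (2, 2), (1, 2)])
```

Passing ``operator.ge`` avoids writing a one-line ``lambda a, b: a >= b``.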
The logical operations are also generally applicable to all objects, and support
truth tests, identity tests, and boolean operations:
not_(obj)~
__not__(obj)
Return the outcome of not {obj}. (Note that there is no
__not__ method for object instances; only the interpreter core defines
this operation. The result is affected by the __nonzero__ and
__len__ methods.)
truth(obj)~
Return True if {obj} is true, and False otherwise. This is
equivalent to using the bool constructor.
is_(a, b)~
Return ``a is b``. Tests object identity.
.. versionadded:: 2.3
is_not(a, b)~
Return ``a is not b``. Tests object identity.
.. versionadded:: 2.3
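The difference between the truth tests and the identity tests is easy to see
on small lists. A minimal sketch with illustrative variable names:

```python
import operator

empty_is_false = operator.not_([])       # same as: not []
nonempty_is_true = operator.truth([0])   # same as: bool([0])

a = [1, 2]
b = a          # another name for the same list object
c = [1, 2]     # an equal but distinct list

same_object = operator.is_(a, b)
equal_but_distinct = operator.eq(a, c) and operator.is_not(a, c)
```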
The mathematical and bitwise operations are the most numerous:
abs(obj)~
__abs__(obj)
Return the absolute value of {obj}.
add(a, b)~
__add__(a, b)
Return ``a + b``, for {a} and {b} numbers.
and_(a, b)~
__and__(a, b)
Return the bitwise and of {a} and {b}.
div(a, b)~
__div__(a, b)
Return ``a / b`` when ``__future__.division`` is not in effect. This is
also known as "classic" division.
floordiv(a, b)~
__floordiv__(a, b)
Return ``a // b``.
.. versionadded:: 2.2
index(a)~
__index__(a)
Return {a} converted to an integer. Equivalent to ``a.__index__()``.
.. versionadded:: 2.5
inv(obj)~
invert(obj)
__inv__(obj)
__invert__(obj)
Return the bitwise inverse of the number {obj}. This is equivalent to ``~obj``.
.. versionadded:: 2.0
The names invert and __invert__.
lshift(a, b)~
__lshift__(a, b)
Return {a} shifted left by {b}.
mod(a, b)~
__mod__(a, b)
Return ``a % b``.
mul(a, b)~
__mul__(a, b)
Return ``a * b``, for {a} and {b} numbers.
neg(obj)~
__neg__(obj)
Return {obj} negated (``-obj``).
or_(a, b)~
__or__(a, b)
Return the bitwise or of {a} and {b}.
pos(obj)~
__pos__(obj)
Return {obj} positive (``+obj``).
pow(a, b)~
__pow__(a, b)
Return ``a ** b``, for {a} and {b} numbers.
.. versionadded:: 2.3
rshift(a, b)~
__rshift__(a, b)
Return {a} shifted right by {b}.
sub(a, b)~
__sub__(a, b)
Return ``a - b``.
truediv(a, b)~
__truediv__(a, b)
Return ``a / b`` when ``__future__.division`` is in effect. This is also
known as "true" division.
.. versionadded:: 2.2
xor(a, b)~
__xor__(a, b)
Return the bitwise exclusive or of {a} and {b}.
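The following sketch exercises a handful of these functions. It deliberately
avoids ``div``, whose result depends on the operand types, and uses only
functions listed above:

```python
import operator

total = operator.add(2, 3)               # 2 + 3
quotient = operator.floordiv(7, 2)       # 7 // 2
exact = operator.truediv(7, 2)           # 7 / 2 under true division
remainder = operator.mod(7, 2)           # 7 % 2
power = operator.pow(2, 10)              # 2 ** 10
negated = operator.neg(5)                # -5
masked = operator.and_(0b1100, 0b1010)   # 0b1100 & 0b1010
toggled = operator.xor(0b1100, 0b1010)   # 0b1100 ^ 0b1010
shifted = operator.lshift(1, 4)          # 1 << 4
inverted = operator.invert(0)            # ~0
```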
Operations which work with sequences (some of them with mappings too) include:
concat(a, b)~
__concat__(a, b)
Return ``a + b`` for {a} and {b} sequences.
contains(a, b)~
__contains__(a, b)
Return the outcome of the test ``b in a``. Note the reversed operands.
.. versionadded:: 2.0
The name __contains__.
countOf(a, b)~
Return the number of occurrences of {b} in {a}.
delitem(a, b)~
__delitem__(a, b)
Remove the value of {a} at index {b}.
delslice(a, b, c)~
__delslice__(a, b, c)
Delete the slice of {a} from index {b} to index {c-1}.
2.6~
This function is removed in Python 3.x. Use delitem with a slice
index.
getitem(a, b)~
__getitem__(a, b)
Return the value of {a} at index {b}.
getslice(a, b, c)~
__getslice__(a, b, c)
Return the slice of {a} from index {b} to index {c-1}.
2.6~
This function is removed in Python 3.x. Use getitem with a slice
index.
indexOf(a, b)~
Return the index of the first occurrence of {b} in {a}.
repeat(a, b)~
__repeat__(a, b)
2.7~
Use __mul__ instead.
Return ``a * b`` where {a} is a sequence and {b} is an integer.
sequenceIncludes(...)~
2.0~
Use contains instead.
Alias for contains.
setitem(a, b, c)~
__setitem__(a, b, c)
Set the value of {a} at index {b} to {c}.
setslice(a, b, c, v)~
__setslice__(a, b, c, v)
Set the slice of {a} from index {b} to index {c-1} to the sequence {v}.
2.6~
This function is removed in Python 3.x. Use setitem with a slice
index.
Example use of operator functions:: >
>>> # Elementwise multiplication
>>> map(mul, [0, 1, 2, 3], [10, 20, 30, 40])
[0, 20, 60, 120]
>>> # Dot product
>>> sum(map(mul, [0, 1, 2, 3], [10, 20, 30, 40]))
200
<
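The sequence functions can be sketched in the same way. The example below also
shows ``getitem``, ``setitem`` and ``delitem`` used with slice objects, which
is the replacement suggested above for the deprecated slice functions:

```python
import operator

letters = ["a", "b", "c", "b"]

joined = operator.concat([1, 2], [3])        # [1, 2] + [3]
has_b = operator.contains(letters, "b")      # "b" in letters
how_many = operator.countOf(letters, "b")    # occurrences of "b"
first_b = operator.indexOf(letters, "b")     # index of the first "b"

# Slice objects stand in for getslice, setslice and delslice.
middle = operator.getitem(letters, slice(1, 3))        # letters[1:3]
operator.setitem(letters, slice(0, 1), ["x", "y"])     # letters[0:1] = [...]
operator.delitem(letters, -1)                          # del letters[-1]
```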
Many operations have an "in-place" version. The following functions provide a
more primitive access to in-place operators than the usual syntax does; for
example, the statement ``x += y`` is equivalent to
``x = operator.iadd(x, y)``. Another way to put it is to say that
``z = operator.iadd(x, y)`` is equivalent to the compound statement
``z = x; z += y``.
iadd(a, b)~
__iadd__(a, b)
``a = iadd(a, b)`` is equivalent to ``a += b``.
.. versionadded:: 2.5
iand(a, b)~
__iand__(a, b)
``a = iand(a, b)`` is equivalent to ``a &= b``.
.. versionadded:: 2.5
iconcat(a, b)~
__iconcat__(a, b)
``a = iconcat(a, b)`` is equivalent to ``a += b`` for {a} and {b} sequences.
.. versionadded:: 2.5
idiv(a, b)~
__idiv__(a, b)
``a = idiv(a, b)`` is equivalent to ``a /= b`` when ``__future__.division`` is
not in effect.
.. versionadded:: 2.5
ifloordiv(a, b)~
__ifloordiv__(a, b)
``a = ifloordiv(a, b)`` is equivalent to ``a //= b``.
.. versionadded:: 2.5
ilshift(a, b)~
__ilshift__(a, b)
``a = ilshift(a, b)`` is equivalent to ``a <<= b``.
.. versionadded:: 2.5
imod(a, b)~
__imod__(a, b)
``a = imod(a, b)`` is equivalent to ``a %= b``.
.. versionadded:: 2.5
imul(a, b)~
__imul__(a, b)
``a = imul(a, b)`` is equivalent to ``a *= b``.
.. versionadded:: 2.5
ior(a, b)~
__ior__(a, b)
``a = ior(a, b)`` is equivalent to ``a |= b``.
.. versionadded:: 2.5
ipow(a, b)~
__ipow__(a, b)
``a = ipow(a, b)`` is equivalent to ``a **= b``.
.. versionadded:: 2.5
irepeat(a, b)~
__irepeat__(a, b)
2.7~
Use __imul__ instead.
``a = irepeat(a, b)`` is equivalent to ``a *= b`` where {a} is a sequence and
{b} is an integer.
.. versionadded:: 2.5
irshift(a, b)~
__irshift__(a, b)
``a = irshift(a, b)`` is equivalent to ``a >>= b``.
.. versionadded:: 2.5
isub(a, b)~
__isub__(a, b)
``a = isub(a, b)`` is equivalent to ``a -= b``.
.. versionadded:: 2.5
itruediv(a, b)~
__itruediv__(a, b)
``a = itruediv(a, b)`` is equivalent to ``a /= b`` when ``__future__.division``
is in effect.
.. versionadded:: 2.5
ixor(a, b)~
__ixor__(a, b)
``a = ixor(a, b)`` is equivalent to ``a ^= b``.
.. versionadded:: 2.5
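Whether the first operand is modified depends on its type, which is why the
result of an in-place function should always be assigned. A minimal sketch
with illustrative variable names:

```python
import operator

# Mutable operand: the list is extended in place and returned.
items = [1, 2]
result = operator.iadd(items, [3])
list_mutated = result is items

# Immutable operand: a new tuple is returned and the original is untouched.
point = (1, 2)
extended = operator.iadd(point, (3,))
```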
The operator (|py2stdlib-operator|) module also defines a few predicates to test the type of
objects; however, these are not all reliable. It is preferable to test
abstract base classes instead (see collections (|py2stdlib-collections|) and
numbers (|py2stdlib-numbers|) for details).
isCallable(obj)~
2.0~
Use ``isinstance(x, collections.Callable)`` instead.
Returns true if the object {obj} can be called like a function, otherwise it
returns false. True is returned for functions, bound and unbound methods, class
objects, and instance objects which support the __call__ method.
isMappingType(obj)~
2.7~
Use ``isinstance(x, collections.Mapping)`` instead.
Returns true if the object {obj} supports the mapping interface. This is true for
dictionaries and all instance objects defining __getitem__.
isNumberType(obj)~
2.7~
Use ``isinstance(x, numbers.Number)`` instead.
Returns true if the object {obj} represents a number. This is true for all
numeric types implemented in C.
isSequenceType(obj)~
2.7~
Use ``isinstance(x, collections.Sequence)`` instead.
Returns true if the object {obj} supports the sequence protocol. This returns true
for all objects which define sequence methods in C, and for all instance objects
defining __getitem__.
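The abstract base class tests recommended above look like this in practice.
The import fallback is an assumption made so that the sketch also runs on
Python 3, where the container ABCs live in ``collections.abc``:

```python
import numbers

try:                                    # Python 3.3 and later
    from collections.abc import Callable, Mapping, Sequence
except ImportError:                     # Python 2.6 and 2.7
    from collections import Callable, Mapping, Sequence

is_callable = isinstance(len, Callable)
is_mapping = isinstance({}, Mapping)
is_number = isinstance(3.5, numbers.Number)
is_sequence = isinstance("abc", Sequence)

# A dictionary is a mapping, not a sequence.
dict_is_sequence = isinstance({}, Sequence)
```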
The operator (|py2stdlib-operator|) module also defines tools for generalized attribute and item
lookups. These are useful for making fast field extractors as arguments for
map, sorted, itertools.groupby, or other functions that
expect a function argument.
attrgetter(attr[, args...])~
Return a callable object that fetches {attr} from its operand. If more than one
attribute is requested, returns a tuple of attributes. After
``f = attrgetter('name')``, the call ``f(b)`` returns ``b.name``. After
``f = attrgetter('name', 'date')``, the call ``f(b)`` returns ``(b.name,
b.date)``.
The attribute names can also contain dots; after ``f = attrgetter('date.month')``,
the call ``f(b)`` returns ``b.date.month``.
.. versionadded:: 2.4
.. versionchanged:: 2.5
Added support for multiple attributes.
.. versionchanged:: 2.6
Added support for dotted attributes.
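A brief sketch of attrgetter as a sort key. The ``Book`` and ``Date`` record
types are hypothetical and exist only to give the getter something to fetch:

```python
from collections import namedtuple
from operator import attrgetter

Date = namedtuple("Date", "year month")
Book = namedtuple("Book", "name date")

books = [
    Book("beta", Date(2009, 5)),
    Book("alpha", Date(2010, 1)),
    Book("gamma", Date(2008, 12)),
]

names = [attrgetter("name")(book) for book in books]
pair = attrgetter("name", "date.year")(books[0])

# Dotted names reach into nested objects.
by_month = sorted(books, key=attrgetter("date.month"))
oldest = min(books, key=attrgetter("date.year", "date.month"))
```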
itemgetter(item[, args...])~
Return a callable object that fetches {item} from its operand using the
operand's __getitem__ method. If multiple items are specified,
returns a tuple of lookup values. Equivalent to:: >
def itemgetter(*items):
    if len(items) == 1:
        item = items[0]
        def g(obj):
            return obj[item]
    else:
        def g(obj):
            return tuple(obj[item] for item in items)
    return g
<
The items can be any type accepted by the operand's __getitem__
method. Dictionaries accept any hashable value. Lists, tuples, and
strings accept an index or a slice:
>>> itemgetter(1)('ABCDEFG')
'B'
>>> itemgetter(1,3,5)('ABCDEFG')
('B', 'D', 'F')
>>> itemgetter(slice(2,None))('ABCDEFG')
'CDEFG'
.. versionadded:: 2.4
.. versionchanged:: 2.5
Added support for multiple item extraction.
Example of using itemgetter to retrieve specific fields from a
tuple record:
>>> inventory = [('apple', 3), ('banana', 2), ('pear', 5), ('orange', 1)]
>>> getcount = itemgetter(1)
>>> map(getcount, inventory)
[3, 2, 5, 1]
>>> sorted(inventory, key=getcount)
[('orange', 1), ('banana', 2), ('apple', 3), ('pear', 5)]
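Because a multi-item getter returns a tuple, it can also serve as a compound
sort key. The sketch below applies this to dictionaries; the ``rows`` data is
made up for the illustration:

```python
from operator import itemgetter

rows = [
    {"fruit": "pear", "count": 5},
    {"fruit": "apple", "count": 3},
    {"fruit": "fig", "count": 3},
]

# Sort by count, breaking ties alphabetically by fruit name.
ordered = sorted(rows, key=itemgetter("count", "fruit"))
fruits = [itemgetter("fruit")(row) for row in ordered]
```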
methodcaller(name[, args...])~
Return a callable object that calls the method {name} on its operand. If
additional arguments and/or keyword arguments are given, they will be given
to the method as well. After ``f = methodcaller('name')``, the call ``f(b)``
returns ``b.name()``. After ``f = methodcaller('name', 'foo', bar=1)``, the
call ``f(b)`` returns ``b.name('foo', bar=1)``.
.. versionadded:: 2.6
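A short sketch of methodcaller with and without extra arguments, using
ordinary string methods; the names bound to the callers are illustrative:

```python
from operator import methodcaller

shout = methodcaller("upper")
split_once = methodcaller("split", ",", 1)
swap = methodcaller("replace", "a", "o")

upper = shout("abc")                         # "abc".upper()
parts = split_once("x,y,z")                  # "x,y,z".split(",", 1)
words = [swap(word) for word in ["cat", "bat"]]
```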
Mapping Operators to Functions
------------------------------
This table shows how abstract operations correspond to operator symbols in the
Python syntax and the functions in the operator (|py2stdlib-operator|) module.
+-----------------------+-------------------------+---------------------------------------+
| Operation | Syntax | Function |
+=======================+=========================+=======================================+
| Addition | ``a + b`` | ``add(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Concatenation | ``seq1 + seq2`` | ``concat(seq1, seq2)`` |
+-----------------------+-------------------------+---------------------------------------+
| Containment Test | ``obj in seq`` | ``contains(seq, obj)`` |
+-----------------------+-------------------------+---------------------------------------+
| Division | ``a / b`` | ``div(a, b)`` (without |
| | | ``__future__.division``) |
+-----------------------+-------------------------+---------------------------------------+
| Division | ``a / b`` | ``truediv(a, b)`` (with |
| | | ``__future__.division``) |
+-----------------------+-------------------------+---------------------------------------+
| Division | ``a // b`` | ``floordiv(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Bitwise And | ``a & b`` | ``and_(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Bitwise Exclusive Or | ``a ^ b`` | ``xor(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Bitwise Inversion | ``~ a`` | ``invert(a)`` |
+-----------------------+-------------------------+---------------------------------------+
| Bitwise Or | ``a | b`` | ``or_(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Exponentiation        | ``a ** b``              | ``pow(a, b)``                         |
+-----------------------+-------------------------+---------------------------------------+
| Identity | ``a is b`` | ``is_(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Identity | ``a is not b`` | ``is_not(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Indexed Assignment | ``obj[k] = v`` | ``setitem(obj, k, v)`` |
+-----------------------+-------------------------+---------------------------------------+
| Indexed Deletion | ``del obj[k]`` | ``delitem(obj, k)`` |
+-----------------------+-------------------------+---------------------------------------+
| Indexing | ``obj[k]`` | ``getitem(obj, k)`` |
+-----------------------+-------------------------+---------------------------------------+
| Left Shift | ``a << b`` | ``lshift(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Modulo | ``a % b`` | ``mod(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Multiplication | ``a * b`` | ``mul(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Negation (Arithmetic) | ``- a`` | ``neg(a)`` |
+-----------------------+-------------------------+---------------------------------------+
| Negation (Logical) | ``not a`` | ``not_(a)`` |
+-----------------------+-------------------------+---------------------------------------+
| Positive | ``+ a`` | ``pos(a)`` |
+-----------------------+-------------------------+---------------------------------------+
| Right Shift | ``a >> b`` | ``rshift(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Sequence Repetition | ``seq * i`` | ``repeat(seq, i)`` |
+-----------------------+-------------------------+---------------------------------------+
| Slice Assignment | ``seq[i:j] = values`` | ``setitem(seq, slice(i, j), values)`` |
+-----------------------+-------------------------+---------------------------------------+
| Slice Deletion | ``del seq[i:j]`` | ``delitem(seq, slice(i, j))`` |
+-----------------------+-------------------------+---------------------------------------+
| Slicing | ``seq[i:j]`` | ``getitem(seq, slice(i, j))`` |
+-----------------------+-------------------------+---------------------------------------+
| String Formatting | ``s % obj`` | ``mod(s, obj)`` |
+-----------------------+-------------------------+---------------------------------------+
| Subtraction | ``a - b`` | ``sub(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Truth Test | ``obj`` | ``truth(obj)`` |
+-----------------------+-------------------------+---------------------------------------+
| Ordering | ``a < b`` | ``lt(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Ordering | ``a <= b`` | ``le(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Equality | ``a == b`` | ``eq(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Difference | ``a != b`` | ``ne(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Ordering | ``a >= b`` | ``ge(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
| Ordering | ``a > b`` | ``gt(a, b)`` |
+-----------------------+-------------------------+---------------------------------------+
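One common use of this correspondence is a lookup table from operator symbols
to functions. The sketch below uses such a table in a small postfix evaluator;
``BINARY_OPS`` and ``eval_rpn`` are hypothetical names, not part of the module:

```python
import operator

# Hypothetical lookup table from operator symbols to functions.
BINARY_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "//": operator.floordiv,
    "%": operator.mod,
    "**": operator.pow,
}


def eval_rpn(tokens):
    """Evaluate a list of integers and operator symbols in postfix order."""
    stack = []
    for token in tokens:
        if token in BINARY_OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPS[token](left, right))
        else:
            stack.append(token)
    return stack.pop()


value = eval_rpn([2, 3, "+", 4, "*"])        # (2 + 3) * 4
wrapped = eval_rpn([2, 10, "**", 7, "%"])    # (2 ** 10) % 7
```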
==============================================================================
*py2stdlib-optparse*
optparse~
:synopsis: Command-line option parsing library.
:deprecated:
2.7~
The optparse (|py2stdlib-optparse|) module is deprecated and will not be developed further;
development will continue with the argparse (|py2stdlib-argparse|) module.
.. versionadded:: 2.3
optparse (|py2stdlib-optparse|) is a more convenient, flexible, and powerful library for parsing
command-line options than the old getopt (|py2stdlib-getopt|) module. optparse (|py2stdlib-optparse|) uses a
more declarative style of command-line parsing: you create an instance of
OptionParser, populate it with options, and parse the command
line. optparse (|py2stdlib-optparse|) allows users to specify options in the conventional
GNU/POSIX syntax, and additionally generates usage and help messages for you.
Here's an example of using optparse (|py2stdlib-optparse|) in a simple script:: >
from optparse import OptionParser
[...]
parser = OptionParser()
parser.add_option("-f", "--file", dest="filename",
help="write report to FILE", metavar="FILE")
parser.add_option("-q", "--quiet",
action="store_false", dest="verbose", default=True,
help="don't print status messages to stdout")
(options, args) = parser.parse_args()
<
With these few lines of code, users of your script can now do the "usual thing"
on the command-line, for example:: >
<yourscript> --file=outfile -q
<
As it parses the command line, optparse (|py2stdlib-optparse|) sets attributes of the
``options`` object returned by parse_args based on user-supplied
command-line values. When parse_args returns from parsing this command
line, ``options.filename`` will be ``"outfile"`` and ``options.verbose`` will be
``False``. optparse (|py2stdlib-optparse|) supports both long and short options, allows short
options to be merged together, and allows options to be associated with their
arguments in a variety of ways. Thus, the following command lines are all
equivalent to the above example:: >
<yourscript> -f outfile --quiet
<yourscript> --quiet --file outfile
<yourscript> -q -foutfile
<yourscript> -qfoutfile
<
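The equivalence can be verified by handing each variant to parse_args as an
explicit argument list. In this sketch ``build_parser`` is a hypothetical
helper that repeats the option definitions from the example above:

```python
from optparse import OptionParser


def build_parser():
    """Return a parser with the two options from the example script."""
    parser = OptionParser()
    parser.add_option("-f", "--file", dest="filename",
                      help="write report to FILE", metavar="FILE")
    parser.add_option("-q", "--quiet",
                      action="store_false", dest="verbose", default=True,
                      help="don't print status messages to stdout")
    return parser


command_lines = [
    ["--file=outfile", "-q"],
    ["-f", "outfile", "--quiet"],
    ["--quiet", "--file", "outfile"],
    ["-q", "-foutfile"],
    ["-qfoutfile"],
]

parsed = []
for argv in command_lines:
    options, args = build_parser().parse_args(argv)
    parsed.append((options.filename, options.verbose, args))
```

Every entry of ``parsed`` ends up with the same file name, the same verbosity
flag, and no positional arguments left over.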
Additionally, users can run one of:: >
<yourscript> -h
<yourscript> --help
<
and optparse (|py2stdlib-optparse|) will print out a brief summary of your script's options:: >
usage: <yourscript> [options]

options:
  -h, --help            show this help message and exit
  -f FILE, --file=FILE  write report to FILE
  -q, --quiet           don't print status messages to stdout
<
where the value of {yourscript} is determined at runtime (normally from
``sys.argv[0]``).
Background
----------
optparse (|py2stdlib-optparse|) was explicitly designed to encourage the creation of programs
with straightforward, conventional command-line interfaces. To that end, it
supports only the most common command-line syntax and semantics conventionally
used under Unix. If you are unfamiliar with these conventions, read this
section to acquaint yourself with them.
Terminology
^^^^^^^^^^^
argument
a string entered on the command-line, and passed by the shell to ``execl()``
or ``execv()``. In Python, arguments are elements of ``sys.argv[1:]``
(``sys.argv[0]`` is the name of the program being executed). Unix shells
also use the term "word".
It is occasionally desirable to substitute an argument list other than
``sys.argv[1:]``, so you should read "argument" as "an element of
``sys.argv[1:]``, or of some other list provided as a substitute for
``sys.argv[1:]``".
option
an argument used to supply extra information to guide or customize the
execution of a program. There are many different syntaxes for options; the
traditional Unix syntax is a hyphen ("-") followed by a single letter,
e.g. ``"-x"`` or ``"-F"``. Also, traditional Unix syntax allows multiple
options to be merged into a single argument, e.g. ``"-x -F"`` is equivalent
to ``"-xF"``. The GNU project introduced ``"--"`` followed by a series of
hyphen-separated words, e.g. ``"--file"`` or ``"--dry-run"``. These are the
only two option syntaxes provided by optparse (|py2stdlib-optparse|).
Some other option syntaxes that the world has seen include:
* a hyphen followed by a few letters, e.g. ``"-pf"`` (this is {not} the same
as multiple options merged into a single argument)
* a hyphen followed by a whole word, e.g. ``"-file"`` (this is technically
equivalent to the previous syntax, but they aren't usually seen in the same
program)
* a plus sign followed by a single letter, or a few letters, or a word, e.g.
``"+f"``, ``"+rgb"``
* a slash followed by a letter, or a few letters, or a word, e.g. ``"/f"``,
``"/file"``
These option syntaxes are not supported by optparse (|py2stdlib-optparse|), and they never
will be. This is deliberate: the first three are non-standard on any
environment, and the last only makes sense if you're exclusively targeting
VMS, MS-DOS, and/or Windows.
option argument
an argument that follows an option, is closely associated with that option,
and is consumed from the argument list when that option is. With
optparse (|py2stdlib-optparse|), option arguments may either be in a separate argument from
their option:: >
-f foo
--file foo
<
or included in the same argument:: >
-ffoo
--file=foo
<
Typically, a given option either takes an argument or it doesn't. Lots of
people want an "optional option arguments" feature, meaning that some options
will take an argument if they see it, and won't if they don't. This is
somewhat controversial, because it makes parsing ambiguous: if ``"-a"`` takes
an optional argument and ``"-b"`` is another option entirely, how do we
interpret ``"-ab"``? Because of this ambiguity, optparse (|py2stdlib-optparse|) does not
support this feature.
positional argument
something leftover in the argument list after options have been parsed, i.e.
after options and their arguments have been parsed and removed from the
argument list.
required option
an option that must be supplied on the command-line; note that the phrase
"required option" is self-contradictory in English. optparse (|py2stdlib-optparse|) doesn't
prevent you from implementing required options, but doesn't give you much
help at it either.
For example, consider this hypothetical command-line:: >
prog -v --report /tmp/report.txt foo bar
<
``"-v"`` and ``"--report"`` are both options. Assuming that --report
takes one argument, ``"/tmp/report.txt"`` is an option argument. ``"foo"`` and
``"bar"`` are positional arguments.
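The same breakdown can be reproduced with a parser. This sketch assumes that
``-v`` is a simple flag and that ``--report`` takes one string argument, as
the text supposes; the destination names are illustrative:

```python
from optparse import OptionParser

parser = OptionParser(prog="prog")
parser.add_option("-v", action="store_true", dest="verbose", default=False)
parser.add_option("--report", dest="report")

options, args = parser.parse_args(
    ["-v", "--report", "/tmp/report.txt", "foo", "bar"])
```

After parsing, the option argument is available as ``options.report`` and the
two positional arguments remain in ``args``.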
What are options for?
^^^^^^^^^^^^^^^^^^^^^
Options are used to provide extra information to tune or customize the execution
of a program. In case it wasn't clear, options are usually {optional}. A
program should be able to run just fine with no options whatsoever. (Pick a
random program from the Unix or GNU toolsets. Can it run without any options at
all and still make sense? The main exceptions are ``find``, ``tar``, and
``dd``\ ---all of which are mutant oddballs that have been rightly criticized
for their non-standard syntax and confusing interfaces.)
Lots of people want their programs to have "required options". Think about it.
If it's required, then it's {not optional}! If there is a piece of information
that your program absolutely requires in order to run successfully, that's what
positional arguments are for.
As an example of good command-line interface design, consider the humble ``cp``
utility, for copying files. It doesn't make much sense to try to copy files
without supplying a destination and at least one source. Hence, ``cp`` fails if
you run it with no arguments. However, it has a flexible, useful syntax that
does not require any options at all:: >
cp SOURCE DEST
cp SOURCE ... DEST-DIR
<
You can get pretty far with just that. Most ``cp`` implementations provide a
bunch of options to tweak exactly how the files are copied: you can preserve
mode and modification time, avoid following symlinks, ask before clobbering
existing files, etc. But none of this distracts from the core mission of
``cp``, which is to copy either one file to another, or several files to another
directory.
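An interface in this style might be sketched as follows. ``parse_cp`` is a
hypothetical helper, and the single ``-p`` option only stands in for the many
options of a real ``cp``; the required information arrives as positional
arguments and is checked after parsing:

```python
from optparse import OptionParser


def parse_cp(argv):
    """Parse a cp-like command line into (options, sources, dest)."""
    parser = OptionParser(usage="%prog [options] SOURCE... DEST")
    parser.add_option("-p", "--preserve", action="store_true",
                      dest="preserve", default=False,
                      help="preserve mode and modification time")
    options, args = parser.parse_args(argv)
    if len(args) < 2:
        # error() prints the usage message to stderr and exits with status 2.
        parser.error("need at least one SOURCE and a DEST")
    return options, args[:-1], args[-1]


options, sources, dest = parse_cp(["-p", "a.txt", "b.txt", "backup"])

# Too few positional arguments makes the parser exit.
try:
    parse_cp(["only-one"])
    exit_status = None
except SystemExit as exc:
    exit_status = exc.code
```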
What are positional arguments for?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Positional arguments are for those pieces of information that your program
absolutely, positively requires to run.
A good user interface should have as few absolute requirements as possible. If
your program requires 17 distinct pieces of information in order to run
successfully, it doesn't much matter {how} you get that information from the
user---most people will give up and walk away before they successfully run the
program. This applies whether the user interface is a command-line, a
configuration file, or a GUI: if you make that many demands on your users, most
of them will simply give up.
In short, try to minimize the amount of information that users are absolutely
required to supply---use sensible defaults whenever possible. Of course, you
also want to make your programs reasonably flexible. That's what options are
for. Again, it doesn't matter if they are entries in a config file, widgets in
the "Preferences" dialog of a GUI, or command-line options---the more options
you implement, the more flexible your program is, and the more complicated its
implementation becomes. Too much flexibility has drawbacks as well, of course;
too many options can overwhelm users and make your code much harder to maintain.
Tutorial
--------
While optparse (|py2stdlib-optparse|) is quite flexible and powerful, it's also straightforward
to use in most cases. This section covers the code patterns that are common to
any optparse (|py2stdlib-optparse|)\ -based program.
First, you need to import the OptionParser class; then, early in the main
program, create an OptionParser instance:: >
from optparse import OptionParser
[...]
parser = OptionParser()
<
Then you can start defining options. The basic syntax is:: >
parser.add_option(opt_str, ...,
                  attr=value, ...)
<
Each option has one or more option strings, such as ``"-f"`` or ``"--file"``,
and several option attributes that tell optparse (|py2stdlib-optparse|) what to expect and what
to do when it encounters that option on the command line.
Typically, each option will have one short option string and one long option
string, e.g.:: >
parser.add_option("-f", "--file", ...)
<
You're free to define as many short option strings and as many long option
strings as you like (including zero), as long as there is at least one option
string overall.
The option strings passed to add_option are effectively labels for the
option defined by that call. For brevity, we will frequently refer to
{encountering an option} on the command line; in reality, optparse (|py2stdlib-optparse|)
encounters {option strings} and looks up options from them.
Once all of your options are defined, instruct optparse (|py2stdlib-optparse|) to parse your
program's command line:: >
(options, args) = parser.parse_args()
<
(If you like, you can pass a custom argument list to parse_args, but
that's rarely necessary: by default it uses ``sys.argv[1:]``.)
parse_args returns two values:
* ``options``, an object containing values for all of your options---e.g. if
``"--file"`` takes a single string argument, then ``options.file`` will be the
filename supplied by the user, or ``None`` if the user did not supply that
option
* ``args``, the list of positional arguments leftover after parsing options
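Both return values can be inspected with an explicit argument list. A minimal
sketch, assuming a single ``--file`` option with default settings:

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("--file")     # the destination defaults to "file"

# The option is supplied, followed by one positional argument.
given, rest = parser.parse_args(["--file", "data.txt", "extra"])

# Nothing is supplied at all.
missing, nothing = parser.parse_args([])
```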
This tutorial section only covers the four most important option attributes:
Option.action, Option.type, Option.dest
(destination), and Option.help. Of these, Option.action is the
most fundamental.
Understanding option actions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Actions tell optparse (|py2stdlib-optparse|) what to do when it encounters an option on the
command line. There is a fixed set of actions hard-coded into optparse (|py2stdlib-optparse|);
adding new actions is an advanced topic covered in section
optparse-extending-optparse. Most actions tell optparse (|py2stdlib-optparse|) to store
a value in some variable---for example, take a string from the command line and
store it in an attribute of ``options``.
If you don't specify an option action, optparse (|py2stdlib-optparse|) defaults to ``store``.
The store action
^^^^^^^^^^^^^^^^
The most common option action is ``store``, which tells optparse (|py2stdlib-optparse|) to take
the next argument (or the remainder of the current argument), ensure that it is
of the correct type, and store it to your chosen destination.
For example:: >
parser.add_option("-f", "--file",
action="store", type="string", dest="filename")
<
Now let's make up a fake command line and ask optparse (|py2stdlib-optparse|) to parse it:: >
    args = ["-f", "foo.txt"]
    (options, args) = parser.parse_args(args)
<
When optparse (|py2stdlib-optparse|) sees the option string ``"-f"``, it consumes the next
argument, ``"foo.txt"``, and stores it in ``options.filename``. So, after this
call to parse_args, ``options.filename`` is ``"foo.txt"``.
Some other option types supported by optparse (|py2stdlib-optparse|) are ``int`` and ``float``.
Here's an option that expects an integer argument:: >
parser.add_option("-n", type="int", dest="num")
<
Note that this option has no long option string, which is perfectly acceptable.
Also, there's no explicit action, since the default is ``store``.
Let's parse another fake command-line. This time, we'll jam the option argument
right up against the option: since ``"-n42"`` (one argument) is equivalent to
``"-n 42"`` (two arguments), the code :: >
(options, args) = parser.parse_args(["-n42"])
print options.num
<
will print ``"42"``.
If you don't specify a type, optparse (|py2stdlib-optparse|) assumes ``string``. Combined with
the fact that the default action is ``store``, that means our first example can
be a lot shorter:: >
parser.add_option("-f", "--file", dest="filename")
<
If you don't supply a destination, optparse (|py2stdlib-optparse|) figures out a sensible
default from the option strings: if the first long option string is
``"--foo-bar"``, then the default destination is ``foo_bar``. If there are no
long option strings, optparse (|py2stdlib-optparse|) looks at the first short option string: the
default destination for ``"-f"`` is ``f``.
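The destination rules above can be checked with a short sketch; the option
strings ``-x``, ``--foo-bar`` and ``-f`` are arbitrary examples:

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("-x", "--foo-bar")  # first long option string: dest "foo_bar"
parser.add_option("-f")               # no long option string: dest "f"

(options, args) = parser.parse_args(["--foo-bar", "1", "-f", "2"])
print(options.foo_bar)
print(options.f)

# The derived destination is also visible on the Option object itself.
derived_dest = parser.get_option("-x").dest
```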
optparse (|py2stdlib-optparse|) also includes built-in ``long`` and ``complex`` types. Adding
types is covered in section optparse-extending-optparse.
Handling boolean (flag) options
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Flag options---set a variable to true or false when a particular option is seen
---are quite common. optparse (|py2stdlib-optparse|) supports them with two separate actions,
``store_true`` and ``store_false``. For example, you might have a ``verbose``
flag that is turned on with ``"-v"`` and off with ``"-q"``:: >
parser.add_option("-v", action="store_true", dest="verbose")
parser.add_option("-q", action="store_false", dest="verbose")
<
Here we have two different options with the same destination, which is perfectly
OK. (It just means you have to be a bit careful when setting default values---
see below.)
When optparse (|py2stdlib-optparse|) encounters ``"-v"`` on the command line, it sets
``options.verbose`` to ``True``; when it encounters ``"-q"``,
``options.verbose`` is set to ``False``.
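A small sketch of how the two flags interact; ``verbose_for`` is a hypothetical
helper introduced only to keep the example short:

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("-v", action="store_true", dest="verbose")
parser.add_option("-q", action="store_false", dest="verbose")

def verbose_for(argv):
    """Parse argv and return the resulting value of options.verbose."""
    (options, args) = parser.parse_args(argv)
    return options.verbose

print(verbose_for(["-v"]))        # True
print(verbose_for(["-q"]))        # False
print(verbose_for(["-v", "-q"]))  # False: options are processed left to right
print(verbose_for([]))            # None: no default was supplied
```

Because both options share one destination, whichever flag appears last on the
command line determines the final value.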
Other actions
^^^^^^^^^^^^^
Some other actions supported by optparse (|py2stdlib-optparse|) are:
``"store_const"``
store a constant value
``"append"``
append this option's argument to a list
``"count"``
increment a counter by one
``"callback"``
call a specified function
These are covered in the Reference Guide (section optparse-reference-guide)
and in section optparse-option-callbacks.
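For a quick impression of three of these actions, here is a hedged sketch; the
option names and destinations (``include_dirs``, ``verbosity``, ``mode``) are
invented for the example:

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("-I", action="append", dest="include_dirs")
parser.add_option("-v", action="count", dest="verbosity")
parser.add_option("--debug", action="store_const", const="debug", dest="mode")

# "-vv" is a cluster of two -v options; "-Ib" attaches the argument directly.
(options, args) = parser.parse_args(["-I", "a", "-Ib", "-vv", "-v", "--debug"])
print(options.include_dirs)  # each -I argument appended in order
print(options.verbosity)     # number of times -v was seen
print(options.mode)          # the constant stored by --debug
```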
Default values
^^^^^^^^^^^^^^
All of the above examples involve setting some variable (the "destination") when
certain command-line options are seen. What happens if those options are never
seen? Since we didn't supply any defaults, they are all set to ``None``. This
is usually fine, but sometimes you want more control. optparse (|py2stdlib-optparse|) lets you
supply a default value for each destination, which is assigned before the
command line is parsed.
First, consider the verbose/quiet example. If we want optparse (|py2stdlib-optparse|) to set
``verbose`` to ``True`` unless ``"-q"`` is seen, then we can do this:: >
parser.add_option("-v", action="store_true", dest="verbose", default=True)
parser.add_option("-q", action="store_false", dest="verbose")
<
Since default values apply to the {destination} rather than to any particular
option, and these two options happen to have the same destination, this is
exactly equivalent:: >
parser.add_option("-v", action="store_true", dest="verbose")
parser.add_option("-q", action="store_false", dest="verbose", default=True)
<
Consider this:: >
    parser.add_option("-v", action="store_true", dest="verbose", default=False)
    parser.add_option("-q", action="store_false", dest="verbose", default=True)
<
Again, the default value for ``verbose`` will be ``True``: the last default
value supplied for any particular destination is the one that counts.
A clearer way to specify default values is the set_defaults method of
OptionParser, which you can call at any time before calling parse_args:: >
parser.set_defaults(verbose=True)
parser.add_option(...)
(options, args) = parser.parse_args()
<
As before, the last value specified for a given option destination is the one
that counts. For clarity, try to use one method or the other of setting default
values, not both.
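The "last default wins" rule can be observed directly. This sketch mixes both
ways of setting defaults only to show the ordering, which is exactly what the
advice above suggests avoiding in real code:

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("-v", action="store_true", dest="verbose", default=False)
parser.add_option("-q", action="store_false", dest="verbose", default=True)

# The default given with -q was supplied last, so it is the one that counts.
(options, args) = parser.parse_args([])
after_add_option = options.verbose

# A later set_defaults call overrides it again.
parser.set_defaults(verbose=False)
(options, args) = parser.parse_args([])
after_set_defaults = options.verbose

# Defaults are only starting values; an option on the command line still wins.
(options, args) = parser.parse_args(["-v"])
with_flag = options.verbose
print(after_add_option, after_set_defaults, with_flag)
```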
Generating help
^^^^^^^^^^^^^^^
optparse (|py2stdlib-optparse|)'s ability to generate help and usage text automatically is
useful for creating user-friendly command-line interfaces. All you have to do
is supply an Option.help value for each option, and optionally a short
usage message for your whole program. Here's an OptionParser populated with
user-friendly (documented) options:: >
usage = "usage: %prog [options] arg1 arg2"
parser = OptionParser(usage=usage)
parser.add_option("-v", "--verbose",
action="store_true", dest="verbose", default=True,
help="make lots of noise [default]")
parser.add_option("-q", "--quiet",
action="store_false", dest="verbose",
help="be vewwy quiet (I'm hunting wabbits)")
parser.add_option("-f", "--filename",
metavar="FILE", help="write output to FILE")
parser.add_option("-m", "--mode",
default="intermediate",
help="interaction mode: novice, intermediate, "
"or expert [default: %default]")
<
If optparse (|py2stdlib-optparse|) encounters either ``"-h"`` or ``"--help"`` on the
command-line, or if you just call parser.print_help, it prints the
following to standard output: >
    usage: <yourscript> [options] arg1 arg2

    options:
      -h, --help            show this help message and exit
      -v, --verbose         make lots of noise [default]
      -q, --quiet           be vewwy quiet (I'm hunting wabbits)
      -f FILE, --filename=FILE
                            write output to FILE
      -m MODE, --mode=MODE  interaction mode: novice, intermediate, or
                            expert [default: intermediate]
<
(If the help output is triggered by a help option, optparse (|py2stdlib-optparse|) exits after
printing the help text.)
There's a lot going on here to help optparse (|py2stdlib-optparse|) generate the best possible
help message:
* the script defines its own usage message:: >
    usage = "usage: %prog [options] arg1 arg2"
<
optparse (|py2stdlib-optparse|) expands ``"%prog"`` in the usage string to the name of the
current program, i.e. ``os.path.basename(sys.argv[0])``. The expanded string
is then printed before the detailed option help.
If you don't supply a usage string, optparse (|py2stdlib-optparse|) uses a bland but sensible
default: ``"usage: %prog [options]"``, which is fine if your script doesn't
take any positional arguments.
* every option defines a help string, and doesn't worry about line-wrapping---
optparse (|py2stdlib-optparse|) takes care of wrapping lines and making the help output look
good.
* options that take a value indicate this fact in their automatically-generated
help message, e.g. for the "mode" option:: >
    -m MODE, --mode=MODE
<
Here, "MODE" is called the meta-variable: it stands for the argument that the
user is expected to supply to -m/--mode. By default,
optparse (|py2stdlib-optparse|) converts the destination variable name to uppercase and uses
that for the meta-variable. Sometimes, that's not what you want; for
example, the --filename option explicitly sets ``metavar="FILE"``,
resulting in this automatically-generated option description:: >
    -f FILE, --filename=FILE
<
This is important for more than just saving space, though: the manually
written help text uses the meta-variable "FILE" to clue the user in that
there's a connection between the semi-formal syntax "-f FILE" and the informal
semantic description "write output to FILE". This is a simple but effective
way to make your help text a lot clearer and more useful for end users.
New in version 2.4: options that have a default value can include ``%default``
in the help string; optparse (|py2stdlib-optparse|) replaces it with the str of the
option's default value. If an option has no default value (or the default value
is ``None``), ``%default`` expands to ``none``.
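The meta-variable and ``%default`` behaviour can be inspected without printing
anything by asking the parser for its help text as a string. In this sketch the
program name ``demo`` and the fixed formatter width of 80 are assumptions made
so that the wrapping does not depend on the terminal:

```python
from optparse import OptionParser, IndentedHelpFormatter

# A fixed width keeps the line wrapping independent of the terminal size.
parser = OptionParser(prog="demo", formatter=IndentedHelpFormatter(width=80))
parser.add_option("-m", "--mode", default="intermediate",
                  help="interaction mode [default: %default]")
parser.add_option("-f", "--filename", metavar="FILE",
                  help="write output to FILE [default: %default]")

# format_help returns the same text that print_help would write.
help_text = parser.format_help()
print(help_text)
```

The ``--mode`` option has no explicit ``metavar``, so its destination name is
uppercased to ``MODE``; ``--filename`` has no default, so ``%default`` expands
to ``none`` there.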
When dealing with many options, it is convenient to group these options for
better help output. An OptionParser can contain several option groups,
each of which can contain several options.
Continuing with the parser defined above, adding an OptionGroup to a
parser is easy:: >
group = OptionGroup(parser, "Dangerous Options",
"Caution: use these options at your own risk. "
"It is believed that some of them bite.")
group.add_option("-g", action="store_true", help="Group option.")
parser.add_option_group(group)
<
This would result in the following help output: >
    usage: <yourscript> [options] arg1 arg2

    options:
      -h, --help            show this help message and exit
      -v, --verbose         make lots of noise [default]
      -q, --quiet           be vewwy quiet (I'm hunting wabbits)
      -f FILE, --filename=FILE
                            write output to FILE
      -m MODE, --mode=MODE  interaction mode: novice, intermediate, or
                            expert [default: intermediate]

      Dangerous Options:
        Caution: use these options at your own risk.  It is believed that some
        of them bite.

        -g                  Group option.
<
Printing a version string
^^^^^^^^^^^^^^^^^^^^^^^^^
Similar to the brief usage string, optparse (|py2stdlib-optparse|) can also print a version
string for your program. You have to supply the string as the ``version``
argument to OptionParser:: >
parser = OptionParser(usage="%prog [-f] [-q]", version="%prog 1.0")
<
``"%prog"`` is expanded just like it is in ``usage``. Apart from that,
``version`` can contain anything you like. When you supply it, optparse (|py2stdlib-optparse|)
automatically adds a ``"--version"`` option to your parser. If it encounters
this option on the command line, it expands your ``version`` string (by
replacing ``"%prog"``), prints it to stdout, and exits.
For example, if your script is called ``/usr/bin/foo``:: >
$ /usr/bin/foo --version
foo 1.0
<
The following two methods can be used to print and get the ``version`` string:
OptionParser.print_version(file=None)~
Print the version message for the current program (``self.version``) to
{file} (default stdout). As with print_usage, any occurrence
of ``"%prog"`` in ``self.version`` is replaced with the name of the current
program. Does nothing if ``self.version`` is empty or undefined.
OptionParser.get_version()~
Same as print_version but returns the version string instead of
printing it.
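A brief sketch of the version machinery; ``prog="foo"`` is passed explicitly
here so that the result does not depend on how the script was started:

```python
from optparse import OptionParser

parser = OptionParser(prog="foo", version="%prog 1.0")

# get_version expands "%prog" and returns the string instead of printing it.
version_text = parser.get_version()
print(version_text)

# Supplying a version string makes optparse add a --version option.
has_version_option = parser.has_option("--version")

# Without a version string there is no such option and nothing to report.
plain = OptionParser(prog="foo")
plain_version = plain.get_version()
plain_has_version_option = plain.has_option("--version")
```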
How optparse (|py2stdlib-optparse|) handles errors
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
There are two broad classes of errors that optparse (|py2stdlib-optparse|) has to worry about:
programmer errors and user errors. Programmer errors are usually erroneous
calls to OptionParser.add_option, e.g. invalid option strings, unknown
option attributes, missing option attributes, etc. These are dealt with in the
usual way: raise an exception (either optparse.OptionError or
TypeError) and let the program crash.
Handling user errors is much more important, since they are guaranteed to happen
no matter how stable your code is. optparse (|py2stdlib-optparse|) can automatically detect
some user errors, such as bad option arguments (passing ``"-n 4x"`` where
-n takes an integer argument) and missing arguments (``"-n"`` at the end
of the command line, where -n takes an argument of any type). Also,
you can call OptionParser.error to signal an application-defined error
condition:: >
(options, args) = parser.parse_args()
[...]
if options.a and options.b:
parser.error("options -a and -b are mutually exclusive")
<
In either case, optparse (|py2stdlib-optparse|) handles the error the same way: it prints the
program's usage message and an error message to standard error and exits with
error status 2.
Consider the first example above, where the user passes ``"4x"`` to an option
that takes an integer:: >
$ /usr/bin/foo -n 4x
usage: foo [options]
foo: error: option -n: invalid integer value: '4x'
<
Or, where the user fails to pass a value at all:: >
    $ /usr/bin/foo -n
    usage: foo [options]
    foo: error: -n option requires an argument
<
optparse (|py2stdlib-optparse|)-generated error messages always take care to mention the
option involved in the error; be sure to do the same when calling
OptionParser.error from your application code.
If optparse (|py2stdlib-optparse|)'s default error-handling behaviour does not suit your needs,
you'll need to subclass OptionParser and override its OptionParser.exit
and/or OptionParser.error methods.
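One possible shape for such a subclass is sketched below. ``UsageError`` and
``RaisingOptionParser`` are hypothetical names invented for this example; the
only optparse behaviour relied on is that user errors are routed through the
``error`` method:

```python
from optparse import OptionParser

class UsageError(Exception):
    """Hypothetical exception type used only in this sketch."""

class RaisingOptionParser(OptionParser):
    """Hypothetical subclass: report user errors by raising, not exiting."""

    def error(self, msg):
        # The default implementation prints the usage message and exits
        # with status 2; here the message is handed to the caller instead.
        raise UsageError(msg)

parser = RaisingOptionParser(prog="foo")
parser.add_option("-n", type="int", dest="num")

message = None
try:
    parser.parse_args(["-n", "4x"])
except UsageError as exc:
    message = str(exc)
print(message)
```

This can be convenient in tests or in long-running programs that must not be
terminated by a malformed argument list.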
Putting it all together
^^^^^^^^^^^^^^^^^^^^^^^
Here's what optparse (|py2stdlib-optparse|)-based scripts usually look like:: >
from optparse import OptionParser
[...]
def main():
usage = "usage: %prog [options] arg"
parser = OptionParser(usage)
parser.add_option("-f", "--file", dest="filename",
help="read data from FILENAME")
parser.add_option("-v", "--verbose",
action="store_true", dest="verbose")
parser.add_option("-q", "--quiet",
action="store_false", dest="verbose")
[...]
(options, args) = parser.parse_args()
if len(args) != 1:
parser.error("incorrect number of arguments")
if options.verbose:
print "reading %s..." % options.filename
[...]
if __name__ == "__main__":
main()
<
Reference Guide
Creating the parser
^^^^^^^^^^^^^^^^^^^
The first step in using optparse (|py2stdlib-optparse|) is to create an OptionParser instance.
OptionParser(...)~
The OptionParser constructor has no required arguments, but a number of
optional keyword arguments. You should always pass them as keyword
arguments, i.e. do not rely on the order in which the arguments are declared.
``usage`` (default: ``"%prog [options]"``)
The usage summary to print when your program is run incorrectly or with a
help option. When optparse (|py2stdlib-optparse|) prints the usage string, it expands
``%prog`` to ``os.path.basename(sys.argv[0])`` (or to ``prog`` if you
passed that keyword argument). To suppress a usage message, pass the
special value optparse.SUPPRESS_USAGE.
``option_list`` (default: ``[]``)
A list of Option objects to populate the parser with. The options in
``option_list`` are added after any options in ``standard_option_list`` (a
class attribute that may be set by OptionParser subclasses), but before
any version or help options. Deprecated; use add_option after
creating the parser instead.
``option_class`` (default: optparse.Option)
Class to use when adding options to the parser in add_option.
``version`` (default: ``None``)
A version string to print when the user supplies a version option. If you
supply a true value for ``version``, optparse (|py2stdlib-optparse|) automatically adds a
version option with the single option string ``"--version"``. The
substring ``"%prog"`` is expanded the same as for ``usage``.
``conflict_handler`` (default: ``"error"``)
Specifies what to do when options with conflicting option strings are
added to the parser; see section
optparse-conflicts-between-options.
``description`` (default: ``None``)
A paragraph of text giving a brief overview of your program.
optparse (|py2stdlib-optparse|) reformats this paragraph to fit the current terminal width
and prints it when the user requests help (after ``usage``, but before the
list of options).
``formatter`` (default: a new IndentedHelpFormatter)
An instance of optparse.HelpFormatter that will be used for printing help
text. optparse (|py2stdlib-optparse|) provides two concrete classes for this purpose:
IndentedHelpFormatter and TitledHelpFormatter.
``add_help_option`` (default: ``True``)
If true, optparse (|py2stdlib-optparse|) will add a help option (with option strings ``"-h"``
and ``"--help"``) to the parser.
``prog``
The string to use when expanding ``"%prog"`` in ``usage`` and ``version``
instead of ``os.path.basename(sys.argv[0])``.
``epilog`` (default: ``None``)
A paragraph of help text to print after the option help.
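A hedged sketch of several of these keyword arguments working together; the
program name ``frob`` and all of the strings are invented for the example:

```python
from optparse import OptionParser

parser = OptionParser(
    prog="frob",                      # replaces os.path.basename(sys.argv[0])
    usage="%prog [options] FILE",
    version="%prog 0.1",              # also adds a --version option
    description="Frobnicate FILE.",
    epilog="Report bugs to the maintainers.",
)

usage_text = parser.get_usage()   # usage line with %prog expanded
help_text = parser.format_help()  # usage, description, options, epilog
print(help_text)
```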
Populating the parser
^^^^^^^^^^^^^^^^^^^^^
There are several ways to populate the parser with options. The preferred way
is by using OptionParser.add_option, as shown in section
optparse-tutorial. add_option can be called in one of two ways:
* pass it an Option instance (as returned by make_option)
* pass it any combination of positional and keyword arguments that are
acceptable to make_option (i.e., to the Option constructor), and it
will create the Option instance for you
The other alternative is to pass a list of pre-constructed Option instances to
the OptionParser constructor, as in:: >
option_list = [
make_option("-f", "--filename",
action="store", type="string", dest="filename"),
make_option("-q", "--quiet",
action="store_false", dest="verbose"),
]
parser = OptionParser(option_list=option_list)
<
(make_option is a factory function for creating Option instances;
currently it is an alias for the Option constructor. A future version of
optparse (|py2stdlib-optparse|) may split Option into several classes, and make_option
will pick the right class to instantiate. Do not instantiate Option directly.)
Defining options
^^^^^^^^^^^^^^^^
Each Option instance represents a set of synonymous command-line option strings,
e.g. -f and --file. You can specify any number of short or
long option strings, but you must specify at least one overall option string.
The canonical way to create an Option instance is with the
add_option method of OptionParser.
OptionParser.add_option(opt_str[, ...], attr=value, ...)~
To define an option with only a short option string:: >
parser.add_option("-f", attr=value, ...)
<
And to define an option with only a long option string:: >
    parser.add_option("--foo", attr=value, ...)
<
The keyword arguments define attributes of the new Option object. The most
important option attribute is Option.action, and it largely
determines which other attributes are relevant or required. If you pass
irrelevant option attributes, or fail to pass required ones, optparse (|py2stdlib-optparse|)
raises an OptionError exception explaining your mistake.
An option's {action} determines what optparse (|py2stdlib-optparse|) does when it encounters
this option on the command-line. The standard option actions hard-coded into
optparse (|py2stdlib-optparse|) are:
``"store"``
store this option's argument (default)
``"store_const"``
store a constant value
``"store_true"``
store a true value
``"store_false"``
store a false value
``"append"``
append this option's argument to a list
``"append_const"``
append a constant value to a list
``"count"``
increment a counter by one
``"callback"``
call a specified function
``"help"``
print a usage message including all options and the documentation for them
(If you don't supply an action, the default is ``"store"``. For this action,
you may also supply Option.type and Option.dest option
attributes; see optparse-standard-option-actions.)
As you can see, most actions involve storing or updating a value somewhere.
optparse (|py2stdlib-optparse|) always creates a special object for this, conventionally called
``options`` (it happens to be an instance of optparse.Values). Option
arguments (and various other values) are stored as attributes of this object,
according to the Option.dest (destination) option attribute.
For example, when you call :: >
parser.parse_args()
<
one of the first things optparse (|py2stdlib-optparse|) does is create the ``options`` object:: >
    options = Values()
<
If one of the options in this parser is defined with :: >
parser.add_option("-f", "--file", action="store", type="string", dest="filename")
<
and the command-line being parsed includes any of the following:: >
    -ffoo
    -f foo
    --file=foo
    --file foo
<
then optparse (|py2stdlib-optparse|), on seeing this option, will do the equivalent of :: >
options.filename = "foo"
<
The Option.type and Option.dest option attributes are almost
as important as Option.action, but Option.action is the only
one that makes sense for {all} options.
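The four equivalent spellings listed above can be verified with a short loop.
This is only a sketch; each spelling is written as the argument list that a
shell would produce for it:

```python
from optparse import OptionParser, Values

parser = OptionParser()
parser.add_option("-f", "--file", action="store", type="string", dest="filename")

spellings = [["-ffoo"], ["-f", "foo"], ["--file=foo"], ["--file", "foo"]]
filenames = []
for argv in spellings:
    (options, args) = parser.parse_args(argv)
    filenames.append(options.filename)
print(filenames)

# The options object is an instance of optparse.Values.
is_values = isinstance(options, Values)
```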
Option attributes
^^^^^^^^^^^^^^^^^
The following option attributes may be passed as keyword arguments to
OptionParser.add_option. If you pass an option attribute that is not
relevant to a particular option, or fail to pass a required option attribute,
optparse (|py2stdlib-optparse|) raises OptionError.
Option.action~
(default: ``"store"``)
Determines optparse (|py2stdlib-optparse|)'s behaviour when this option is seen on the
command line; the available actions are documented in section
optparse-standard-option-actions.
Option.type~
(default: ``"string"``)
The argument type expected by this option (e.g., ``"string"`` or ``"int"``);
the available option types are documented in section
optparse-standard-option-types.
Option.dest~
(default: derived from option strings)
If the option's action implies writing or modifying a value somewhere, this
tells optparse (|py2stdlib-optparse|) where to write it: Option.dest names an
attribute of the ``options`` object that optparse (|py2stdlib-optparse|) builds as it parses
the command line.
Option.default~
The value to use for this option's destination if the option is not seen on
the command line. See also OptionParser.set_defaults.
Option.nargs~
(default: 1)
How many arguments of type Option.type should be consumed when this
option is seen. If > 1, optparse (|py2stdlib-optparse|) will store a tuple of values to
Option.dest.
Option.const~
For actions that store a constant value, the constant value to store.
Option.choices~
For options of type ``"choice"``, the list of strings the user may choose
from.
Option.callback~
For options with action ``"callback"``, the callable to call when this option
is seen. See section optparse-option-callbacks for detail on the
arguments passed to the callable.
Option.callback_args~
Option.callback_kwargs~
Additional positional and keyword arguments to pass to ``callback`` after the
four standard callback arguments.
Option.help~
Help text to print for this option when listing all available options after
the user supplies a help option (such as ``"--help"``). If
no help text is supplied, the option will be listed without help text. To
hide this option, use the special value optparse.SUPPRESS_HELP.
Option.metavar~
(default: derived from option strings)
Stand-in for the option argument(s) to use when printing help text. See
section optparse-tutorial for an example.
Standard option actions
^^^^^^^^^^^^^^^^^^^^^^^
The various option actions all have slightly different requirements and effects.
Most actions have several relevant option attributes which you may specify to
guide optparse (|py2stdlib-optparse|)'s behaviour; a few have required attributes, which you
must specify for any option using that action.
* ``"store"`` [relevant: Option.type, Option.dest,
Option.nargs, Option.choices]
The option must be followed by an argument, which is converted to a value
according to Option.type and stored in Option.dest. If
Option.nargs > 1, multiple arguments will be consumed from the
command line; all will be converted according to Option.type and
stored to Option.dest as a tuple. See the
optparse-standard-option-types section.
If Option.choices is supplied (a list or tuple of strings), the type
defaults to ``"choice"``.
If Option.type is not supplied, it defaults to ``"string"``.
If Option.dest is not supplied, optparse (|py2stdlib-optparse|) derives a destination
from the first long option string (e.g., ``"--foo-bar"`` implies
``foo_bar``). If there are no long option strings, optparse (|py2stdlib-optparse|) derives a
destination from the first short option string (e.g., ``"-f"`` implies ``f``).
Example:: >
parser.add_option("-f")
parser.add_option("-p", type="float", nargs=3, dest="point")
<
As it parses the command line :: >
    -f foo.txt -p 1 -3.5 4 -fbar.txt
<
optparse (|py2stdlib-optparse|) will set :: >
options.f = "foo.txt"
options.point = (1.0, -3.5, 4.0)
options.f = "bar.txt"
<
* ``"store_const"`` [required: Option.const; relevant:
Option.dest]
The value Option.const is stored in Option.dest.
Example:: >
parser.add_option("-q", "--quiet",
action="store_const", const=0, dest="verbose")
parser.add_option("-v", "--verbose",
action="store_const", const=1, dest="verbose")
parser.add_option("--noisy",
action="store_const", const=2, dest="verbose")
<
If ``"--noisy"`` is seen, optparse (|py2stdlib-optparse|) will set :: >
    options.verbose = 2
<
* ``"store_true"`` [relevant: Option.dest]
A special case of ``"store_const"`` that stores a true value to
Option.dest.
* ``"store_false"`` [relevant: Option.dest]
Like ``"store_true"``, but stores a false value.
Example:: >
parser.add_option("--clobber", action="store_true", dest="clobber")
parser.add_option("--no-clobber", action="store_false", dest="clobber")
<
* ``"append"`` [relevant: Option.type, Option.dest,
Option.nargs, Option.choices]
The option must be followed by an argument, which is appended to the list in
Option.dest. If no default value for Option.dest is
supplied, an empty list is automatically created when optparse (|py2stdlib-optparse|) first
encounters this option on the command-line. If Option.nargs > 1,
multiple arguments are consumed, and a tuple of length Option.nargs
is appended to Option.dest.
The defaults for Option.type and Option.dest are the same as
for the ``"store"`` action.
Example:: >
parser.add_option("-t", "--tracks", action="append", type="int")
<
If ``"-t3"`` is seen on the command-line, optparse (|py2stdlib-optparse|) does the equivalent
of:: >
options.tracks = []
options.tracks.append(int("3"))
<
If, a little later on, ``"--tracks=4"`` is seen, it does:: >
    options.tracks.append(int("4"))
<
* ``"append_const"`` [required: Option.const; relevant:
Option.dest]
Like ``"store_const"``, but the value Option.const is appended to
Option.dest; as with ``"append"``, Option.dest defaults to
``None``, and an empty list is automatically created the first time the option
is encountered.
* ``"count"`` [relevant: Option.dest]
Increment the integer stored at Option.dest. If no default value is
supplied, Option.dest is set to zero before being incremented the
first time.
Example:: >
parser.add_option("-v", action="count", dest="verbosity")
<
The first time ``"-v"`` is seen on the command line, optparse (|py2stdlib-optparse|) does the
equivalent of:: >
options.verbosity = 0
options.verbosity += 1
<
Every subsequent occurrence of ``"-v"`` results in :: >
    options.verbosity += 1
<
* ``"callback"`` [required: Option.callback; relevant:
Option.type, Option.nargs, Option.callback_args,
Option.callback_kwargs]
Call the function specified by Option.callback, which is called as :: >
    func(option, opt_str, value, parser, *args, **kwargs)
<
See section optparse-option-callbacks for more detail.
* ``"help"``
Prints a complete help message for all the options in the current option
parser. The help message is constructed from the ``usage`` string passed to
OptionParser's constructor and the Option.help string passed to every
option.
If no Option.help string is supplied for an option, it will still be
listed in the help message. To omit an option entirely, use the special value
optparse.SUPPRESS_HELP.
optparse (|py2stdlib-optparse|) automatically adds a help option to all
OptionParsers, so you do not normally need to create one.
Example:: >
from optparse import OptionParser, SUPPRESS_HELP
# usually, a help option is added automatically, but that can
# be suppressed using the add_help_option argument
parser = OptionParser(add_help_option=False)
parser.add_option("-h", "--help", action="help")
parser.add_option("-v", action="store_true", dest="verbose",
help="Be moderately verbose")
parser.add_option("--file", dest="filename",
help="Input file to read data from")
parser.add_option("--secret", help=SUPPRESS_HELP)
<
If optparse (|py2stdlib-optparse|) sees either ``"-h"`` or ``"--help"`` on the command line,
it will print something like the following help message to stdout (assuming
``sys.argv[0]`` is ``"foo.py"``): >
    usage: foo.py [options]

    options:
      -h, --help        Show this help message and exit
      -v                Be moderately verbose
      --file=FILENAME   Input file to read data from
<
After printing the help message, optparse (|py2stdlib-optparse|) terminates your process with
``sys.exit(0)``.
* ``"version"``
Prints the version number supplied to the OptionParser to stdout and exits.
The version number is actually formatted and printed by the
``print_version()`` method of OptionParser. Generally only relevant if the
``version`` argument is supplied to the OptionParser constructor. As with
help options, you will rarely create ``version`` options,
since optparse (|py2stdlib-optparse|) automatically adds them when needed.
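Of the actions above, ``"callback"`` is the one whose calling convention is
easiest to get wrong, so here is a minimal sketch of it. ``record_call`` is a
hypothetical callback written for this example; it simply stores what it was
called with on the values object that the parser is building:

```python
from optparse import OptionParser

def record_call(option, opt_str, value, parser, *args, **kwargs):
    """Hypothetical callback: store everything it was called with."""
    # A callback does not store anything by itself; it must do so explicitly.
    setattr(parser.values, option.dest, (opt_str, value, args, kwargs))

parser = OptionParser()
parser.add_option("--trace", action="callback", callback=record_call,
                  type="string", dest="trace",
                  callback_args=("extra",), callback_kwargs={"level": 2})

(options, args) = parser.parse_args(["--trace=io", "rest"])
print(options.trace)
```

Because ``type`` is given, optparse consumes one argument for the option and
passes it to the callback as ``value``; the extra positional and keyword
arguments follow the four standard ones.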
Standard option types
^^^^^^^^^^^^^^^^^^^^^
optparse (|py2stdlib-optparse|) has six built-in option types: ``"string"``, ``"int"``,
``"long"``, ``"choice"``, ``"float"`` and ``"complex"``. If you need to add new
option types, see section optparse-extending-optparse.
Arguments to string options are not checked or converted in any way: the text on
the command line is stored in the destination (or passed to the callback) as-is.
Integer arguments (type ``"int"`` or ``"long"``) are parsed as follows:
* if the number starts with ``0x``, it is parsed as a hexadecimal number
* if the number starts with ``0b``, it is parsed as a binary number
* if the number starts with ``0`` (but not ``0x`` or ``0b``), it is parsed as an octal number
* otherwise, the number is parsed as a decimal number
The conversion is done by calling either int or long with the
appropriate base (2, 8, 10, or 16). If this fails, so will optparse (|py2stdlib-optparse|),
although with a more useful error message.
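The prefix rules can be tried out with a small sketch; ``parse_num`` is a
hypothetical helper that feeds a single ``-n`` argument through the parser:

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("-n", type="int", dest="num")

def parse_num(text):
    """Return the integer optparse produces for the option argument text."""
    (options, args) = parser.parse_args(["-n", text])
    return options.num

print(parse_num("42"))     # 42 (decimal)
print(parse_num("0x1f"))   # 31 (hexadecimal)
print(parse_num("0b101"))  # 5  (binary)
print(parse_num("017"))    # 15 (octal)
```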
``"float"`` and ``"complex"`` option arguments are converted directly with
float and complex, with similar error-handling.
``"choice"`` options are a subtype of ``"string"`` options. The
Option.choices option attribute (a sequence of strings) defines the
set of allowed option arguments. optparse.check_choice compares
user-supplied option arguments against this master list and raises
OptionValueError if an invalid string is given.
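A sketch of a ``"choice"`` option follows; the option name and the list of
modes are made up. With the default error handling, an invalid choice ends in
``SystemExit`` rather than in an exception that reaches the caller:

```python
from optparse import OptionParser

parser = OptionParser(prog="demo")
# Supplying choices makes the type default to "choice".
parser.add_option("-m", "--mode", choices=["novice", "intermediate", "expert"])
option_type = parser.get_option("--mode").type

(options, args) = parser.parse_args(["--mode", "expert"])
print(options.mode)

# An argument outside the list is reported on stderr and the parser exits.
exit_status = None
try:
    parser.parse_args(["--mode", "wizard"])
except SystemExit as exc:
    exit_status = exc.code
print(exit_status)
```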
Parsing arguments
^^^^^^^^^^^^^^^^^
The whole point of creating and populating an OptionParser is to call its
parse_args method:: >
(options, args) = parser.parse_args(args=None, values=None)
<
where the input parameters are
``args``
the list of arguments to process (default: ``sys.argv[1:]``)
``values``
object to store option arguments in (default: a new instance of
optparse.Values)
and the return values are
``options``
the same object that was passed in as ``values``, or the optparse.Values
instance created by optparse (|py2stdlib-optparse|)
``args``
the leftover positional arguments after all options have been processed
The most common usage is to supply neither keyword argument. If you supply
``values``, it will be modified with repeated setattr calls (roughly one
for every option argument stored to an option destination) and returned by
parse_args.
If parse_args encounters any errors in the argument list, it calls the
OptionParser's error method with an appropriate end-user error message.
This ultimately terminates your process with an exit status of 2 (the
traditional Unix exit status for command-line errors).
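Both behaviours can be seen in a short sketch. The attribute ``extra`` and the
program name ``demo`` are assumptions of the example, and ``SystemExit`` is
caught here only so that the exit status can be inspected:

```python
from optparse import OptionParser, Values

parser = OptionParser(prog="demo")
parser.add_option("-n", type="int", dest="num")

# A caller-supplied Values object is updated in place and returned.
preset = Values({"num": 1, "extra": "kept"})
(options, args) = parser.parse_args(["-n", "5", "left"], values=preset)
print(options is preset, options.num, options.extra, args)

# A missing option argument is a user error: optparse prints the usage and
# an error message to stderr, then exits with status 2.
exit_status = None
try:
    parser.parse_args(["-n"])
except SystemExit as exc:
    exit_status = exc.code
print(exit_status)
```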
Querying and manipulating your option parser
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The default behavior of the option parser can be customized slightly, and you
can also poke around your option parser and see what's there. OptionParser
provides several methods to help you out:
OptionParser.disable_interspersed_args()~
Set parsing to stop on the first non-option. For example, if ``"-a"`` and
``"-b"`` are both simple options that take no arguments, optparse (|py2stdlib-optparse|)
normally accepts this syntax:: >
prog -a arg1 -b arg2
<
and treats it as equivalent to :: >
    prog -a -b arg1 arg2
<
To disable this feature, call disable_interspersed_args. This
restores traditional Unix syntax, where option parsing stops with the first
non-option argument.
Use this if you have a command processor which runs another command which has
options of its own and you want to make sure these options don't get
confused. For example, each command might have a different set of options.
OptionParser.enable_interspersed_args()~
Set parsing to not stop on the first non-option, allowing interspersing
switches with command arguments. This is the default behavior.
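The difference between the two modes can be sketched as follows, reusing the
``-a``/``-b`` command line from the text. The ``dest`` names and defaults are
assumptions chosen for the illustration.

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("-a", action="store_true", dest="a", default=False)
parser.add_option("-b", action="store_true", dest="b", default=False)

argv = ["-a", "arg1", "-b", "arg2"]

# Default behavior: options and positional arguments may be interspersed.
opts, args = parser.parse_args(list(argv))

# Traditional Unix behavior: parsing stops at the first non-option, so
# "-b" is left among the positional arguments and never processed.
parser.disable_interspersed_args()
stopped_opts, stopped_args = parser.parse_args(list(argv))
```

With interspersed arguments disabled, everything from ``arg1`` onward is
returned untouched, which is what a wrapper needs when it hands the rest of
the command line to another program.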
OptionParser.get_option(opt_str)~
Returns the Option instance with the option string {opt_str}, or ``None`` if
no options have that option string.
OptionParser.has_option(opt_str)~
Return true if the OptionParser has an option with option string {opt_str}
(e.g., ``"-q"`` or ``"--verbose"``).
OptionParser.remove_option(opt_str)~
If the OptionParser has an option corresponding to {opt_str}, that
option is removed. If that option provided any other option strings, all of
those option strings become invalid. If {opt_str} does not occur in any
option belonging to this OptionParser, raises ValueError.
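A short sketch of the three query methods above; the ``-q``/``--quiet``
option is an illustrative assumption.

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("-q", "--quiet", action="store_true", dest="quiet")

had_short = parser.has_option("-q")
# Both option strings resolve to the same Option instance.
same_option = parser.get_option("-q") is parser.get_option("--quiet")
missing = parser.get_option("--missing")          # None

# Removing by one option string invalidates the option's other strings too.
parser.remove_option("-q")
has_long_after = parser.has_option("--quiet")

# Removing an option string that is not defined raises ValueError.
try:
    parser.remove_option("-x")
    removal_failed = False
except ValueError:
    removal_failed = True
```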
Conflicts between options
^^^^^^^^^^^^^^^^^^^^^^^^^
If you're not careful, it's easy to define options with conflicting option
strings:: >
parser.add_option("-n", "--dry-run", ...)
[...]
parser.add_option("-n", "--noisy", ...)
<
(This is particularly true if you've defined your own OptionParser subclass with
some standard options.)
Every time you add an option, optparse (|py2stdlib-optparse|) checks for conflicts with existing
options. If it finds any, it invokes the current conflict-handling mechanism.
You can set the conflict-handling mechanism either in the constructor:: >
parser = OptionParser(..., conflict_handler=handler)
<
or with a separate call:: >
parser.set_conflict_handler(handler)
<
The available conflict handlers are:
``"error"`` (default)
assume option conflicts are a programming error and raise
OptionConflictError
``"resolve"``
resolve option conflicts intelligently (see below)
As an example, let's define an OptionParser that resolves conflicts
intelligently and add conflicting options to it:: >
parser = OptionParser(conflict_handler="resolve")
parser.add_option("-n", "--dry-run", ..., help="do no harm")
parser.add_option("-n", "--noisy", ..., help="be noisy")
<
At this point, optparse (|py2stdlib-optparse|) detects that a previously-added option is already
using the ``"-n"`` option string. Since ``conflict_handler`` is ``"resolve"``,
it resolves the situation by removing ``"-n"`` from the earlier option's list of
option strings. Now ``"--dry-run"`` is the only way for the user to activate
that option. If the user asks for help, the help message will reflect that:: >
options:
--dry-run do no harm
[...]
-n, --noisy be noisy
<
It's possible to whittle away the option strings for a previously-added option
until there are none left, and the user has no way of invoking that option from
the command-line. In that case, optparse (|py2stdlib-optparse|) removes that option completely,
so it doesn't show up in help text or anywhere else. Carrying on with our
existing OptionParser:: >
parser.add_option("--dry-run", ..., help="new dry-run option")
<
At this point, the original -n/--dry-run option is no longer
accessible, so optparse (|py2stdlib-optparse|) removes it, leaving this help text:: >
options:
[...]
-n, --noisy be noisy
--dry-run new dry-run option
<
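The two conflict handlers can be compared in a small runnable sketch. The
``action`` and ``dest`` values fill in the arguments elided as ``...`` in the
examples above and are assumptions made for this illustration.

```python
from optparse import OptionParser, OptionConflictError

# "error" (the default): a duplicate option string is a programming error.
strict = OptionParser()
strict.add_option("-n", "--dry-run", action="store_true", dest="dry_run")
try:
    strict.add_option("-n", "--noisy", action="store_true", dest="noisy")
    conflict_raised = False
except OptionConflictError:
    conflict_raised = True

# "resolve": the earlier option silently loses the contested string.
resolver = OptionParser(conflict_handler="resolve")
resolver.add_option("-n", "--dry-run", action="store_true", dest="dry_run",
                    help="do no harm")
resolver.add_option("-n", "--noisy", action="store_true", dest="noisy",
                    help="be noisy")
owner_of_n = resolver.get_option("-n").dest
owner_of_dry_run = resolver.get_option("--dry-run").dest

# Taking away the last remaining option string removes the old option.
resolver.add_option("--dry-run", action="store_true", dest="dry_run",
                    help="new dry-run option")
dry_run_help = resolver.get_option("--dry-run").help
```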
Cleanup
^^^^^^^
OptionParser instances have several cyclic references. This should not be a
problem for Python's garbage collector, but you may wish to break the cyclic
references explicitly by calling OptionParser.destroy on your
OptionParser once you are done with it. This is particularly useful in
long-running applications where large object graphs are reachable from your
OptionParser.
Other methods
^^^^^^^^^^^^^
OptionParser supports several other public methods:
OptionParser.set_usage(usage)~
Set the usage string according to the rules described above for the ``usage``
constructor keyword argument. Passing ``None`` sets the default usage
string; use optparse.SUPPRESS_USAGE to suppress a usage message.
OptionParser.print_usage(file=None)~
Print the usage message for the current program (``self.usage``) to {file}
(default stdout). Any occurrence of the string ``"%prog"`` in ``self.usage``
is replaced with the name of the current program. Does nothing if
``self.usage`` is empty or not defined.
OptionParser.get_usage()~
Same as print_usage but returns the usage string instead of
printing it.
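A brief sketch of how the usage methods interact. The program name
``mytool`` and the usage text are assumptions for the example; ``prog`` is
passed explicitly so the result does not depend on ``sys.argv[0]``.

```python
from optparse import OptionParser, SUPPRESS_USAGE

parser = OptionParser(usage="%prog [options] FILE", prog="mytool")
custom_usage = parser.get_usage()       # "%prog" expanded to "mytool"

parser.set_usage(None)                  # restore the default usage string
default_usage = parser.get_usage()

parser.set_usage(SUPPRESS_USAGE)        # no usage message at all
suppressed_usage = parser.get_usage()   # empty string
```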
OptionParser.set_defaults(dest=value, ...)~
Set default values for several option destinations at once. Using
set_defaults is the preferred way to set default values for options,
since multiple options can share the same destination. For example, if
several "mode" options all set the same destination, any one of them can set
the default, and the last one wins:: >
parser.add_option("--advanced", action="store_const",
                  dest="mode", const="advanced",
                  default="novice")    # overridden below
parser.add_option("--novice", action="store_const",
                  dest="mode", const="novice",
                  default="advanced")  # overrides above setting
<
To avoid this confusion, use set_defaults:: >
parser.set_defaults(mode="advanced")
parser.add_option("--advanced", action="store_const",
                  dest="mode", const="advanced")
parser.add_option("--novice", action="store_const",
                  dest="mode", const="novice")
<
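To check the effect, the set_defaults variant can be run against an empty
command line and against one that selects a mode. This is only a sketch; the
argument lists are made up for the demonstration.

```python
from optparse import OptionParser

parser = OptionParser()
parser.set_defaults(mode="advanced")
parser.add_option("--advanced", action="store_const",
                  dest="mode", const="advanced")
parser.add_option("--novice", action="store_const",
                  dest="mode", const="novice")

# No option given: the shared destination keeps the default.
default_opts, _ = parser.parse_args([])
# An explicit option overrides the default.
novice_opts, _ = parser.parse_args(["--novice"])
```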
Option Callbacks
----------------
When optparse (|py2stdlib-optparse|)'s built-in actions and types aren't quite enough for your
needs, you have two choices: extend optparse (|py2stdlib-optparse|) or define a callback option.
Extending optparse (|py2stdlib-optparse|) is more general, but overkill for a lot of simple
cases. Quite often a simple callback is all you need.
There are two steps to defining a callback option:
* define the option itself using the ``"callback"`` action
* write the callback; this is a function (or method) that takes at least four
arguments, as described below
Defining a callback option
^^^^^^^^^^^^^^^^^^^^^^^^^^
As always, the easiest way to define a callback option is by using the
OptionParser.add_option method. Apart from Option.action, the
only option attribute you must specify is ``callback``, the function to call:: >
parser.add_option("-c", action="callback", callback=my_callback)
<
``callback`` is a function (or other callable object), so you must have already
defined ``my_callback()`` when you create this callback option. In this simple
case, optparse (|py2stdlib-optparse|) doesn't even know if -c takes any arguments,
which usually means that the option takes no arguments---the mere presence of
-c on the command-line is all it needs to know. In some
circumstances, though, you might want your callback to consume an arbitrary
number of command-line arguments. This is where writing callbacks gets tricky;
it's covered later in this section.
optparse (|py2stdlib-optparse|) always passes four particular arguments to your callback, and it
will only pass additional arguments if you specify them via
Option.callback_args and Option.callback_kwargs. Thus, the
minimal callback function signature is:: >
def my_callback(option, opt, value, parser):
<
The four arguments to a callback are described below.
There are several other option attributes that you can supply when you define a
callback option:
Option.type
has its usual meaning: as with the ``"store"`` or ``"append"`` actions, it
instructs optparse (|py2stdlib-optparse|) to consume one argument and convert it to
Option.type. Rather than storing the converted value(s) anywhere,
though, optparse (|py2stdlib-optparse|) passes it to your callback function.
Option.nargs
also has its usual meaning: if it is supplied and > 1, optparse (|py2stdlib-optparse|) will
consume Option.nargs arguments, each of which must be convertible to
Option.type. It then passes a tuple of converted values to your
callback.
Option.callback_args
a tuple of extra positional arguments to pass to the callback
Option.callback_kwargs
a dictionary of extra keyword arguments to pass to the callback
How callbacks are called
^^^^^^^^^^^^^^^^^^^^^^^^
All callbacks are called as follows:: >
func(option, opt_str, value, parser, *args, **kwargs)
<
where
``option``
is the Option instance that's calling the callback
``opt_str``
is the option string seen on the command-line that's triggering the callback.
(If an abbreviated long option was used, ``opt_str`` will be the full,
canonical option string---e.g. if the user puts ``"--foo"`` on the
command-line as an abbreviation for ``"--foobar"``, then ``opt_str`` will be
``"--foobar"``.)
``value``
is the argument to this option seen on the command-line. optparse (|py2stdlib-optparse|) will
only expect an argument if Option.type is set; the type of ``value`` will be
the type implied by the option's type. If Option.type for this option is
``None`` (no argument expected), then ``value`` will be ``None``. If Option.nargs
> 1, ``value`` will be a tuple of values of the appropriate type.
``parser``
is the OptionParser instance driving the whole thing, mainly useful because
you can access some other interesting data through its instance attributes:
``parser.largs``
the current list of leftover arguments, i.e. arguments that have been
consumed but are neither options nor option arguments. Feel free to modify
``parser.largs``, e.g. by adding more arguments to it. (This list will
become ``args``, the second return value of parse_args.)
``parser.rargs``
the current list of remaining arguments, i.e. with ``opt_str`` and
``value`` (if applicable) removed, and only the arguments following them
still there. Feel free to modify ``parser.rargs``, e.g. by consuming more
arguments.
``parser.values``
the object where option values are by default stored (an instance of
optparse.Values). This lets callbacks use the same mechanism as the
rest of optparse (|py2stdlib-optparse|) for storing option values; you don't need to mess
around with globals or closures. You can also access or modify the
value(s) of any options already encountered on the command-line.
``args``
is a tuple of arbitrary positional arguments supplied via the
Option.callback_args option attribute.
``kwargs``
is a dictionary of arbitrary keyword arguments supplied via
Option.callback_kwargs.
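The calling convention above can be observed with a callback that simply
records what it receives. The option name ``--point``, the extra positional
argument, and the ``scale`` keyword are hypothetical values chosen for this
sketch.

```python
from optparse import OptionParser

calls = []

def record_call(option, opt_str, value, parser, *args, **kwargs):
    # Store everything optparse hands to the callback.
    calls.append((opt_str, value, args, kwargs))

parser = OptionParser()
parser.add_option("--point", action="callback", callback=record_call,
                  type="int", nargs=2,
                  callback_args=("extra",),
                  callback_kwargs={"scale": 10})

# nargs=2 with type="int": value arrives as a tuple of two integers.
parser.parse_args(["--point", "3", "4"])
# An abbreviated long option is reported by its full, canonical name.
parser.parse_args(["--poi", "5", "6"])
```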
Raising errors in a callback
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The callback function should raise OptionValueError if there are any
problems with the option or its argument(s). optparse (|py2stdlib-optparse|) catches this and
terminates the program, printing the error message you supply to stderr. Your
message should be clear, concise, accurate, and mention the option at fault.
Otherwise, the user will have a hard time figuring out what they did wrong.
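A sketch of a callback that validates its argument and reports problems
through OptionValueError. The ``--jobs`` option and the "must be positive"
rule are assumptions invented for the example.

```python
from optparse import OptionParser, OptionValueError

def require_positive(option, opt_str, value, parser):
    # Name the offending option and value in the message.
    if value <= 0:
        raise OptionValueError(
            "option %s: expected a positive integer, got %r"
            % (opt_str, value))
    setattr(parser.values, option.dest, value)

parser = OptionParser()
parser.add_option("--jobs", action="callback", callback=require_positive,
                  type="int", dest="jobs")

good_opts, _ = parser.parse_args(["--jobs", "4"])

# optparse catches the exception, prints the message to stderr and exits.
try:
    parser.parse_args(["--jobs", "0"])
    exit_status = None
except SystemExit as exc:
    exit_status = exc.code
```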
Callback example 1: trivial callback
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here's an example of a callback option that takes no arguments, and simply
records that the option was seen:: >
def record_foo_seen(option, opt_str, value, parser):
    parser.values.saw_foo = True
parser.add_option("--foo", action="callback", callback=record_foo_seen)
<
Of course, you could do that with the ``"store_true"`` action.
Callback example 2: check option order
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here's a slightly more interesting example: record the fact that ``"-a"`` is
seen, but blow up if it comes after ``"-b"`` in the command-line. :: >
def check_order(option, opt_str, value, parser):
    if parser.values.b:
        raise OptionValueError("can't use -a after -b")
    parser.values.a = 1
[...]
parser.add_option("-a", action="callback", callback=check_order)
parser.add_option("-b", action="store_true", dest="b")
<
Callback example 3: check option order (generalized)
If you want to re-use this callback for several similar options (set a flag, but
blow up if ``"-b"`` has already been seen), it needs a bit of work: the error
message and the flag that it sets must be generalized. :: >
def check_order(option, opt_str, value, parser):
    if parser.values.b:
        raise OptionValueError("can't use %s after -b" % opt_str)
    setattr(parser.values, option.dest, 1)
[...]
parser.add_option("-a", action="callback", callback=check_order, dest='a')
parser.add_option("-b", action="store_true", dest="b")
parser.add_option("-c", action="callback", callback=check_order, dest='c')
<
Callback example 4: check arbitrary condition
Of course, you could put any condition in there---you're not limited to checking
the values of already-defined options. For example, if you have options that
should not be called when the moon is full, all you have to do is this:: >
def check_moon(option, opt_str, value, parser):
    if is_moon_full():
        raise OptionValueError("%s option invalid when moon is full"
                               % opt_str)
    setattr(parser.values, option.dest, 1)
[...]
parser.add_option("--foo",
                  action="callback", callback=check_moon, dest="foo")
<
(The definition of ``is_moon_full()`` is left as an exercise for the reader.)
Callback example 5: fixed arguments
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Things get slightly more interesting when you define callback options that take
a fixed number of arguments. Specifying that a callback option takes arguments
is similar to defining a ``"store"`` or ``"append"`` option: if you define
Option.type, then the option takes one argument that must be
convertible to that type; if you further define Option.nargs, then the
option takes Option.nargs arguments.
Here's an example that just emulates the standard ``"store"`` action:: >
def store_value(option, opt_str, value, parser):
    setattr(parser.values, option.dest, value)
[...]
parser.add_option("--foo",
                  action="callback", callback=store_value,
                  type="int", nargs=3, dest="foo")
<
Note that optparse (|py2stdlib-optparse|) takes care of consuming 3 arguments and converting
them to integers for you; all you have to do is store them. (Or whatever;
obviously you don't need a callback for this example.)
Callback example 6: variable arguments
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Things get hairy when you want an option to take a variable number of arguments.
For this case, you must write a callback, as optparse (|py2stdlib-optparse|) doesn't provide any
built-in capabilities for it. And you have to deal with certain intricacies of
conventional Unix command-line parsing that optparse (|py2stdlib-optparse|) normally handles for
you. In particular, callbacks should implement the conventional rules for bare
``"--"`` and ``"-"`` arguments:
* either ``"--"`` or ``"-"`` can be option arguments
* bare ``"--"`` (if not the argument to some option): halt command-line
processing and discard the ``"--"``
* bare ``"-"`` (if not the argument to some option): halt command-line
processing but keep the ``"-"`` (append it to ``parser.largs``)
If you want an option that takes a variable number of arguments, there are
several subtle, tricky issues to worry about. The exact implementation you
choose will be based on which trade-offs you're willing to make for your
application (which is why optparse (|py2stdlib-optparse|) doesn't support this sort of thing
directly).
Nevertheless, here's a stab at a callback for an option with variable
arguments:: >
def vararg_callback(option, opt_str, value, parser):
    assert value is None
    value = []
    def floatable(str):
        try:
            float(str)
            return True
        except ValueError:
            return False
    for arg in parser.rargs:
        # stop on --foo like options
        if arg[:2] == "--" and len(arg) > 2:
            break
        # stop on -a, but not on -3 or -3.0
        if arg[:1] == "-" and len(arg) > 1 and not floatable(arg):
            break
        value.append(arg)
    del parser.rargs[:len(value)]
    setattr(parser.values, option.dest, value)
[...]
parser.add_option("-c", "--callback", dest="vararg_attr",
                  action="callback", callback=vararg_callback)
<
Extending optparse (|py2stdlib-optparse|)
-----------------------------------------
Since the two major controlling factors in how optparse (|py2stdlib-optparse|) interprets
command-line options are the action and type of each option, the most likely
direction of extension is to add new actions and new types.
Adding new types
^^^^^^^^^^^^^^^^
To add new types, you need to define your own subclass of optparse (|py2stdlib-optparse|)'s
Option class. This class has a couple of attributes that define
optparse (|py2stdlib-optparse|)'s types: Option.TYPES and Option.TYPE_CHECKER.
Option.TYPES~
A tuple of type names; in your subclass, simply define a new tuple
TYPES that builds on the standard one.
Option.TYPE_CHECKER~
A dictionary mapping type names to type-checking functions. A type-checking
function has the following signature:: >
def check_mytype(option, opt, value):
<
where ``option`` is an Option instance, ``opt`` is an option string
(e.g., ``"-f"``), and ``value`` is the string from the command line that must
be checked and converted to your desired type. ``check_mytype()`` should
return an object of the hypothetical type ``mytype``. The value returned by
a type-checking function will wind up in the optparse.Values instance returned
by OptionParser.parse_args, or be passed to a callback as the
``value`` parameter.
Your type-checking function should raise OptionValueError if it
encounters any problems. OptionValueError takes a single string
argument, which is passed as-is to OptionParser's error
method, which in turn prepends the program name and the string ``"error:"``
and prints everything to stderr before terminating the process.
Here's a silly example that demonstrates adding a ``"complex"`` option type to
parse Python-style complex numbers on the command line. (This is even sillier
than it used to be, because optparse (|py2stdlib-optparse|) 1.3 added built-in support for
complex numbers, but never mind.)
First, the necessary imports:: >
from copy import copy
from optparse import Option, OptionValueError
<
You need to define your type-checker first, since it's referred to later (in the
Option.TYPE_CHECKER class attribute of your Option subclass):: >
def check_complex(option, opt, value):
    try:
        return complex(value)
    except ValueError:
        raise OptionValueError(
            "option %s: invalid complex value: %r" % (opt, value))
<
Finally, the Option subclass:: >
class MyOption(Option):
    TYPES = Option.TYPES + ("complex",)
    TYPE_CHECKER = copy(Option.TYPE_CHECKER)
    TYPE_CHECKER["complex"] = check_complex
<
(If we didn't make a copy (|py2stdlib-copy|) of Option.TYPE_CHECKER, we would end
up modifying the Option.TYPE_CHECKER attribute of optparse (|py2stdlib-optparse|)'s
Option class. This being Python, nothing stops you from doing that except good
manners and common sense.)
That's it! Now you can write a script that uses the new option type just like
any other optparse (|py2stdlib-optparse|)-based script, except you have to instruct your
OptionParser to use MyOption instead of Option:: >
parser = OptionParser(option_class=MyOption)
parser.add_option("-c", type="complex")
<
Alternately, you can build your own option list and pass it to OptionParser; if
you don't use add_option in the above way, you don't need to tell
OptionParser which option class to use:: >
option_list = [MyOption("-c", action="store", type="complex", dest="c")]
parser = OptionParser(option_list=option_list)
<
Adding new actions
^^^^^^^^^^^^^^^^^^
Adding new actions is a bit trickier, because you have to understand that
optparse (|py2stdlib-optparse|) has a couple of classifications for actions:
"store" actions
actions that result in optparse (|py2stdlib-optparse|) storing a value to an attribute of the
current optparse.Values instance; these options require an Option.dest
attribute to be supplied to the Option constructor.
"typed" actions
actions that take a value from the command line and expect it to be of a
certain type; or rather, a string that can be converted to a certain type.
These options require an Option.type attribute to be supplied to the Option
constructor.
These are overlapping sets: some default "store" actions are ``"store"``,
``"store_const"``, ``"append"``, and ``"count"``, while the default "typed"
actions are ``"store"``, ``"append"``, and ``"callback"``.
When you add an action, you need to categorize it by listing it in at least one
of the following class attributes of Option (all are lists of strings):
Option.ACTIONS~
All actions must be listed in ACTIONS.
Option.STORE_ACTIONS~
"store" actions are additionally listed here.
Option.TYPED_ACTIONS~
"typed" actions are additionally listed here.
Option.ALWAYS_TYPED_ACTIONS~
Actions that always take a type (i.e. whose options always take a value) are
additionally listed here. The only effect of this is that optparse (|py2stdlib-optparse|)
assigns the default type, ``"string"``, to options with no explicit type
whose action is listed in ALWAYS_TYPED_ACTIONS.
In order to actually implement your new action, you must override Option's
take_action method and add a case that recognizes your action.
For example, let's add an ``"extend"`` action. This is similar to the standard
``"append"`` action, but instead of taking a single value from the command-line
and appending it to an existing list, ``"extend"`` will take multiple values in
a single comma-delimited string, and extend an existing list with them. That
is, if ``"--names"`` is an ``"extend"`` option of type ``"string"``, the command
line :: >
--names=foo,bar --names blah --names ding,dong
<
would result in a list:: >
["foo", "bar", "blah", "ding", "dong"]
<
Again we define a subclass of Option:: >
class MyOption(Option):
    ACTIONS = Option.ACTIONS + ("extend",)
    STORE_ACTIONS = Option.STORE_ACTIONS + ("extend",)
    TYPED_ACTIONS = Option.TYPED_ACTIONS + ("extend",)
    ALWAYS_TYPED_ACTIONS = Option.ALWAYS_TYPED_ACTIONS + ("extend",)
    def take_action(self, action, dest, opt, value, values, parser):
        if action == "extend":
            lvalue = value.split(",")
            values.ensure_value(dest, []).extend(lvalue)
        else:
            Option.take_action(
                self, action, dest, opt, value, values, parser)
<
Features of note:
* ``"extend"`` both expects a value on the command-line and stores that value
somewhere, so it goes in both Option.STORE_ACTIONS and
Option.TYPED_ACTIONS.
* to ensure that optparse (|py2stdlib-optparse|) assigns the default type of ``"string"`` to
``"extend"`` actions, we put the ``"extend"`` action in
Option.ALWAYS_TYPED_ACTIONS as well.
* MyOption.take_action implements just this one new action, and passes
control back to Option.take_action for the standard optparse (|py2stdlib-optparse|)
actions.
* ``values`` is an instance of the optparse.Values class, which provides
the very useful ensure_value method. ensure_value is
essentially getattr with a safety valve; it is called as:: >
values.ensure_value(attr, value)
<
If the ``attr`` attribute of ``values`` doesn't exist or is None, then
ensure_value() first sets it to ``value``, and then returns ``value``. This is
very handy for actions like ``"extend"``, ``"append"``, and ``"count"``, all
of which accumulate data in a variable and expect that variable to be of a
certain type (a list for the first two, an integer for the latter). Using
ensure_value means that scripts using your action don't have to worry
about setting a default value for the option destinations in question; they
can just leave the default as None and ensure_value will take care of
getting it right when it's needed.
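Putting the pieces together, here is a self-contained sketch that runs the
``--names`` command line from the text through an ``"extend"`` action and then
exercises ensure_value on its own. The class name ``ExtendOption`` is an
assumption used to keep this example separate from the MyOption class above.

```python
from optparse import Option, OptionParser, Values

class ExtendOption(Option):
    ACTIONS = Option.ACTIONS + ("extend",)
    STORE_ACTIONS = Option.STORE_ACTIONS + ("extend",)
    TYPED_ACTIONS = Option.TYPED_ACTIONS + ("extend",)
    ALWAYS_TYPED_ACTIONS = Option.ALWAYS_TYPED_ACTIONS + ("extend",)

    def take_action(self, action, dest, opt, value, values, parser):
        if action == "extend":
            # Create the list on first use, then grow it.
            values.ensure_value(dest, []).extend(value.split(","))
        else:
            Option.take_action(
                self, action, dest, opt, value, values, parser)

parser = OptionParser(option_class=ExtendOption)
parser.add_option("--names", action="extend", dest="names")
options, args = parser.parse_args(
    ["--names=foo,bar", "--names", "blah", "--names", "ding,dong"])

# ensure_value in isolation: the second call finds the attribute already
# set and returns the existing list instead of the new default.
values = Values()
first = values.ensure_value("names", [])
first.append("x")
second = values.ensure_value("names", ["ignored"])
```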
==============================================================================
*py2stdlib-os.path*
os.path~
:synopsis: Operations on pathnames.
This module implements some useful functions on pathnames. To read or
write files see open, and for accessing the filesystem see the
os (|py2stdlib-os|) module.
.. note::
On Windows, many of these functions do not properly support UNC pathnames.
splitunc and ismount do handle them correctly.
.. note::
Since different operating systems have different path name conventions, there
are several versions of this module in the standard library. The
os.path (|py2stdlib-os.path|) module is always the path module suitable for the operating
system Python is running on, and therefore usable for local paths. However,
you can also import and use the individual modules if you want to manipulate
a path that is always in one of the different formats. They all have the
same interface:
* posixpath for UNIX-style paths
* ntpath for Windows paths
* macpath (|py2stdlib-macpath|) for old-style MacOS paths
* os2emxpath for OS/2 EMX paths
abspath(path)~
Return a normalized absolutized version of the pathname {path}. On most
platforms, this is equivalent to ``normpath(join(os.getcwd(), path))``.
.. versionadded:: 1.5.2
basename(path)~
Return the base name of pathname {path}. This is the second half of the pair
returned by ``split(path)``. Note that the result of this function differs
from that of the Unix basename program: for ``'/foo/bar/'``, the program
returns ``'bar'``, whereas the basename function returns an empty string
(``''``).
commonprefix(list)~
Return the longest path prefix (taken character-by-character) that is a prefix
of all paths in {list}. If {list} is empty, return the empty string (``''``).
Note that this may return invalid paths because it works a character at a time.
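The character-by-character behavior is easy to see in a small sketch; the
paths are arbitrary examples.

```python
import os.path

paths = ["/usr/lib/python2.7", "/usr/local/lib"]
# The common prefix ends in the middle of a path component, so the
# result is not a directory that either path actually contains.
prefix = os.path.commonprefix(paths)

empty = os.path.commonprefix([])
```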
dirname(path)~
Return the directory name of pathname {path}. This is the first half of the
pair returned by ``split(path)``.
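The relationship between basename, dirname and split can be sketched with
posixpath, which gives the same results on every platform. Stripping the
trailing slash first is one possible way to approximate the Unix basename
program; it is an assumption of this example, not something os.path does.

```python
import posixpath

path = "/foo/bar/"
head, tail = posixpath.split(path)

directory = posixpath.dirname(path)     # first half of split(path)
name = posixpath.basename(path)         # second half; empty here

# Approximating the Unix basename program by dropping trailing slashes.
unix_like_name = posixpath.basename(path.rstrip("/"))
```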
exists(path)~
Return ``True`` if {path} refers to an existing path. Returns ``False`` for
broken symbolic links. On some platforms, this function may return ``False`` if
permission is not granted to execute os.stat on the requested file, even
if the {path} physically exists.
lexists(path)~
Return ``True`` if {path} refers to an existing path. Returns ``True`` for
broken symbolic links. Equivalent to exists on platforms lacking
os.lstat.
.. versionadded:: 2.4
expanduser(path)~
On Unix and Windows, return the argument with an initial component of ``~`` or
``~user`` replaced by that {user}'s home directory.
On Unix, an initial ``~`` is replaced by the environment variable HOME
if it is set; otherwise the current user's home directory is looked up in the
password directory through the built-in module pwd (|py2stdlib-pwd|). An initial ``~user``
is looked up directly in the password directory.
On Windows, HOME and USERPROFILE will be used if set,
otherwise a combination of HOMEPATH and HOMEDRIVE will be
used. An initial ``~user`` is handled by stripping the last directory component
from the created user path derived above.
If the expansion fails or if the path does not begin with a tilde, the path is
returned unchanged.
expandvars(path)~
Return the argument with environment variables expanded. Substrings of the form
``$name`` or ``${name}`` are replaced by the value of environment variable
{name}. Malformed variable names and references to non-existing variables are
left unchanged.
On Windows, ``%name%`` expansions are supported in addition to ``$name`` and
``${name}``.
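A hedged sketch of expanduser and expandvars using posixpath, so that the
Unix rules apply regardless of the host platform. The environment variable
``DEMO_DATA_DIR``, its value, and the home directory ``/home/demo`` are
assumptions set up only for this example; HOME is restored afterwards.

```python
import os
import posixpath

os.environ["DEMO_DATA_DIR"] = "/srv/data"        # hypothetical variable
os.environ.pop("DEMO_UNDEFINED_VARIABLE", None)  # make sure it is unset

expanded = posixpath.expandvars("$DEMO_DATA_DIR/in and ${DEMO_DATA_DIR}_old")
# References to variables that do not exist are left unchanged.
untouched = posixpath.expandvars("$DEMO_UNDEFINED_VARIABLE/x")

saved_home = os.environ.get("HOME")
os.environ["HOME"] = "/home/demo"                # hypothetical home directory
try:
    home_path = posixpath.expanduser("~/notes.txt")
    # A tilde that is not at the start of the path is not expanded.
    plain_path = posixpath.expanduser("notes/~draft")
finally:
    if saved_home is None:
        del os.environ["HOME"]
    else:
        os.environ["HOME"] = saved_home
```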
getatime(path)~
Return the time of last access of {path}. The return value is a number giving
the number of seconds since the epoch (see the time (|py2stdlib-time|) module). Raise
os.error if the file does not exist or is inaccessible.
.. versionadded:: 1.5.2
.. versionchanged:: 2.3
If os.stat_float_times returns True, the result is a floating point
number.
getmtime(path)~
Return the time of last modification of {path}. The return value is a number
giving the number of seconds since the epoch (see the time (|py2stdlib-time|) module).
Raise os.error if the file does not exist or is inaccessible.
.. versionadded:: 1.5.2
.. versionchanged:: 2.3
If os.stat_float_times returns True, the result is a floating point
number.
getctime(path)~
Return the system's ctime, which on some systems (like Unix) is the time of the
last metadata change and on others (like Windows) is the creation time for {path}.
The return value is a number giving the number of seconds since the epoch (see
the time (|py2stdlib-time|) module). Raise os.error if the file does not exist or
is inaccessible.
.. versionadded:: 2.3
getsize(path)~
Return the size, in bytes, of {path}. Raise os.error if the file does
not exist or is inaccessible.
.. versionadded:: 1.5.2
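The size and timestamp functions can be tried on a temporary file, as in the
sketch below. The file contents are arbitrary; the last step shows the
os.error raised for a path that no longer exists.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"hello")
    os.close(fd)
    size = os.path.getsize(path)      # number of bytes written
    mtime = os.path.getmtime(path)    # seconds since the epoch
    atime = os.path.getatime(path)
finally:
    os.remove(path)

# The file is gone now, so the query fails with os.error.
try:
    os.path.getsize(path)
    missing_raises = False
except os.error:
    missing_raises = True
```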
isabs(path)~
Return ``True`` if {path} is an absolute pathname. On Unix, that means it
begins with a slash; on Windows, that it begins with a (back)slash after chopping
off a potential drive letter.
isfile(path)~
Return ``True`` if {path} is an existing regular file. This follows symbolic
links, so both islink and isfile can be true for the same path.
isdir(path)~
Return ``True`` if {path} is an existing directory. This follows symbolic
links, so both islink and isdir can be true for the same path.
islink(path)~
Return ``True`` if {path} refers to a directory entry that is a symbolic link.
Always ``False`` if symbolic links are not supported.
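The following sketch exercises the existence and type tests on a temporary
directory. The file and link names are made up, and the symbolic-link part is
guarded because creating links may be unavailable or not permitted on some
platforms.

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()
try:
    file_path = os.path.join(root, "data.txt")
    with open(file_path, "w") as handle:
        handle.write("x")
    missing_path = os.path.join(root, "missing")
    link_path = os.path.join(root, "dangling")

    # (isfile, isdir) for a regular file and for a directory.
    file_checks = (os.path.isfile(file_path), os.path.isdir(file_path))
    dir_checks = (os.path.isfile(root), os.path.isdir(root))
    missing_exists = os.path.exists(missing_path)

    try:
        # A link whose target does not exist: a broken symbolic link.
        os.symlink(missing_path, link_path)
        link_checks = (os.path.islink(link_path),
                       os.path.exists(link_path),
                       os.path.lexists(link_path))
    except (AttributeError, NotImplementedError, OSError):
        link_checks = None    # symbolic links not available here
finally:
    shutil.rmtree(root)
```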
ismount(path)~
Return ``True`` if pathname {path} is a mount point: a point in a file
system where a different file system has been mounted. The function checks
whether {path}'s parent, path/.., is on a different device than {path},
or whether path/.. and {path} point to the same i-node on the same
device --- this should detect mount points for all Unix and POSIX variants.
join(path1[, path2[, ...]])~
Join one or more path components intelligently. If any component is an absolute
path, all previous components (on Windows, including the previous drive letter,
if there was one) are thrown away, and joining continues. The return value is
the concatenation of {path1}, and optionally {path2}, etc., with exactly one
directory separator (``os.sep``) inserted between components, unless {path2} is
empty. Note that on Windows, since there is a current directory for each drive,
``os.path.join("c:", "foo")`` represents a path relative to the current
directory on drive C: (c:foo), not c:\\foo.
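The joining rules can be sketched with posixpath and ntpath directly, which
makes the results independent of the platform running the example. The path
components are arbitrary.

```python
import ntpath
import posixpath

simple = posixpath.join("/usr", "lib", "python")
# An absolute component discards everything before it.
restarted = posixpath.join("/usr/lib", "/etc", "passwd")
# An empty last component leaves a trailing separator.
trailing = posixpath.join("dir", "")

# On Windows, a bare drive letter yields a drive-relative path.
drive_relative = ntpath.join("c:", "foo")
drive_absolute = ntpath.join("c:\\", "foo")
```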
normcase(path)~
Normalize the case of a pathname. On Unix and Mac OS X, this returns the
path unchanged; on case-insensitive filesystems, it converts the path to
lowercase. On Windows, it also converts forward slashes to backward slashes.
normpath(path)~
Normalize a pathname. This collapses redundant separators and up-level
references so that ``A//B``, ``A/./B`` and ``A/foo/../B`` all become ``A/B``.
It does not normalize the case (use normcase for that). On Windows, it
converts forward slashes to backward slashes. It should be understood that this
may change the meaning of the path if it contains symbolic links!
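A short sketch of normpath and normcase, again using posixpath and ntpath
explicitly so the output does not depend on the host system. The sample
paths are illustrative.

```python
import ntpath
import posixpath

collapsed = [posixpath.normpath(p)
             for p in ("A//B", "A/./B", "A/foo/../B")]

# The Windows flavor also converts forward slashes to backslashes.
windows_norm = ntpath.normpath("A/B//C")

# normcase: unchanged on POSIX, lowercased with backslashes on Windows.
posix_case = posixpath.normcase("/Home/User")
windows_case = ntpath.normcase("C:/Foo/Bar.TXT")
```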
realpath(path)~
Return the canonical path of the specified filename, eliminating any symbolic
links encountered in the path (if they are supported by the operating system).
.. versionadded:: 2.2
relpath(path[, start])~
Return a relative filepath to {path} either from the current directory or from
an optional {start} point.
{start} defaults to os.curdir.
Availability: Windows, Unix.
.. versionadded:: 2.6
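For illustration, relpath with an explicit {start} behaves as follows.
Absolute example paths and posixpath are used so the result does not depend
on the current directory or platform.

```python
import posixpath

sibling = posixpath.relpath("/usr/lib/python", "/usr/share")
below = posixpath.relpath("/a/b/c", "/a")
same = posixpath.relpath("/usr/lib", "/usr/lib")
```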
samefile(path1, path2)~
Return ``True`` if both pathname arguments refer to the same file or directory
(as indicated by device number and i-node number). Raise an exception if an
os.stat call on either pathname fails.
Availability: Unix.
sameopenfile(fp1, fp2)~
Return ``True`` if the file descriptors {fp1} and {fp2} refer to the same file.
Availability: Unix.
samestat(stat1, stat2)~
Return ``True`` if the stat tuples {stat1} and {stat2} refer to the same file.
These structures may have been returned by fstat, lstat, or
stat (|py2stdlib-stat|). This function implements the underlying comparison used by
samefile and sameopenfile.
Availability: Unix.
split(path)~
Split the pathname {path} into a pair, ``(head, tail)`` where {tail} is the last
pathname component and {head} is everything leading up to that. The {tail} part
will never contain a slash; if {path} ends in a slash, {tail} will be empty. If
there is no slash in {path}, {head} will be empty. If {path} is empty, both
{head} and {tail} are empty. Trailing slashes are stripped from {head} unless
it is the root (one or more slashes only). In nearly all cases, ``join(head,
tail)`` equals {path} (the only exception being when there were multiple slashes
separating {head} from {tail}).
splitdrive(path)~
Split the pathname {path} into a pair ``(drive, tail)`` where {drive} is either
a drive specification or the empty string. On systems which do not use drive
specifications, {drive} will always be the empty string. In all cases, ``drive
+ tail`` will be the same as {path}.
.. versionadded:: 1.3
splitext(path)~
Split the pathname {path} into a pair ``(root, ext)`` such that ``root + ext ==
path``, and {ext} is empty or begins with a period and contains at most one
period. Leading periods on the basename are ignored; ``splitext('.cshrc')``
returns ``('.cshrc', '')``.
.. versionchanged:: 2.6
Earlier versions could produce an empty root when the only period was the
first character.
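A brief sketch of how splitext treats periods, again using posixpath for platform-independent results. The file names are examples only.

```python
import posixpath

# Only the last period starts the extension.
ordinary = posixpath.splitext("archive.tar.gz")

# A leading period on the basename is not treated as an extension.
dotfile = posixpath.splitext(".cshrc")

# No period at all: the extension is empty.
no_ext = posixpath.splitext("Makefile")
```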
splitunc(path)~
Split the pathname {path} into a pair ``(unc, rest)`` so that {unc} is the UNC
mount point (such as ``r'\\host\mount'``), if present, and {rest} the rest of
the path (such as ``r'\path\file.ext'``). For paths containing drive letters,
{unc} will always be the empty string.
Availability: Windows.
walk(path, visit, arg)~
Calls the function {visit} with arguments ``(arg, dirname, names)`` for each
directory in the directory tree rooted at {path} (including {path} itself, if it
is a directory). The argument {dirname} specifies the visited directory, the
argument {names} lists the files in the directory (gotten from
``os.listdir(dirname)``). The {visit} function may modify {names} to influence
the set of directories visited below {dirname}, e.g. to avoid visiting certain
parts of the tree. (The object referred to by {names} must be modified in
place, using del or slice assignment.)
.. note:: >
Symbolic links to directories are not treated as subdirectories, so walk
will not visit them. To visit linked directories you must
identify them with ``os.path.islink(file)`` and ``os.path.isdir(file)``, and
invoke walk as necessary.
<
.. note::
This function is deprecated and has been removed in 3.0 in favor of
os.walk.
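Since this function is deprecated in favor of os.walk, the following sketch shows one way the same in-place pruning could look with os.walk. The directory names "keep" and "skip" are hypothetical and the tree is created only for the example.

```python
import os
import shutil
import tempfile

# Build a small throwaway tree: root/keep/a.txt and root/skip/b.txt
root = tempfile.mkdtemp()
for sub, name in (("keep", "a.txt"), ("skip", "b.txt")):
    os.mkdir(os.path.join(root, sub))
    open(os.path.join(root, sub, name), "w").close()

visited = []
for dirname, subdirs, files in os.walk(root):
    # Prune in place with slice assignment; rebinding the name would have
    # no effect on which directories are visited.
    subdirs[:] = [d for d in subdirs if d != "skip"]
    visited.extend(os.path.join(dirname, f) for f in files)

shutil.rmtree(root)
```

Only a.txt ends up in visited, because the "skip" directory is removed from the list before os.walk descends into it.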
supports_unicode_filenames~
True if arbitrary Unicode strings can be used as file names (within limitations
imposed by the file system), and if os.listdir returns Unicode strings
for a Unicode argument.
.. versionadded:: 2.3
==============================================================================
*py2stdlib-os*
os~
:synopsis: Miscellaneous operating system interfaces.
This module provides a portable way of using operating system dependent
functionality. If you just want to read or write a file see open, if
you want to manipulate paths, see the os.path (|py2stdlib-os.path|) module, and if you want to
read all the lines in all the files on the command line see the fileinput (|py2stdlib-fileinput|)
module. For creating temporary files and directories see the tempfile (|py2stdlib-tempfile|)
module, and for high-level file and directory handling see the shutil (|py2stdlib-shutil|)
module.
Notes on the availability of these functions:
* The design of all built-in operating system dependent modules of Python is
such that as long as the same functionality is available, it uses the same
interface; for example, the function ``os.stat(path)`` returns stat
information about {path} in the same format (which happens to have originated
with the POSIX interface).
* Extensions peculiar to a particular operating system are also available
through the os (|py2stdlib-os|) module, but using them is of course a threat to
portability.
* An "Availability: Unix" note means that this function is commonly found on
Unix systems. It does not make any claims about its existence on a specific
operating system.
* If not separately noted, all functions that claim "Availability: Unix" are
supported on Mac OS X, which builds on a Unix core.
.. note::
All functions in this module raise OSError in the case of invalid or
inaccessible file names and paths, or other arguments that have the correct
type, but are not accepted by the operating system.
error~
An alias for the built-in OSError exception.
name~
The name of the operating system dependent module imported. The following
names have currently been registered: ``'posix'``, ``'nt'``,
``'os2'``, ``'ce'``, ``'java'``, ``'riscos'``.
Process Parameters
------------------
These functions and data items provide information and operate on the current
process and user.
environ~
A mapping object representing the string environment. For example,
``environ['HOME']`` is the pathname of your home directory (on some platforms),
and is equivalent to ``getenv("HOME")`` in C.
This mapping is captured the first time the os (|py2stdlib-os|) module is imported,
typically during Python startup as part of processing site.py. Changes
to the environment made after this time are not reflected in ``os.environ``,
except for changes made by modifying ``os.environ`` directly.
If the platform supports the putenv function, this mapping may be used
to modify the environment as well as query the environment. putenv will
be called automatically when the mapping is modified.
.. note:: >
Calling putenv directly does not change ``os.environ``, so it's better
to modify ``os.environ``.
<
.. note::
On some platforms, including FreeBSD and Mac OS X, setting ``environ`` may
cause memory leaks. Refer to the system documentation for
putenv.
If putenv is not provided, a modified copy of this mapping may be
passed to the appropriate process-creation functions to cause child processes
to use a modified environment.
If the platform supports the unsetenv function, you can delete items in
this mapping to unset environment variables. unsetenv will be called
automatically when an item is deleted from ``os.environ``, and when
one of the pop or clear methods is called.
.. versionchanged:: 2.6
Also unset environment variables when calling os.environ.clear
and os.environ.pop.
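The behaviour described above can be sketched as follows. The variable name PY2STDLIB_DEMO is made up for the example, and the child process is simply another Python interpreter started through the subprocess module.

```python
import os
import subprocess
import sys

# Assigning to os.environ updates the mapping and, where supported, calls putenv.
os.environ["PY2STDLIB_DEMO"] = "hello"

# A child process started afterwards inherits the modified environment.
child_cmd = [
    sys.executable, "-c",
    "import os, sys; sys.stdout.write(os.environ.get('PY2STDLIB_DEMO', ''))",
]
seen_by_child = subprocess.Popen(child_cmd, stdout=subprocess.PIPE).communicate()[0]

# Deleting the item unsets the variable again (via unsetenv where supported).
del os.environ["PY2STDLIB_DEMO"]
still_set = "PY2STDLIB_DEMO" in os.environ
```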
chdir(path)~
fchdir(fd)
getcwd()
These functions are described in the Files and Directories section below.
ctermid()~
Return the filename corresponding to the controlling terminal of the process.
Availability: Unix.
getegid()~
Return the effective group id of the current process. This corresponds to the
"set id" bit on the file being executed in the current process.
Availability: Unix.
geteuid()~
Return the current process's effective user id.
Availability: Unix.
getgid()~
Return the real group id of the current process.
Availability: Unix.
getgroups()~
Return list of supplemental group ids associated with the current process.
Availability: Unix.
initgroups(username, gid)~
Call the system initgroups() to initialize the group access list with all of
the groups of which the specified username is a member, plus the specified
group id.
Availability: Unix.
.. versionadded:: 2.7
getlogin()~
Return the name of the user logged in on the controlling terminal of the
process. For most purposes, it is more useful to use the environment variable
LOGNAME to find out who the user is, or
``pwd.getpwuid(os.getuid())[0]`` to get the login name of the currently
effective user id.
Availability: Unix.
getpgid(pid)~
Return the process group id of the process with process id {pid}. If {pid} is 0,
the process group id of the current process is returned.
Availability: Unix.
.. versionadded:: 2.3
getpgrp()~
Return the id of the current process group.
Availability: Unix.
getpid()~
Return the current process id.
Availability: Unix, Windows.
getppid()~
Return the parent's process id.
Availability: Unix.
getresuid()~
Return a tuple (ruid, euid, suid) denoting the current process's
real, effective, and saved user ids.
Availability: Unix.
.. versionadded:: 2.7
getresgid()~
Return a tuple (rgid, egid, sgid) denoting the current process's
real, effective, and saved group ids.
Availability: Unix.
.. versionadded:: 2.7
getuid()~
Return the current process's user id.
Availability: Unix.
getenv(varname[, value])~
Return the value of the environment variable {varname} if it exists, or {value}
if it doesn't. {value} defaults to ``None``.
Availability: most flavors of Unix, Windows.
putenv(varname, value)~
Set the environment variable named {varname} to the string {value}. Such
changes to the environment affect subprocesses started with os.system,
popen or fork and execv.
Availability: most flavors of Unix, Windows.
.. note:: >
On some platforms, including FreeBSD and Mac OS X, setting ``environ`` may
cause memory leaks. Refer to the system documentation for putenv.
<
When putenv is supported, assignments to items in ``os.environ`` are
automatically translated into corresponding calls to putenv; however,
calls to putenv don't update ``os.environ``, so it is actually
preferable to assign to items of ``os.environ``.
setegid(egid)~
Set the current process's effective group id.
Availability: Unix.
seteuid(euid)~
Set the current process's effective user id.
Availability: Unix.
setgid(gid)~
Set the current process's group id.
Availability: Unix.
setgroups(groups)~
Set the list of supplemental group ids associated with the current process to
{groups}. {groups} must be a sequence, and each element must be an integer
identifying a group. This operation is typically available only to the superuser.
Availability: Unix.
.. versionadded:: 2.2
setpgrp()~
Call the system call setpgrp or setpgrp(0, 0) depending on
which version is implemented (if any). See the Unix manual for the semantics.
Availability: Unix.
setpgid(pid, pgrp)~
Call the system call setpgid to set the process group id of the
process with id {pid} to the process group with id {pgrp}. See the Unix manual
for the semantics.
Availability: Unix.
setregid(rgid, egid)~
Set the current process's real and effective group ids.
Availability: Unix.
setresgid(rgid, egid, sgid)~
Set the current process's real, effective, and saved group ids.
Availability: Unix.
.. versionadded:: 2.7
setresuid(ruid, euid, suid)~
Set the current process's real, effective, and saved user ids.
Availability: Unix.
.. versionadded:: 2.7
setreuid(ruid, euid)~
Set the current process's real and effective user ids.
Availability: Unix.
getsid(pid)~
Call the system call getsid. See the Unix manual for the semantics.
Availability: Unix.
.. versionadded:: 2.4
setsid()~
Call the system call setsid. See the Unix manual for the semantics.
Availability: Unix.
setuid(uid)~
Set the current process's user id.
Availability: Unix.
strerror(code)~
Return the error message corresponding to the error code in {code}.
On platforms where strerror returns ``NULL`` when given an unknown
error number, ValueError is raised.
Availability: Unix, Windows.
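A short sketch of strerror in use: the error code carried by an OSError is turned back into the system's message text. The file name is deliberately one that should not exist, and the exact wording of the message depends on the platform.

```python
import errno
import os

# Message text for a known error code.
message = os.strerror(errno.ENOENT)

# The same lookup applied to the errno attribute of a caught exception.
try:
    os.stat(os.path.join(os.curdir, "no-such-file-py2stdlib-demo"))
except OSError as exc:
    caught_code = exc.errno
    caught_message = os.strerror(exc.errno)
```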
umask(mask)~
Set the current numeric umask and return the previous umask.
Availability: Unix, Windows.
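Because umask returns the previous value, a temporary change can be undone by passing that value back. A minimal sketch, assuming a Unix-like system (the mask 077 is only an example, written with the 0o prefix accepted since Python 2.6):

```python
import os

# Install a restrictive mask and remember the one that was in effect.
previous = os.umask(0o077)

# ... create files that should be private to the owner here ...

# Restore the original mask; the call returns the mask being replaced.
restored_from = os.umask(previous)
```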
uname()~
Return a 5-tuple containing information identifying the current operating
system. The tuple contains 5 strings: ``(sysname, nodename, release, version,
machine)``. Some systems truncate the nodename to 8 characters or to the
leading component; a better way to get the hostname is
socket.gethostname or even
``socket.gethostbyaddr(socket.gethostname())``.
Availability: recent flavors of Unix.
unsetenv(varname)~
Unset (delete) the environment variable named {varname}. Such changes to the
environment affect subprocesses started with os.system, popen or
fork and execv.
When unsetenv is supported, deletion of items in ``os.environ`` is
automatically translated into a corresponding call to unsetenv; however,
calls to unsetenv don't update ``os.environ``, so it is actually
preferable to delete items of ``os.environ``.
Availability: most flavors of Unix, Windows.
File Object Creation
--------------------
These functions create new file objects. (See also open.)
fdopen(fd[, mode[, bufsize]])~
Return an open file object connected to the file descriptor {fd}. The {mode}
and {bufsize} arguments have the same meaning as the corresponding arguments to
the built-in open function.
Availability: Unix, Windows.
.. versionchanged:: 2.3
When specified, the {mode} argument must now start with one of the letters
``'r'``, ``'w'``, or ``'a'``, otherwise a ValueError is raised.
.. versionchanged:: 2.5
On Unix, when the {mode} argument starts with ``'a'``, the {O_APPEND} flag is
set on the file descriptor (which the fdopen implementation already
does on most platforms).
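As an illustration, the sketch below wraps both ends of a pipe in file objects with fdopen, so that the usual file methods can be used instead of os.read and os.write.

```python
import os

# os.pipe() returns two raw file descriptors.
r_fd, w_fd = os.pipe()

# Wrap them in file objects; the mode must start with 'r', 'w' or 'a'.
writer = os.fdopen(w_fd, "w")
reader = os.fdopen(r_fd, "r")

writer.write("hello\n")
writer.close()            # flushes the buffer and closes the descriptor

line = reader.readline()
reader.close()
```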
popen(command[, mode[, bufsize]])~
Open a pipe to or from {command}. The return value is an open file object
connected to the pipe, which can be read or written depending on whether {mode}
is ``'r'`` (default) or ``'w'``. The {bufsize} argument has the same meaning as
the corresponding argument to the built-in open function. The exit
status of the command (encoded in the format specified for wait) is
available as the return value of the file.close method of the file object,
except that when the exit status is zero (termination without errors), ``None``
is returned.
Availability: Unix, Windows.
.. deprecated:: 2.6
This function is obsolete. Use the subprocess (|py2stdlib-subprocess|) module. Check
especially the subprocess-replacements section.
.. versionchanged:: 2.0
This function worked unreliably under Windows in earlier versions of Python.
This was due to the use of the _popen function from the libraries
provided with Windows. Newer versions of Python do not use the broken
implementation from the Windows libraries.
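Since popen is obsolete, here is a rough sketch of reading a command's output and exit status with the subprocess module instead. The child command is just another Python interpreter chosen so the example is self-contained; it is not part of the original text.

```python
import subprocess
import sys

# Child that prints "42" and exits with status 3.
cmd = [sys.executable, "-c", "import sys; sys.stdout.write('42'); sys.exit(3)"]

proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
output = proc.communicate()[0]   # everything the child wrote to stdout
status = proc.returncode         # plain exit status, no wait()-style encoding
```

Unlike the value returned by closing a popen file object, returncode is the exit status itself and is 0 (not None) on success.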
tmpfile()~
Return a new file object opened in update mode (``w+b``). The file has no
directory entries associated with it and will be automatically deleted once
there are no file descriptors for the file.
Availability: Unix, Windows.
There are a number of different popen\* functions that provide slightly
different ways to create subprocesses.
.. deprecated:: 2.6
All of the popen\* functions are obsolete. Use the subprocess (|py2stdlib-subprocess|)
module.
For each of the popen\* variants, if {bufsize} is specified, it
specifies the buffer size for the I/O pipes. {mode}, if provided, should be the
string ``'b'`` or ``'t'``; on Windows this is needed to determine whether the
file objects should be opened in binary or text mode. The default value for
{mode} is ``'t'``.
Also, for each of these variants, on Unix, {cmd} may be a sequence, in which
case arguments will be passed directly to the program without shell intervention
(as with os.spawnv). If {cmd} is a string it will be passed to the shell
(as with os.system).
These methods do not make it possible to retrieve the exit status from the child
processes. The only way to control the input and output streams and also
retrieve the return codes is to use the subprocess (|py2stdlib-subprocess|) module.
For a discussion of possible deadlock conditions related to the use of these
functions, see popen2-flow-control.
popen2(cmd[, mode[, bufsize]])~
Execute {cmd} as a sub-process and return the file objects ``(child_stdin,
child_stdout)``.
.. deprecated:: 2.6
This function is obsolete. Use the subprocess (|py2stdlib-subprocess|) module. Check
especially the subprocess-replacements section.
Availability: Unix, Windows.
.. versionadded:: 2.0
popen3(cmd[, mode[, bufsize]])~
Execute {cmd} as a sub-process and return the file objects ``(child_stdin,
child_stdout, child_stderr)``.
.. deprecated:: 2.6
This function is obsolete. Use the subprocess (|py2stdlib-subprocess|) module. Check
especially the subprocess-replacements section.
Availability: Unix, Windows.
.. versionadded:: 2.0
popen4(cmd[, mode[, bufsize]])~
Execute {cmd} as a sub-process and return the file objects ``(child_stdin,
child_stdout_and_stderr)``.
.. deprecated:: 2.6
This function is obsolete. Use the subprocess (|py2stdlib-subprocess|) module. Check
especially the subprocess-replacements section.
Availability: Unix, Windows.
.. versionadded:: 2.0
(Note that ``child_stdin, child_stdout, and child_stderr`` are named from the
point of view of the child process, so {child_stdin} is the child's standard
input.)
This functionality is also available in the popen2 (|py2stdlib-popen2|) module using functions
of the same names, but the return values of those functions have a different
order.
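A possible subprocess-based replacement for the popen2 and popen3 style of use is sketched below. The child is a small Python one-liner invented for the example; communicate() feeds the input and collects both output streams, which avoids the deadlocks that can occur when the pipes are read and written by hand.

```python
import subprocess
import sys

# Child that upper-cases whatever it receives on standard input.
child = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]

proc = subprocess.Popen(child,
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)

# Send the input, close the child's stdin, and read stdout and stderr.
out, err = proc.communicate(b"hello")
exit_status = proc.returncode
```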
File Descriptor Operations
--------------------------
These functions operate on I/O streams referenced using file descriptors.
File descriptors are small integers corresponding to a file that has been opened
by the current process. For example, standard input is usually file descriptor
0, standard output is 1, and standard error is 2. Further files opened by a
process will then be assigned 3, 4, 5, and so forth. The name "file descriptor"
is slightly deceptive; on Unix platforms, sockets and pipes are also referenced
by file descriptors.
The file.fileno method can be used to obtain the file descriptor
associated with a file object when required. Note that using the file
descriptor directly will bypass the file object methods, ignoring aspects such
as internal buffering of data.
close(fd)~
Close file descriptor {fd}.
Availability: Unix, Windows.
.. note:: >
This function is intended for low-level I/O and must be applied to a file
descriptor as returned by os.open or pipe. To close a "file
object" returned by the built-in function open or by popen or
fdopen, use its file.close method.
<
closerange(fd_low, fd_high)~
Close all file descriptors from {fd_low} (inclusive) to {fd_high} (exclusive),
ignoring errors. Equivalent to:: >
    for fd in xrange(fd_low, fd_high):
        try:
            os.close(fd)
        except OSError:
            pass
<
Availability: Unix, Windows.
.. versionadded:: 2.6
dup(fd)~
Return a duplicate of file descriptor {fd}.
Availability: Unix, Windows.
dup2(fd, fd2)~
Duplicate file descriptor {fd} to {fd2}, closing the latter first if necessary.
Availability: Unix, Windows.
fchmod(fd, mode)~
Change the mode of the file given by {fd} to the numeric {mode}. See the docs
for chmod for possible values of {mode}.
Availability: Unix.
.. versionadded:: 2.6
fchown(fd, uid, gid)~
Change the owner and group id of the file given by {fd} to the numeric {uid}
and {gid}. To leave one of the ids unchanged, set it to -1.
Availability: Unix.
.. versionadded:: 2.6
fdatasync(fd)~
Force write of file with file descriptor {fd} to disk. Does not force update of
metadata.
Availability: Unix.
.. note::
This function is not available on MacOS.
fpathconf(fd, name)~
Return system configuration information relevant to an open file. {name}
specifies the configuration value to retrieve; it may be a string which is the
name of a defined system value; these names are specified in a number of
standards (POSIX.1, Unix 95, Unix 98, and others). Some platforms define
additional names as well. The names known to the host operating system are
given in the ``pathconf_names`` dictionary. For configuration variables not
included in that mapping, passing an integer for {name} is also accepted.
If {name} is a string and is not known, ValueError is raised. If a
specific value for {name} is not supported by the host system, even if it is
included in ``pathconf_names``, an OSError is raised with
errno.EINVAL for the error number.
Availability: Unix.
fstat(fd)~
Return status for file descriptor {fd}, like stat (|py2stdlib-stat|).
Availability: Unix, Windows.
fstatvfs(fd)~
Return information about the filesystem containing the file associated with file
descriptor {fd}, like statvfs (|py2stdlib-statvfs|).
Availability: Unix.
fsync(fd)~
Force write of file with file descriptor {fd} to disk. On Unix, this calls the
native fsync function; on Windows, the MS _commit function.
If you're starting with a Python file object {f}, first do ``f.flush()``, and
then do ``os.fsync(f.fileno())``, to ensure that all internal buffers associated
with {f} are written to disk.
Availability: Unix, and Windows starting in 2.2.3.
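The flush-then-fsync sequence described above might look like this. The temporary file is created only for the example.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
f = os.fdopen(fd, "w")

f.write("important data")
f.flush()               # push Python's internal buffer to the operating system
os.fsync(f.fileno())    # ask the operating system to write it to disk
f.close()

size = os.path.getsize(path)
os.remove(path)
```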
ftruncate(fd, length)~
Truncate the file corresponding to file descriptor {fd}, so that it is at most
{length} bytes in size.
Availability: Unix.
isatty(fd)~
Return ``True`` if the file descriptor {fd} is open and connected to a
tty(-like) device, else ``False``.
Availability: Unix.
lseek(fd, pos, how)~
Set the current position of file descriptor {fd} to position {pos}, modified
by {how}: SEEK_SET or ``0`` to set the position relative to the
beginning of the file; SEEK_CUR or ``1`` to set it relative to the
current position; os.SEEK_END or ``2`` to set it relative to the end of
the file.
Availability: Unix, Windows.
SEEK_SET~
SEEK_CUR
SEEK_END
Parameters to the lseek function. Their values are 0, 1, and 2,
respectively.
Availability: Windows, Unix.
.. versionadded:: 2.5
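A small sketch of the three {how} values, using a temporary file that holds the ten bytes "0123456789":

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"0123456789")

os.lseek(fd, 0, os.SEEK_SET)             # absolute: back to the beginning
first = os.read(fd, 3)                   # reads "012", position is now 3

os.lseek(fd, 2, os.SEEK_CUR)             # relative: skip two bytes, position 5
middle = os.read(fd, 2)                  # reads "56"

end_pos = os.lseek(fd, -1, os.SEEK_END)  # one byte before the end
last = os.read(fd, 1)                    # reads "9"

os.close(fd)
os.remove(path)
```

lseek returns the new absolute position, so end_pos is 9 here.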
open(file, flags[, mode])~
Open the file {file} and set various flags according to {flags} and possibly its
mode according to {mode}. The default {mode} is ``0777`` (octal), and the
current umask value is first masked out. Return the file descriptor for the
newly opened file.
For a description of the flag and mode values, see the C run-time documentation;
flag constants (like O_RDONLY and O_WRONLY) are defined in
this module too (see the ``open()`` flag constants below). In particular, on Windows adding
O_BINARY is needed to open files in binary mode.
Availability: Unix, Windows.
.. note:: >
This function is intended for low-level I/O. For normal usage, use the
built-in function open, which returns a "file object" with
file.read and file.write methods (and many more). To
wrap a file descriptor in a "file object", use fdopen.
<
openpty()~
Open a new pseudo-terminal pair. Return a pair of file descriptors ``(master,
slave)`` for the pty and the tty, respectively. For a (slightly) more portable
approach, use the pty (|py2stdlib-pty|) module.
Availability: some flavors of Unix.
pipe()~
Create a pipe. Return a pair of file descriptors ``(r, w)`` usable for reading
and writing, respectively.
Availability: Unix, Windows.
read(fd, n)~
Read at most {n} bytes from file descriptor {fd}. Return a string containing the
bytes read. If the end of the file referred to by {fd} has been reached, an
empty string is returned.
Availability: Unix, Windows.
.. note:: >
This function is intended for low-level I/O and must be applied to a file
descriptor as returned by os.open or pipe. To read a "file object"
returned by the built-in function open or by popen or
fdopen, or sys.stdin, use its file.read or
file.readline methods.
<
tcgetpgrp(fd)~
Return the process group associated with the terminal given by {fd} (an open
file descriptor as returned by os.open).
Availability: Unix.
tcsetpgrp(fd, pg)~
Set the process group associated with the terminal given by {fd} (an open file
descriptor as returned by os.open) to {pg}.
Availability: Unix.
ttyname(fd)~
Return a string which specifies the terminal device associated with
file descriptor {fd}. If {fd} is not associated with a terminal device, an
exception is raised.
Availability: Unix.
write(fd, str)~
Write the string {str} to file descriptor {fd}. Return the number of bytes
actually written.
Availability: Unix, Windows.
.. note:: >
This function is intended for low-level I/O and must be applied to a file
descriptor as returned by os.open or pipe. To write a "file
object" returned by the built-in function open or by popen or
fdopen, or sys.stdout or sys.stderr, use its
file.write method.
<
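To make the low-level calls concrete, this sketch sends a few bytes through a pipe using only file descriptors. Note that read returns an empty string once the write end has been closed and all data has been consumed.

```python
import os

r, w = os.pipe()

written = os.write(w, b"ping")   # number of bytes actually written
os.close(w)                      # closing the write end signals end of file

data = os.read(r, 1024)          # at most 1024 bytes; returns what is available
eof = os.read(r, 1024)           # empty string: end of file reached
os.close(r)
```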
``open()`` flag constants
The following constants are options for the {flags} parameter to the
os.open function. They can be combined using the bitwise OR operator
``|``. Some of them are not available on all platforms. For descriptions of
their availability and use, consult the open(2) manual page on Unix
or the MSDN documentation (http://msdn.microsoft.com/en-us/library/z0kc8e3z.aspx) on Windows.
O_RDONLY~
O_WRONLY
O_RDWR
O_APPEND
O_CREAT
O_EXCL
O_TRUNC
These constants are available on Unix and Windows.
O_DSYNC~
O_RSYNC
O_SYNC
O_NDELAY
O_NONBLOCK
O_NOCTTY
O_SHLOCK
O_EXLOCK
These constants are only available on Unix.
O_BINARY~
O_NOINHERIT
O_SHORT_LIVED
O_TEMPORARY
O_RANDOM
O_SEQUENTIAL
O_TEXT
These constants are only available on Windows.
O_ASYNC~
O_DIRECT
O_DIRECTORY
O_NOFOLLOW
O_NOATIME
These constants are GNU extensions and not present if they are not defined by
the C library.
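As an example of combining flags with the bitwise OR operator, the sketch below uses O_CREAT together with O_EXCL so that creation fails if the file already exists. The file name "lockfile" and the mode 0600 are illustrative choices, not something prescribed by the text above.

```python
import errno
import os
import tempfile

directory = tempfile.mkdtemp()
path = os.path.join(directory, "lockfile")

# Create the file for writing, but only if it does not exist yet.
flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
fd = os.open(path, flags, 0o600)
os.write(fd, b"owner\n")
os.close(fd)

# A second attempt with the same flags is expected to fail with EEXIST.
try:
    os.close(os.open(path, flags, 0o600))
    second_attempt = "created"
except OSError as exc:
    second_attempt = errno.errorcode[exc.errno]

os.remove(path)
os.rmdir(directory)
```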
Files and Directories
---------------------
access(path, mode)~
Use the real uid/gid to test for access to {path}. Note that most operations
will use the effective uid/gid, therefore this routine can be used in a
suid/sgid environment to test if the invoking user has the specified access to
{path}. {mode} should be F_OK to test the existence of {path}, or it
can be the inclusive OR of one or more of R_OK, W_OK, and
X_OK to test permissions. Return True if access is allowed,
False if not. See the Unix man page access(2) for more
information.
Availability: Unix, Windows.
.. note:: >
Using access to check if a user is authorized to e.g. open a file
before actually doing so using open creates a security hole,
because the user might exploit the short time interval between checking
and opening the file to manipulate it.
<
.. note::
I/O operations may fail even when access indicates that they would
succeed, particularly for operations on network filesystems which may have
permissions semantics beyond the usual POSIX permission-bit model.
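One common way to avoid the check-then-open race described in the first note is to attempt the operation and handle the failure. The helper read_if_allowed below is hypothetical and only sketches that approach.

```python
import errno
import os
import tempfile

def read_if_allowed(path):
    """Return the file's contents, or None if it is missing or unreadable."""
    try:
        f = open(path, "rb")          # no prior os.access() check
    except IOError as exc:
        if exc.errno in (errno.ENOENT, errno.EACCES):
            return None
        raise                          # unexpected errors are not hidden
    try:
        return f.read()
    finally:
        f.close()

# Usage: an existing file is read, a removed one yields None.
fd, existing = tempfile.mkstemp()
os.write(fd, b"data")
os.close(fd)
found = read_if_allowed(existing)
os.remove(existing)
gone = read_if_allowed(existing)
```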
F_OK~
Value to pass as the {mode} parameter of access to test the existence of
{path}.
R_OK~
Value to include in the {mode} parameter of access to test the
readability of {path}.
W_OK~
Value to include in the {mode} parameter of access to test the
writability of {path}.
X_OK~
Value to include in the {mode} parameter of access to determine if
{path} can be executed.
chdir(path)~
Change the current working directory to {path}.
Availability: Unix, Windows.
fchdir(fd)~
Change the current working directory to the directory represented by the file
descriptor {fd}. The descriptor must refer to an opened directory, not an open
file.
Availability: Unix.
.. versionadded:: 2.3
getcwd()~
Return a string representing the current working directory.
Availability: Unix, Windows.
getcwdu()~
Return a Unicode object representing the current working directory.
Availability: Unix, Windows.
.. versionadded:: 2.3
chflags(path, flags)~
Set the flags of {path} to the numeric {flags}. {flags} may take a combination
(bitwise OR) of the following values (as defined in the stat (|py2stdlib-stat|) module):
* ``UF_NODUMP``
* ``UF_IMMUTABLE``
* ``UF_APPEND``
* ``UF_OPAQUE``
* ``UF_NOUNLINK``
* ``SF_ARCHIVED``
* ``SF_IMMUTABLE``
* ``SF_APPEND``
* ``SF_NOUNLINK``
* ``SF_SNAPSHOT``
Availability: Unix.
.. versionadded:: 2.6
chroot(path)~
Change the root directory of the current process to {path}.
Availability: Unix.
.. versionadded:: 2.2
chmod(path, mode)~
Change the mode of {path} to the numeric {mode}. {mode} may take one of the
following values (as defined in the stat (|py2stdlib-stat|) module) or bitwise ORed
combinations of them:
* stat.S_ISUID
* stat.S_ISGID
* stat.S_ENFMT
* stat.S_ISVTX
* stat.S_IREAD
* stat.S_IWRITE
* stat.S_IEXEC
* stat.S_IRWXU
* stat.S_IRUSR
* stat.S_IWUSR
* stat.S_IXUSR
* stat.S_IRWXG
* stat.S_IRGRP
* stat.S_IWGRP
* stat.S_IXGRP
* stat.S_IRWXO
* stat.S_IROTH
* stat.S_IWOTH
* stat.S_IXOTH
Availability: Unix, Windows.
.. note:: >
Although Windows supports chmod, you can only set the file's read-only
flag with it (via the ``stat.S_IWRITE`` and ``stat.S_IREAD``
constants or a corresponding integer value). All other bits are
ignored.
<
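A brief sketch of chmod with constants from the stat module. It only toggles the owner's write permission, which is also the one bit that has an effect on Windows; the temporary file exists only for the example.

```python
import os
import stat
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

# Read-only for everyone (0444): no write bits are set.
os.chmod(path, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
read_only = not (os.stat(path).st_mode & stat.S_IWUSR)

# Give the owner read and write permission again (0600) so it can be removed.
os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
writable = bool(os.stat(path).st_mode & stat.S_IWUSR)

os.remove(path)
```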
chown(path, uid, gid)~
Change the owner and group id of {path} to the numeric {uid} and {gid}. To leave
one of the ids unchanged, set it to -1.
Availability: Unix.
lchflags(path, flags)~
Set the flags of {path} to the numeric {flags}, like chflags, but do not
follow symbolic links.
Availability: Unix.
.. versionadded:: 2.6
lchmod(path, mode)~
Change the mode of {path} to the numeric {mode}. If path is a symlink, this
affects the symlink rather than the target. See the docs for chmod
for possible values of {mode}.
Availability: Unix.
.. versionadded:: 2.6
lchown(path, uid, gid)~
Change the owner and group id of {path} to the numeric {uid} and {gid}. This
function will not follow symbolic links.
Availability: Unix.
.. versionadded:: 2.3
link(source, link_name)~
Create a hard link pointing to {source} named {link_name}.
Availability: Unix.
listdir(path)~
Return a list containing the names of the entries in the directory given by
{path}. The list is in arbitrary order. It does not include the special
entries ``'.'`` and ``'..'`` even if they are present in the
directory.
Availability: Unix, Windows.
.. versionchanged:: 2.3
On Windows NT/2k/XP and Unix, if {path} is a Unicode object, the result will be
a list of Unicode objects. Undecodable filenames will still be returned as
string objects.
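Because the order of the returned list is arbitrary, callers that need a stable order usually sort it themselves. A minimal sketch with two made-up file names:

```python
import os
import shutil
import tempfile

directory = tempfile.mkdtemp()
for name in ("b.txt", "a.txt"):
    open(os.path.join(directory, name), "w").close()

# listdir gives bare names (no directory prefix, no '.' or '..').
names = sorted(os.listdir(directory))

shutil.rmtree(directory)
```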
lstat(path)~
Like stat (|py2stdlib-stat|), but do not follow symbolic links. This is an alias for
stat (|py2stdlib-stat|) on platforms that do not support symbolic links, such as
Windows.
mkfifo(path[, mode])~
Create a FIFO (a named pipe) named {path} with numeric mode {mode}. The default
{mode} is ``0666`` (octal). The current umask value is first masked out from
the mode.
Availability: Unix.
FIFOs are pipes that can be accessed like regular files. FIFOs exist until they
are deleted (for example with os.unlink). Generally, FIFOs are used as
rendezvous between "client" and "server" type processes: the server opens the
FIFO for reading, and the client opens it for writing. Note that mkfifo
doesn't open the FIFO --- it just creates the rendezvous point.
mknod(filename[, mode=0600, device])~
Create a filesystem node (file, device special file or named pipe) named
{filename}. {mode} specifies both the permissions to use and the type of node to
be created, being combined (bitwise OR) with one of ``stat.S_IFREG``,
``stat.S_IFCHR``, ``stat.S_IFBLK``,
and ``stat.S_IFIFO`` (those constants are available in stat (|py2stdlib-stat|)).
For ``stat.S_IFCHR`` and
``stat.S_IFBLK``, {device} defines the newly created device special file (probably using
os.makedev), otherwise it is ignored.
.. versionadded:: 2.3
major(device)~
Extract the device major number from a raw device number (usually the
st_dev or st_rdev field from stat (|py2stdlib-stat|)).
.. versionadded:: 2.3
minor(device)~
Extract the device minor number from a raw device number (usually the
st_dev or st_rdev field from stat (|py2stdlib-stat|)).
.. versionadded:: 2.3
makedev(major, minor)~
Compose a raw device number from the major and minor device numbers.
.. versionadded:: 2.3
mkdir(path[, mode])~
Create a directory named {path} with numeric mode {mode}. The default {mode} is
``0777`` (octal). On some systems, {mode} is ignored. Where it is used, the
current umask value is first masked out. If the directory already exists,
OSError is raised.
It is also possible to create temporary directories; see the
tempfile (|py2stdlib-tempfile|) module's tempfile.mkdtemp function.
Availability: Unix, Windows.
makedirs(path[, mode])~
Recursive directory creation function. Like mkdir, but makes all
intermediate-level directories needed to contain the leaf directory. Raises an
error exception if the leaf directory already exists or cannot be
created. The default {mode} is ``0777`` (octal). On some systems, {mode} is
ignored. Where it is used, the current umask value is first masked out.
.. note:: >
makedirs will become confused if the path elements to create include
os.pardir.
<
.. versionadded:: 1.5.2
.. versionchanged:: 2.3
This function now handles UNC paths correctly.
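The sketch below creates a nested path in one call and shows the error raised when the leaf directory already exists. The directory names are arbitrary.

```python
import errno
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
target = os.path.join(base, "a", "b", "c")

os.makedirs(target)              # creates a, a/b and a/b/c
created = os.path.isdir(target)

# Calling it again fails because the leaf directory now exists.
try:
    os.makedirs(target)
    second_call = "no error"
except OSError as exc:
    second_call = errno.errorcode[exc.errno]

shutil.rmtree(base)
```

Code that wants "create if missing" behaviour can catch the exception and ignore it when exc.errno is errno.EEXIST.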
pathconf(path, name)~
Return system configuration information relevant to a named file. {name}
specifies the configuration value to retrieve; it may be a string which is the
name of a defined system value; these names are specified in a number of
standards (POSIX.1, Unix 95, Unix 98, and others). Some platforms define
additional names as well. The names known to the host operating system are
given in the ``pathconf_names`` dictionary. For configuration variables not
included in that mapping, passing an integer for {name} is also accepted.
If {name} is a string and is not known, ValueError is raised. If a
specific value for {name} is not supported by the host system, even if it is
included in ``pathconf_names``, an OSError is raised with
errno.EINVAL for the error number.
Availability: Unix.
pathconf_names~
Dictionary mapping names accepted by pathconf and fpathconf to
the integer values defined for those names by the host operating system. This
can be used to determine the set of names known to the system. Availability:
Unix.
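The two failure modes described for pathconf (an unknown name and a known but
unsupported name) can be handled as in the sketch below. ``PC_NAME_MAX`` is
assumed to be one of the names the host system defines; the helper name
``name_max`` is hypothetical.

```python
import os

def name_max(path):
    """Return the PC_NAME_MAX value for path, or None if it is unavailable."""
    if 'PC_NAME_MAX' not in os.pathconf_names:
        return None                    # name not known on this platform
    try:
        return os.pathconf(path, 'PC_NAME_MAX')
    except OSError:
        return None                    # known name, but unsupported here

limit = name_max('.')
print("PC_NAME_MAX for the current directory: %r" % (limit,))
```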
readlink(path)~
Return a string representing the path to which the symbolic link points. The
result may be either an absolute or relative pathname; if it is relative, it may
be converted to an absolute pathname using ``os.path.join(os.path.dirname(path),
result)``.
.. versionchanged:: 2.6
If the {path} is a Unicode object the result will also be a Unicode object.
Availability: Unix.
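The conversion of a relative result mentioned above can be sketched as
follows. The helper name ``resolve_link`` and the file names are assumptions
made for the example.

```python
import os
import shutil
import tempfile

def resolve_link(path):
    """Return the link target, joined to the link's directory if relative."""
    result = os.readlink(path)
    if not os.path.isabs(result):
        # A relative target is interpreted relative to the link's directory.
        result = os.path.join(os.path.dirname(path), result)
    return os.path.normpath(result)

base = tempfile.mkdtemp()
target = os.path.join(base, 'data.txt')
open(target, 'w').close()
link = os.path.join(base, 'alias')
os.symlink('data.txt', link)           # relative link, stored as 'data.txt'
resolved = resolve_link(link)
shutil.rmtree(base)
```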
remove(path)~
Remove (delete) the file {path}. If {path} is a directory, OSError is
raised; see rmdir below to remove a directory. This is identical to
the unlink function documented below. On Windows, attempting to
remove a file that is in use causes an exception to be raised; on Unix, the
directory entry is removed but the storage allocated to the file is not made
available until the original file is no longer in use.
Availability: Unix, Windows.
removedirs(path)~
Remove directories recursively. Works like rmdir except that, if the
leaf directory is successfully removed, removedirs tries to
successively remove every parent directory mentioned in {path} until an error
is raised (which is ignored, because it generally means that a parent directory
is not empty). For example, ``os.removedirs('foo/bar/baz')`` will first remove
the directory ``'foo/bar/baz'``, and then remove ``'foo/bar'`` and ``'foo'`` if
they are empty. Raises OSError if the leaf directory could not be
successfully removed.
.. versionadded:: 1.5.2
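The pruning behaviour can be observed with a small experiment. This is only a
sketch: the directory names mirror the ``foo/bar/baz`` example above, and the
file ``keep.txt`` is an assumption added to keep ``foo`` non-empty.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, 'foo', 'bar', 'baz'))
# Keep 'foo' non-empty so that the upward pruning stops there.
open(os.path.join(base, 'foo', 'keep.txt'), 'w').close()

os.removedirs(os.path.join(base, 'foo', 'bar', 'baz'))

bar_gone = not os.path.exists(os.path.join(base, 'foo', 'bar'))
foo_kept = os.path.isdir(os.path.join(base, 'foo'))

shutil.rmtree(base)                    # clean up the scratch area
```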
rename(src, dst)~
Rename the file or directory {src} to {dst}. If {dst} is a directory,
OSError will be raised. On Unix, if {dst} exists and is a file, it will
be replaced silently if the user has permission. The operation may fail on some
Unix flavors if {src} and {dst} are on different filesystems. If successful,
the renaming will be an atomic operation (this is a POSIX requirement). On
Windows, if {dst} already exists, OSError will be raised even if it is a
file; there may be no way to implement an atomic rename when {dst} names an
existing file.
Availability: Unix, Windows.
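On Unix, the silent and atomic replacement described above is commonly used to
update a file so that readers never see a half-written version. The sketch
below assumes Unix semantics (on Windows the second call would raise OSError
because the destination exists); the helper name ``replace_contents`` is
hypothetical.

```python
import os
import shutil
import tempfile

def replace_contents(path, data):
    """Write data to a temporary file, then rename it over path (Unix)."""
    directory = os.path.dirname(path) or '.'
    # Same directory as the target, so both names are on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'w') as tmp:
            tmp.write(data)
        os.rename(tmp_path, path)      # replaces an existing file on Unix
    except Exception:
        os.remove(tmp_path)
        raise

base = tempfile.mkdtemp()
target = os.path.join(base, 'settings.txt')
replace_contents(target, 'first\n')
replace_contents(target, 'second\n')
with open(target) as f:
    final = f.read()
shutil.rmtree(base)
```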
renames(old, new)~
Recursive directory or file renaming function. Works like rename, except
creation of any intermediate directories needed to make the new pathname good is
attempted first. After the rename, directories corresponding to rightmost path
segments of the old name will be pruned away using removedirs.
.. versionadded:: 1.5.2
.. note:: >
This function can fail with the new directory structure made if you lack
permissions needed to remove the leaf directory or file.
<
rmdir(path)~
Remove (delete) the directory {path}. This only works when the directory is
empty; otherwise, OSError is raised. In order to remove whole
directory trees, shutil.rmtree can be used.
Availability: Unix, Windows.
stat(path)~
Perform a stat (|py2stdlib-stat|) system call on the given path. The return value is an
object whose attributes correspond to the members of the stat (|py2stdlib-stat|)
structure, namely: st_mode (protection bits), st_ino (inode
number), st_dev (device), st_nlink (number of hard links),
st_uid (user id of owner), st_gid (group id of owner),
st_size (size of file, in bytes), st_atime (time of most recent
access), st_mtime (time of most recent content modification),
st_ctime (platform dependent; time of most recent metadata change on
Unix, or the time of creation on Windows):: >
>>> import os
>>> statinfo = os.stat('somefile.txt')
>>> statinfo
(33188, 422511L, 769L, 1, 1032, 100, 926L, 1105022698, 1105022732, 1105022732)
>>> statinfo.st_size
926L
>>>
<
.. versionchanged:: 2.3
If stat_float_times returns ``True``, the time values are floats, measuring
seconds. Fractions of a second may be reported if the system supports that. On
Mac OS, the times are always floats. See stat_float_times for further
discussion.
On some Unix systems (such as Linux), the following attributes may also be
available: st_blocks (number of blocks allocated for file),
st_blksize (filesystem blocksize), st_rdev (type of device if an
inode device), st_flags (user defined flags for file).
On other Unix systems (such as FreeBSD), the following attributes may be
available (but may be only filled out if root tries to use them): st_gen
(file generation number), st_birthtime (time of file creation).
On Mac OS systems, the following attributes may also be available:
st_rsize, st_creator, st_type.
On RISCOS systems, the following attributes are also available: st_ftype
(file type), st_attrs (attributes), st_obtype (object type).
For backward compatibility, the return value of stat (|py2stdlib-stat|) is also accessible
as a tuple of at least 10 integers giving the most important (and portable)
members of the stat (|py2stdlib-stat|) structure, in the order st_mode,
st_ino, st_dev, st_nlink, st_uid,
st_gid, st_size, st_atime, st_mtime,
st_ctime. More items may be added at the end by some implementations.
The standard module stat (|py2stdlib-stat|) defines functions and constants that are useful
for extracting information from a stat (|py2stdlib-stat|) structure. (On Windows, some
items are filled with dummy values.)
.. note:: >
The exact meaning and resolution of the st_atime, st_mtime, and
st_ctime members depends on the operating system and the file system.
For example, on Windows systems using the FAT or FAT32 file systems,
st_mtime has 2-second resolution, and st_atime has only 1-day
resolution. See your operating system documentation for details.
<
Availability: Unix, Windows.
.. versionchanged:: 2.2
Added access to values as attributes of the returned object.
.. versionchanged:: 2.5
Added st_gen and st_birthtime.
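As a small sketch of how the stat module's helpers can be combined with the
attributes of the result, the function below classifies a path and reports
its permission bits and size. The helper name ``describe`` and the temporary
file are assumptions made for the example.

```python
import os
import stat
import tempfile

def describe(path):
    """Return (kind, permission bits, size in bytes) for path."""
    info = os.stat(path)
    if stat.S_ISDIR(info.st_mode):
        kind = 'directory'
    elif stat.S_ISREG(info.st_mode):
        kind = 'regular file'
    else:
        kind = 'other'
    # S_IMODE() keeps only the permission bits of st_mode.
    return kind, stat.S_IMODE(info.st_mode), info.st_size

fd, name = tempfile.mkstemp()
os.write(fd, b'hello')
os.close(fd)
summary = describe(name)
os.remove(name)
print("%s, mode %o, %d bytes" % summary)
```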
stat_float_times([newvalue])~
Determine whether stat_result represents time stamps as float objects.
If {newvalue} is ``True``, future calls to stat (|py2stdlib-stat|) return floats; if it is
``False``, future calls return ints. If {newvalue} is omitted, return the
current setting.
For compatibility with older Python versions, accessing stat_result as
a tuple always returns integers.
.. versionchanged:: 2.5
Python now returns float values by default. Applications which do not work
correctly with floating point time stamps can use this function to restore the
old behaviour.
The resolution of the timestamps (that is, the smallest possible fraction)
depends on the system. Some systems only support second resolution; on these
systems, the fraction will always be zero.
It is recommended that this setting is only changed at program startup time in
the {__main__} module; libraries should never change this setting. If an
application uses a library that works incorrectly if floating point time stamps
are processed, this application should turn the feature off until the library
has been corrected.
statvfs(path)~
Perform a statvfs (|py2stdlib-statvfs|) system call on the given path. The return value is
an object whose attributes describe the filesystem on the given path, and
correspond to the members of the statvfs (|py2stdlib-statvfs|) structure, namely:
f_bsize, f_frsize, f_blocks, f_bfree,
f_bavail, f_files, f_ffree, f_favail,
f_flag, f_namemax.
For backward compatibility, the return value is also accessible as a tuple whose
values correspond to the attributes, in the order given above. The standard
module statvfs (|py2stdlib-statvfs|) defines constants that are useful for extracting
information from a statvfs (|py2stdlib-statvfs|) structure when accessing it as a sequence;
this remains useful when writing code that needs to work with versions of Python
that don't support accessing the fields as attributes.
Availability: Unix.
.. versionchanged:: 2.2
Added access to values as attributes of the returned object.
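One common use of the result is estimating free space. The sketch below relies
on the meaning given to the fields by the C statvfs interface (f_bavail is the
number of free blocks available to unprivileged users, counted in units of
f_frsize); that interpretation is not spelled out above and should be checked
against the system's own documentation. The helper name ``free_bytes`` is
hypothetical.

```python
import os

def free_bytes(path):
    """Return the bytes available to unprivileged users on path's filesystem."""
    vfs = os.statvfs(path)
    # f_bavail counts free blocks in units of f_frsize (the fragment size).
    return vfs.f_bavail * vfs.f_frsize

available = free_bytes('.')
print("%d bytes available" % available)
```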
symlink(source, link_name)~
Create a symbolic link pointing to {source} named {link_name}.
Availability: Unix.
tempnam([dir[, prefix]])~
Return a unique path name that is reasonable for creating a temporary file.
This will be an absolute path that names a potential directory entry in the
directory {dir} or a common location for temporary files if {dir} is omitted or
``None``. If given and not ``None``, {prefix} is used to provide a short prefix
to the filename. Applications are responsible for properly creating and
managing files created using paths returned by tempnam; no automatic
cleanup is provided. On Unix, the environment variable TMPDIR
overrides {dir}, while on Windows TMP is used. The specific
behavior of this function depends on the C library implementation; some aspects
are underspecified in system documentation.
.. warning:: >
Use of tempnam is vulnerable to symlink attacks; consider using
tmpfile (section os-newstreams) instead.
<
Availability: Unix, Windows.
tmpnam()~
Return a unique path name that is reasonable for creating a temporary file.
This will be an absolute path that names a potential directory entry in a common
location for temporary files. Applications are responsible for properly
creating and managing files created using paths returned by tmpnam; no
automatic cleanup is provided.
.. warning:: >
Use of tmpnam is vulnerable to symlink attacks; consider using
tmpfile (section os-newstreams) instead.
<
Availability: Unix, Windows. This function probably shouldn't be used on
Windows, though: Microsoft's implementation of tmpnam always creates a
name in the root directory of the current drive, and that's generally a poor
location for a temp file (depending on privileges, you may not even be able to
open a file using this name).
TMP_MAX~
The maximum number of unique names that tmpnam will generate before
reusing names.
unlink(path)~
Remove (delete) the file {path}. This is the same function as
remove; the unlink name is its traditional Unix
name.
Availability: Unix, Windows.
utime(path, times)~
Set the access and modified times of the file specified by {path}. If {times}
is ``None``, then the file's access and modified times are set to the current
time. (The effect is similar to running the Unix program touch on
the path.) Otherwise, {times} must be a 2-tuple of numbers, of the form
``(atime, mtime)`` which is used to set the access and modified times,
respectively. Whether a directory can be given for {path} depends on whether
the operating system implements directories as files (for example, Windows
does not). Note that the exact times you set here may not be returned by a
subsequent stat (|py2stdlib-stat|) call, depending on the resolution with which your
operating system records access and modification times; see stat (|py2stdlib-stat|).
.. versionchanged:: 2.0
Added support for ``None`` for {times}.
Availability: Unix, Windows.
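Both forms of the {times} argument are shown in the sketch below. The
timestamp values are arbitrary whole seconds chosen for the example, so that
they are not affected by coarse timestamp resolution on most file systems.

```python
import os
import tempfile

fd, name = tempfile.mkstemp()
os.close(fd)

# Explicit (atime, mtime) pair, given in seconds since the epoch.
os.utime(name, (1000000000, 1100000000))
info = os.stat(name)
stamps = (int(info.st_atime), int(info.st_mtime))

# None behaves like "touch": both times are set to the current time.
os.utime(name, None)
touched = os.stat(name).st_mtime > 1100000000

os.remove(name)
```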
walk(top[, topdown=True [, onerror=None[, followlinks=False]]])~
Generate the file names in a directory tree by walking the tree
either top-down or bottom-up. For each directory in the tree rooted at directory
{top} (including {top} itself), it yields a 3-tuple ``(dirpath, dirnames,
filenames)``.
{dirpath} is a string, the path to the directory. {dirnames} is a list of the
names of the subdirectories in {dirpath} (excluding ``'.'`` and ``'..'``).
{filenames} is a list of the names of the non-directory files in {dirpath}.
Note that the names in the lists contain no path components. To get a full path
(which begins with {top}) to a file or directory in {dirpath}, do
``os.path.join(dirpath, name)``.
If optional argument {topdown} is ``True`` or not specified, the triple for a
directory is generated before the triples for any of its subdirectories
(directories are generated top-down). If {topdown} is ``False``, the triple for a
directory is generated after the triples for all of its subdirectories
(directories are generated bottom-up).
When {topdown} is ``True``, the caller can modify the {dirnames} list in-place
(perhaps using del or slice assignment), and walk will only
recurse into the subdirectories whose names remain in {dirnames}; this can be
used to prune the search, impose a specific order of visiting, or even to inform
walk about directories the caller creates or renames before it resumes
walk again. Modifying {dirnames} when {topdown} is ``False`` is
ineffective, because in bottom-up mode the directories in {dirnames} are
generated before {dirpath} itself is generated.
By default errors from the listdir call are ignored. If optional
argument {onerror} is specified, it should be a function; it will be called with
one argument, an OSError instance. It can report the error to continue
with the walk, or raise the exception to abort the walk. Note that the filename
is available as the ``filename`` attribute of the exception object.
By default, walk will not walk down into symbolic links that resolve to
directories. Set {followlinks} to ``True`` to visit directories pointed to by
symlinks, on systems that support them.
.. versionadded:: 2.6
The {followlinks} parameter.
.. note:: >
Be aware that setting {followlinks} to ``True`` can lead to infinite recursion if a
link points to a parent directory of itself. walk does not keep track of
the directories it visited already.
<
.. note::
If you pass a relative pathname, don't change the current working directory
between resumptions of walk. walk never changes the current
directory, and assumes that its caller doesn't either.
This example displays the number of bytes taken by non-directory files in each
directory under the starting directory, except that it doesn't look under any
CVS subdirectory:: >
import os
from os.path import join, getsize
for root, dirs, files in os.walk('python/Lib/email'):
    print root, "consumes",
    print sum(getsize(join(root, name)) for name in files),
    print "bytes in", len(files), "non-directory files"
    if 'CVS' in dirs:
        dirs.remove('CVS')  # don't visit CVS directories
<
In the next example, walking the tree bottom-up is essential: rmdir
doesn't allow deleting a directory before the directory is empty:: >
# Delete everything reachable from the directory named in "top",
# assuming there are no symbolic links.
# CAUTION: This is dangerous! For example, if top == '/', it
# could delete all your disk files.
import os
for root, dirs, files in os.walk(top, topdown=False):
    for name in files:
        os.remove(os.path.join(root, name))
    for name in dirs:
        os.rmdir(os.path.join(root, name))
<
.. versionadded:: 2.3
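The {onerror} hook described above can be used to record directories that
could not be listed instead of silently skipping them. This is a minimal
sketch; the helper name ``walk_collecting_errors`` and the directory names are
assumptions made for the example.

```python
import os
import shutil
import tempfile

def walk_collecting_errors(top):
    """Walk top, returning (visited directories, paths that raised errors)."""
    failed = []

    def remember(exc):
        # exc is the OSError from listing a directory; filename is its path.
        failed.append(exc.filename)

    visited = [dirpath for dirpath, dirnames, filenames
               in os.walk(top, onerror=remember)]
    return visited, failed

base = tempfile.mkdtemp()
os.mkdir(os.path.join(base, 'sub'))
visited, failed = walk_collecting_errors(base)
missing_visited, missing_failed = walk_collecting_errors(
    os.path.join(base, 'no-such-directory'))
shutil.rmtree(base)
```

Raising the exception inside the callback, rather than recording it, would
abort the walk.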
Process Management
------------------
These functions may be used to create and manage processes.
The various exec\* functions take a list of arguments for the new
program loaded into the process. In each case, the first of these arguments is
passed to the new program as its own name rather than as an argument a user may
have typed on a command line. For the C programmer, this is the ``argv[0]``
passed to a program's main. For example, ``os.execv('/bin/echo',
['foo', 'bar'])`` will only print ``bar`` on standard output; ``foo`` will seem
to be ignored.
abort()~
Generate a SIGABRT signal to the current process. On Unix, the default
behavior is to produce a core dump; on Windows, the process immediately returns
an exit code of ``3``. Be aware that programs which use signal.signal
to register a handler for SIGABRT will behave differently.
Availability: Unix, Windows.
execl(path, arg0, arg1, ...)~
execle(path, arg0, arg1, ..., env)
execlp(file, arg0, arg1, ...)
execlpe(file, arg0, arg1, ..., env)
execv(path, args)
execve(path, args, env)
execvp(file, args)
execvpe(file, args, env)
These functions all execute a new program, replacing the current process; they
do not return. On Unix, the new executable is loaded into the current process,
and will have the same process id as the caller. Errors will be reported as
OSError exceptions.
The current process is replaced immediately. Open file objects and
descriptors are not flushed, so if there may be data buffered
on these open files, you should flush them using
sys.stdout.flush or os.fsync before calling an
exec\* function.
The "l" and "v" variants of the exec\* functions differ in how
command-line arguments are passed. The "l" variants are perhaps the easiest
to work with if the number of parameters is fixed when the code is written; the
individual parameters simply become additional parameters to the execl\*
functions. The "v" variants are good when the number of parameters is
variable, with the arguments being passed in a list or tuple as the {args}
parameter. In either case, the arguments to the child process should start with
the name of the command being run, but this is not enforced.
The variants which include a "p" near the end (execlp,
execlpe, execvp, and execvpe) will use the
PATH environment variable to locate the program {file}. When the
environment is being replaced (using one of the exec\*e variants,
discussed in the next paragraph), the new environment is used as the source of
the PATH variable. The other variants, execl, execle,
execv, and execve, will not use the PATH variable to
locate the executable; {path} must contain an appropriate absolute or relative
path.
For execle, execlpe, execve, and execvpe (note
that these all end in "e"), the {env} parameter must be a mapping which is
used to define the environment variables for the new process (these are used
instead of the current process' environment); the functions execl,
execlp, execv, and execvp all cause the new process to
inherit the environment of the current process.
Availability: Unix, Windows.
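Since the exec functions do not return on success, on Unix they are typically
paired with fork so that the parent survives. The sketch below assumes a Unix
system with a shell at ``/bin/sh``; the helper name ``run_and_wait`` is
hypothetical.

```python
import os
import sys

def run_and_wait(path, args):
    """Fork, exec path with args in the child, and return the wait status."""
    sys.stdout.flush()                 # buffered data is not flushed by exec
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image. execv only comes back by
        # raising an exception, in which case exit without any cleanup.
        try:
            os.execv(path, args)
        finally:
            os._exit(127)
    # Parent: block until the child has finished.
    _, status = os.waitpid(pid, 0)
    return status

# args[0] is the name the new program sees as its own, not a real argument.
status = run_and_wait('/bin/sh', ['sh', '-c', 'exit 7'])
exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
```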
_exit(n)~
Exit to the system with status {n}, without calling cleanup handlers, flushing
stdio buffers, etc.
Availability: Unix, Windows.
.. note:: >
The standard way to exit is ``sys.exit(n)``. _exit should normally only
be used in the child process after a fork.
<
The following exit codes are defined and can be used with _exit,
although they are not required. These are typically used for system programs
written in Python, such as a mail server's external command delivery program.
.. note::
Some of these may not be available on all Unix platforms, since there is some
variation. These constants are defined where they are defined by the underlying
platform.
EX_OK~
Exit code that means no error occurred.
Availability: Unix.
.. versionadded:: 2.3
EX_USAGE~
Exit code that means the command was used incorrectly, such as when the wrong
number of arguments are given.
Availability: Unix.
.. versionadded:: 2.3
EX_DATAERR~
Exit code that means the input data was incorrect.
Availability: Unix.
.. versionadded:: 2.3
EX_NOINPUT~
Exit code that means an input file did not exist or was not readable.
Availability: Unix.
.. versionadded:: 2.3
EX_NOUSER~
Exit code that means a specified user did not exist.
Availability: Unix.
.. versionadded:: 2.3
EX_NOHOST~
Exit code that means a specified host did not exist.
Availability: Unix.
.. versionadded:: 2.3
EX_UNAVAILABLE~
Exit code that means that a required service is unavailable.
Availability: Unix.
.. versionadded:: 2.3
EX_SOFTWARE~
Exit code that means an internal software error was detected.
Availability: Unix.
.. versionadded:: 2.3
EX_OSERR~
Exit code that means an operating system error was detected, such as the
inability to fork or create a pipe.
Availability: Unix.
.. versionadded:: 2.3
EX_OSFILE~
Exit code that means some system file did not exist, could not be opened, or had
some other kind of error.
Availability: Unix.
.. versionadded:: 2.3
EX_CANTCREAT~
Exit code that means a user specified output file could not be created.
Availability: Unix.
.. versionadded:: 2.3
EX_IOERR~
Exit code that means that an error occurred while doing I/O on some file.
Availability: Unix.
.. versionadded:: 2.3
EX_TEMPFAIL~
Exit code that means a temporary failure occurred. This indicates something
that may not really be an error, such as a network connection that couldn't be
made during a retryable operation.
Availability: Unix.
.. versionadded:: 2.3
EX_PROTOCOL~
Exit code that means that a protocol exchange was illegal, invalid, or not
understood.
Availability: Unix.
.. versionadded:: 2.3
EX_NOPERM~
Exit code that means that there were insufficient permissions to perform the
operation (but not intended for file system problems).
Availability: Unix.
.. versionadded:: 2.3
EX_CONFIG~
Exit code that means that some kind of configuration error occurred.
Availability: Unix.
.. versionadded:: 2.3
EX_NOTFOUND~
Exit code that means something like "an entry was not found".
Availability: Unix.
.. versionadded:: 2.3
fork()~
Fork a child process. Return ``0`` in the child and the child's process id in the
parent. If an error occurs, OSError is raised.
Note that some platforms including FreeBSD <= 6.3, Cygwin and OS/2 EMX have
known issues when using fork() from a thread.
Availability: Unix.
forkpty()~
Fork a child process, using a new pseudo-terminal as the child's controlling
terminal. Return a pair of ``(pid, fd)``, where {pid} is ``0`` in the child, the
new child's process id in the parent, and {fd} is the file descriptor of the
master end of the pseudo-terminal. For a more portable approach, use the
pty (|py2stdlib-pty|) module. If an error occurs, OSError is raised.
Availability: some flavors of Unix.
kill(pid, sig)~
Send signal {sig} to the process {pid}. Constants for the specific signals
available on the host platform are defined in the signal (|py2stdlib-signal|) module.
Windows: The signal.CTRL_C_EVENT and
signal.CTRL_BREAK_EVENT signals are special signals which can
only be sent to console processes which share a common console window,
e.g., some subprocesses. Any other value for {sig} will cause the process
to be unconditionally killed by the TerminateProcess API, and the exit code
will be set to {sig}. The Windows version of kill additionally takes
process handles to be killed.
.. versionadded:: 2.7 Windows support
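A minimal Unix-only sketch of sending a signal to a child process and checking
how it ended follows. SIGKILL is used here only because it cannot be caught,
which keeps the outcome predictable; that choice is an assumption of the
example, not a recommendation.

```python
import os
import signal
import time

pid = os.fork()
if pid == 0:
    # Child: sleep until the parent's signal arrives, never run parent code.
    try:
        time.sleep(60)
    finally:
        os._exit(0)

os.kill(pid, signal.SIGKILL)           # cannot be caught or ignored
_, status = os.waitpid(pid, 0)
killed_by = os.WTERMSIG(status) if os.WIFSIGNALED(status) else None
```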
killpg(pgid, sig)~
Send the signal {sig} to the process group {pgid}.
Availability: Unix.
.. versionadded:: 2.3
nice(increment)~
Add {increment} to the process's "niceness". Return the new niceness.
Availability: Unix.
plock(op)~
Lock program segments into memory. The value of {op} (defined in
``<sys/lock.h>``) determines which segments are locked.
Availability: Unix.
popen(...)~
popen2(...)
popen3(...)
popen4(...)
Run child processes, returning opened pipes for communications. These functions
are described in section os-newstreams.
spawnl(mode, path, ...)~
spawnle(mode, path, ..., env)
spawnlp(mode, file, ...)
spawnlpe(mode, file, ..., env)
spawnv(mode, path, args)
spawnve(mode, path, args, env)
spawnvp(mode, file, args)
spawnvpe(mode, file, args, env)
Execute the program {path} in a new process.
(Note that the subprocess (|py2stdlib-subprocess|) module provides more powerful facilities for
spawning new processes and retrieving their results; using that module is
preferable to using these functions. Check especially the
subprocess-replacements section.)
If {mode} is P_NOWAIT, this function returns the process id of the new
process; if {mode} is P_WAIT, returns the process's exit code if it
exits normally, or ``-signal``, where {signal} is the signal that killed the
process. On Windows, the process id will actually be the process handle, so can
be used with the waitpid function.
The "l" and "v" variants of the spawn\* functions differ in how
command-line arguments are passed. The "l" variants are perhaps the easiest
to work with if the number of parameters is fixed when the code is written; the
individual parameters simply become additional parameters to the
spawnl\* functions. The "v" variants are good when the number of
parameters is variable, with the arguments being passed in a list or tuple as
the {args} parameter. In either case, the arguments to the child process must
start with the name of the command being run.
The variants which include a second "p" near the end (spawnlp,
spawnlpe, spawnvp, and spawnvpe) will use the
PATH environment variable to locate the program {file}. When the
environment is being replaced (using one of the spawn\*e variants,
discussed in the next paragraph), the new environment is used as the source of
the PATH variable. The other variants, spawnl,
spawnle, spawnv, and spawnve, will not use the
PATH variable to locate the executable; {path} must contain an
appropriate absolute or relative path.
For spawnle, spawnlpe, spawnve, and spawnvpe
(note that these all end in "e"), the {env} parameter must be a mapping
which is used to define the environment variables for the new process (they are
used instead of the current process' environment); the functions
spawnl, spawnlp, spawnv, and spawnvp all cause
the new process to inherit the environment of the current process. Note that
keys and values in the {env} dictionary must be strings; invalid keys or
values will cause the function to fail, with a return value of ``127``.
As an example, the following calls to spawnlp and spawnvpe are
equivalent:: >
import os
os.spawnlp(os.P_WAIT, 'cp', 'cp', 'index.html', '/dev/null')
L = ['cp', 'index.html', '/dev/null']
os.spawnvpe(os.P_WAIT, 'cp', L, os.environ)
<
Availability: Unix, Windows. spawnlp, spawnlpe, spawnvp
and spawnvpe are not available on Windows.
.. versionadded:: 1.6
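The difference between the P_NOWAIT and P_WAIT modes can be sketched as
follows. The example assumes a Unix system with a shell at ``/bin/sh``; the
exit code ``4`` is arbitrary.

```python
import os

# P_NOWAIT returns at once with the new process id; the caller must wait.
pid = os.spawnv(os.P_NOWAIT, '/bin/sh', ['sh', '-c', 'exit 4'])
_, status = os.waitpid(pid, 0)
nowait_code = os.WEXITSTATUS(status)

# P_WAIT blocks until the program finishes and returns its exit code.
wait_code = os.spawnv(os.P_WAIT, '/bin/sh', ['sh', '-c', 'exit 4'])
```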
P_NOWAIT~
P_NOWAITO
Possible values for the {mode} parameter to the spawn\* family of
functions. If either of these values is given, the spawn\* functions
will return as soon as the new process has been created, with the process id as
the return value.
Availability: Unix, Windows.
.. versionadded:: 1.6
P_WAIT~
Possible value for the {mode} parameter to the spawn\* family of
functions. If this is given as {mode}, the spawn\* functions will not
return until the new process has run to completion and will return the exit code
of the process if the run is successful, or ``-signal`` if a signal kills the
process.
Availability: Unix, Windows.
.. versionadded:: 1.6
P_DETACH~
P_OVERLAY
Possible values for the {mode} parameter to the spawn\* family of
functions. These are less portable than those listed above. P_DETACH
is similar to P_NOWAIT, but the new process is detached from the
console of the calling process. If P_OVERLAY is used, the current
process will be replaced; the spawn\* function will not return.
Availability: Windows.
.. versionadded:: 1.6
startfile(path[, operation])~
Start a file with its associated application.
When {operation} is not specified or ``'open'``, this acts like double-clicking
the file in Windows Explorer, or giving the file name as an argument to the
start command from the interactive command shell: the file is opened
with whatever application (if any) is associated with its extension.
When another {operation} is given, it must be a "command verb" that specifies
what should be done with the file. Common verbs documented by Microsoft are
``'print'`` and ``'edit'`` (to be used on files) as well as ``'explore'`` and
``'find'`` (to be used on directories).
startfile returns as soon as the associated application is launched.
There is no option to wait for the application to close, and no way to retrieve
the application's exit status. The {path} parameter is relative to the current
directory. If you want to use an absolute path, make sure the first character
is not a slash (``'/'``); the underlying Win32 ShellExecute function
doesn't work if it is. Use the os.path.normpath function to ensure that
the path is properly encoded for Win32.
Availability: Windows.
.. versionadded:: 2.0
.. versionadded:: 2.5
The {operation} parameter.
system(command)~
Execute the command (a string) in a subshell. This is implemented by calling
the Standard C function system, and has the same limitations.
Changes to sys.stdin, etc. are not reflected in the environment of the
executed command.
On Unix, the return value is the exit status of the process encoded in the
format specified for wait. Note that POSIX does not specify the meaning
of the return value of the C system function, so the return value of
the Python function is system-dependent.
On Windows, the return value is that returned by the system shell after running
{command}, given by the Windows environment variable COMSPEC: on
command.com systems (Windows 95, 98 and ME) this is always ``0``; on
cmd.exe systems (Windows NT, 2000 and XP) this is the exit status of
the command run; on systems using a non-native shell, consult your shell
documentation.
The subprocess (|py2stdlib-subprocess|) module provides more powerful facilities for spawning new
processes and retrieving their results; using that module is preferable to using
this function. Check especially the subprocess-replacements section.
Availability: Unix, Windows.
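On Unix, the encoded return value can be decoded with the status functions of
this module. The following sketch assumes a Unix shell; the helper name
``shell_exit_code`` is hypothetical.

```python
import os

def shell_exit_code(command):
    """Run command with os.system and decode the Unix wait status."""
    status = os.system(command)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None                        # the shell did not exit normally

code = shell_exit_code('exit 3')
```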
times()~
Return a 5-tuple of floating point numbers indicating accumulated (processor
or other) times, in seconds. The items are: user time, system time,
children's user time, children's system time, and elapsed real time since a
fixed point in the past, in that order. See the Unix manual page
times(2) or the corresponding Windows Platform API documentation.
On Windows, only the first two items are filled, the others are zero.
Availability: Unix, Windows.
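A short sketch of reading the result by position, in the order listed above:

```python
import os

t = os.times()
user, system = t[0], t[1]              # first two items: this process
cpu_seconds = user + system
print("CPU time used so far: %.2f seconds" % cpu_seconds)
```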
wait()~
Wait for completion of a child process, and return a tuple containing its pid
and exit status indication: a 16-bit number, whose low byte is the signal number
that killed the process, and whose high byte is the exit status (if the signal
number is zero); the high bit of the low byte is set if a core file was
produced.
Availability: Unix.
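The bit layout described above can be unpacked by hand, as in the sketch
below. It follows that description literally and is meant only to illustrate
the encoding; portable code would normally use the WIFEXITED family of
functions instead. The helper name ``decode_wait_status`` is hypothetical.

```python
def decode_wait_status(status):
    """Split a 16-bit wait() status into (exit_code, signal, core_dumped)."""
    low = status & 0xFF
    signal_number = low & 0x7F         # low byte without its high bit
    core_dumped = bool(low & 0x80)     # high bit of the low byte
    exit_code = (status >> 8) & 0xFF   # high byte
    if signal_number != 0:
        exit_code = None               # only meaningful when no signal
    return exit_code, signal_number, core_dumped

normal = decode_wait_status(3 << 8)    # exited with status 3
killed = decode_wait_status(9)         # killed by signal 9
dumped = decode_wait_status(0x80 | 6)  # signal 6 with a core file
```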
waitpid(pid, options)~
The details of this function differ on Unix and Windows.
On Unix: Wait for completion of a child process given by process id {pid}, and
return a tuple containing its process id and exit status indication (encoded as
for wait). The semantics of the call are affected by the value of the
integer {options}, which should be ``0`` for normal operation.
If {pid} is greater than ``0``, waitpid requests status information for
that specific process. If {pid} is ``0``, the request is for the status of any
child in the process group of the current process. If {pid} is ``-1``, the
request pertains to any child of the current process. If {pid} is less than
``-1``, status is requested for any process in the process group ``-pid`` (the
absolute value of {pid}).
An OSError is raised with the value of errno when the syscall
returns -1.
On Windows: Wait for completion of a process given by process handle {pid}, and
return a tuple containing {pid}, and its exit status shifted left by 8 bits
(shifting makes cross-platform use of the function easier). A {pid} less than or
equal to ``0`` has no special meaning on Windows, and raises an exception. The
value of integer {options} has no effect. {pid} can refer to any process whose
id is known, not necessarily a child process. The spawn functions called
with P_NOWAIT return suitable process handles.
wait3([options])~
Similar to waitpid, except no process id argument is given and a
3-element tuple containing the child's process id, exit status indication, and
resource usage information is returned. Refer to resource.getrusage
(|py2stdlib-resource|) for details on resource usage information. The option
argument is the same as that provided to waitpid and wait4.
Availability: Unix.
.. versionadded:: 2.5
wait4(pid, options)~
Similar to waitpid, except a 3-element tuple, containing the child's
process id, exit status indication, and resource usage information is returned.
Refer to resource.getrusage (|py2stdlib-resource|) for details on resource usage
information. The arguments to wait4 are the same as those provided to
waitpid.
Availability: Unix.
.. versionadded:: 2.5
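The extra tuple element might be used as in the sketch below, which assumes a
Unix platform with fork. The busy loop in the child is only there so that the
usage figures are not trivially zero; the variable names are illustrative.

```python
import os

pid = os.fork()
if pid == 0:
    # Child: do a little CPU work, then exit without interpreter cleanup.
    sum(i * i for i in range(200000))
    os._exit(0)

# Parent: the third element is a resource-usage structure for the child.
reaped_pid, status, usage = os.wait4(pid, 0)
cpu_seconds = usage.ru_utime + usage.ru_stime  # user + system CPU time
```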
WNOHANG~
The option for waitpid to return immediately if no child process status
is available immediately. The function returns ``(0, 0)`` in this case.
Availability: Unix.
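A possible polling loop built on this option is sketched below. The sleep
intervals are arbitrary, and the example assumes a Unix platform with fork.

```python
import os
import time

pid = os.fork()
if pid == 0:
    # Child: pretend to work for a moment, then exit.
    time.sleep(0.2)
    os._exit(0)

while True:
    reaped_pid, status = os.waitpid(pid, os.WNOHANG)
    if reaped_pid == pid:
        break  # the child has terminated and its status was collected
    # (0, 0) was returned: the child is still running, so do other work here.
    time.sleep(0.05)
```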
WCONTINUED~
This option causes child processes to be reported if they have been continued
from a job control stop since their status was last reported.
Availability: Some Unix systems.
.. versionadded:: 2.3
WUNTRACED~
This option causes child processes to be reported if they have been stopped but
their current state has not been reported since they were stopped.
Availability: Unix.
.. versionadded:: 2.3
The following functions take a process status code as returned by
system, wait, or waitpid as a parameter. They may be
used to determine the disposition of a process.
WCOREDUMP(status)~
Return ``True`` if a core dump was generated for the process, otherwise
return ``False``.
Availability: Unix.
.. versionadded:: 2.3
WIFCONTINUED(status)~
Return ``True`` if the process has been continued from a job control stop,
otherwise return ``False``.
Availability: Unix.
.. versionadded:: 2.3
WIFSTOPPED(status)~
Return ``True`` if the process has been stopped, otherwise return
``False``.
Availability: Unix.
WIFSIGNALED(status)~
Return ``True`` if the process exited due to a signal, otherwise return
``False``.
Availability: Unix.
WIFEXITED(status)~
Return ``True`` if the process exited using the exit(2) system call,
otherwise return ``False``.
Availability: Unix.
WEXITSTATUS(status)~
If ``WIFEXITED(status)`` is true, return the integer parameter to the
exit(2) system call. Otherwise, the return value is meaningless.
Availability: Unix.
WSTOPSIG(status)~
Return the signal which caused the process to stop.
Availability: Unix.
WTERMSIG(status)~
Return the signal which caused the process to exit.
Availability: Unix.
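The status functions above are typically combined as in the following sketch.
The helper name ``describe_status`` and the message strings are illustrative
and not part of the os module; the example assumes a Unix platform with fork.

```python
import os

def describe_status(status):
    """Turn a status code from wait, waitpid or system into a short message."""
    if os.WIFEXITED(status):
        return "exited with code %d" % os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        return "terminated by signal %d" % os.WTERMSIG(status)
    if os.WIFSTOPPED(status):
        return "stopped by signal %d" % os.WSTOPSIG(status)
    return "unrecognized status %d" % status

pid = os.fork()
if pid == 0:
    os._exit(7)  # child: exit immediately with a known code
_, status = os.waitpid(pid, 0)
message = describe_status(status)
```

Checking ``WIFEXITED`` before calling ``WEXITSTATUS`` matters because the exit
status is meaningless for a process that was terminated by a signal.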
Miscellaneous System Information
--------------------------------
confstr(name)~
Return string-valued system configuration values. {name} specifies the
configuration value to retrieve; it may be a string which is the name of a
defined system value; these names are specified in a number of standards (POSIX,
Unix 95, Unix 98, and others). Some platforms define additional names as well.
The names known to the host operating system are given as the keys of the
``confstr_names`` dictionary. For configuration variables not included in that
mapping, passing an integer for {name} is also accepted.
If the configuration value specified by {name} isn't defined, ``None`` is
returned.
If {name} is a string and is not known, ValueError is raised. If a
specific value for {name} is not supported by the host system, even if it is
included in ``confstr_names``, an OSError is raised with
errno.EINVAL for the error number.
Availability: Unix.
confstr_names~
Dictionary mapping names accepted by confstr to the integer values
defined for those names by the host operating system. This can be used to
determine the set of names known to the system.
Availability: Unix.
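A defensive wrapper around confstr could look like the sketch below. The name
``safe_confstr`` is hypothetical, ``CS_PATH`` is assumed to be a name the host
knows (as it is on common Linux systems), and the second name is deliberately
invalid.

```python
import os

def safe_confstr(name):
    """Return the configuration string for name, or None if it is unavailable."""
    try:
        return os.confstr(name)
    except (ValueError, OSError):
        # ValueError: the name is not known to this system.
        # OSError: the name is known but the value is not supported.
        return None

known_names = sorted(os.confstr_names)          # names this host recognizes
path_value = safe_confstr("CS_PATH")
unknown = safe_confstr("CS_NO_SUCH_NAME_EXAMPLE")
```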
getloadavg()~
Return the number of processes in the system run queue averaged over the last
1, 5, and 15 minutes, or raise OSError if the load average is
unobtainable.
Availability: Unix.
.. versionadded:: 2.3
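Since the call can fail, callers may want a fallback. A minimal sketch, with
the helper name ``load_averages`` chosen for illustration:

```python
import os

def load_averages():
    """Return the (1, 5, 15)-minute load averages, or None when unobtainable."""
    try:
        return os.getloadavg()
    except (OSError, AttributeError):
        # OSError: the system could not supply the figures.
        # AttributeError: os.getloadavg does not exist on this platform.
        return None

averages = load_averages()
```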
sysconf(name)~
Return integer-valued system configuration values. If the configuration value
specified by {name} isn't defined, ``-1`` is returned. The comments regarding
the {name} parameter for confstr apply here as well; the dictionary that
provides information on the known names is given by ``sysconf_names``.
Availability: Unix.
sysconf_names~
Dictionary mapping names accepted by sysconf to the integer values
defined for those names by the host operating system. This can be used to
determine the set of names known to the system.
Availability: Unix.
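The ``-1`` return value and the possible exceptions can be folded into one
helper, as sketched here. ``sysconf_or_default`` is a hypothetical name,
``SC_PAGE_SIZE`` is assumed to be known to the host, and ``4096`` is an
arbitrary fallback.

```python
import os

def sysconf_or_default(name, default):
    """Return the integer value for name, or default when it is unavailable."""
    if name not in os.sysconf_names:
        return default          # avoids the ValueError for unknown names
    try:
        value = os.sysconf(name)
    except OSError:
        return default          # known name, but unsupported on this host
    return default if value == -1 else value

page_size = sysconf_or_default("SC_PAGE_SIZE", 4096)
```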
The following data values are used to support path manipulation operations. These
are defined for all platforms.
Higher-level operations on pathnames are defined in the os.path (|py2stdlib-os.path|) module.
curdir~
The constant string used by the operating system to refer to the current
directory. This is ``'.'`` for Windows and POSIX. Also available via
os.path (|py2stdlib-os.path|).
pardir~
The constant string used by the operating system to refer to the parent
directory. This is ``'..'`` for Windows and POSIX. Also available via
os.path (|py2stdlib-os.path|).
sep~
The character used by the operating system to separate pathname components.
This is ``'/'`` for POSIX and ``'\\'`` for Windows. Note that knowing this
is not sufficient to be able to parse or concatenate pathnames --- use
os.path.split and os.path.join --- but it is occasionally
useful. Also available via os.path (|py2stdlib-os.path|).
altsep~
An alternative character used by the operating system to separate pathname
components, or ``None`` if only one separator character exists. This is set to
``'/'`` on Windows systems where ``sep`` is a backslash. Also available via
os.path (|py2stdlib-os.path|).
extsep~
The character which separates the base filename from the extension; for example,
the ``'.'`` in os.py. Also available via os.path (|py2stdlib-os.path|).
.. versionadded:: 2.2
pathsep~
The character conventionally used by the operating system to separate search
path components (as in PATH), such as ``':'`` for POSIX or ``';'`` for
Windows. Also available via os.path (|py2stdlib-os.path|).
defpath~
The default search path used by the ``p`` variants of the exec and spawn
functions (execlp, execvp, spawnlp, and so on) if the environment doesn't have
a ``'PATH'`` key. Also available via os.path (|py2stdlib-os.path|).
linesep~
The string used to separate (or, rather, terminate) lines on the current
platform. This may be a single character, such as ``'\n'`` for POSIX, or
multiple characters, for example, ``'\r\n'`` for Windows. Do not use
{os.linesep} as a line terminator when writing files opened in text mode (the
default); use a single ``'\n'`` instead, on all platforms.
devnull~
The file path of the null device. For example: ``'/dev/null'`` for
POSIX, ``'nul'`` for Windows. Also available via os.path (|py2stdlib-os.path|).
.. versionadded:: 2.4
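The constants above are mostly useful for inspection; building and splitting
paths is better left to os.path. The following sketch illustrates that
division of labour, using made-up file names.

```python
import os

# Split a search path on the platform's separator, falling back to defpath.
search_dirs = os.environ.get("PATH", os.defpath).split(os.pathsep)

# Prefer os.path.join over concatenating with os.sep by hand.
joined = os.path.join("spam", "eggs.txt")
by_hand = "spam" + os.sep + "eggs.txt"      # same result in this simple case

# extsep is the character that splitext looks for.
base, ext = os.path.splitext("os.py")
```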
Miscellaneous Functions
-----------------------
urandom(n)~
Return a string of {n} random bytes suitable for cryptographic use.
This function returns random bytes from an OS-specific randomness source. The
returned data should be unpredictable enough for cryptographic applications,
though its exact quality depends on the OS implementation. On a UNIX-like
system this will query /dev/urandom, and on Windows it will use CryptGenRandom.
If a randomness source is not found, NotImplementedError will be raised.
.. versionadded:: 2.4
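One common use is generating an unpredictable token. The sketch below
hex-encodes the bytes so the result is printable; the helper name
``random_token`` and the default of 16 bytes are illustrative choices.

```python
import binascii
import os

def random_token(nbytes=16):
    """Return 2 * nbytes hexadecimal characters drawn from os.urandom."""
    return binascii.hexlify(os.urandom(nbytes)).decode("ascii")

token = random_token()
```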
==============================================================================
*py2stdlib-ossaudiodev*
ossaudiodev~
:platform: Linux, FreeBSD
:synopsis: Access to OSS-compatible audio devices.
.. versionadded:: 2.3
This module allows you to access the OSS (Open Sound System) audio interface.
OSS is available for a wide range of open-source and commercial Unices, and is
the standard audio interface for Linux and recent versions of FreeBSD.
.. Things will get more complicated for future Linux versions, since
ALSA is in the standard kernel as of 2.5.x. Presumably if you
use ALSA, you'll have to make sure its OSS compatibility layer
is active to use ossaudiodev, but you're gonna need it for the vast
majority of Linux audio apps anyways.
Sounds like things are also complicated for other BSDs. In response
to my python-dev query, Thomas Wouters said:
> Likewise, googling shows OpenBSD also uses OSS/Free -- the commercial
> OSS installation manual tells you to remove references to OSS/Free from the
> kernel :)
but Aleksander Piotrowsk actually has an OpenBSD box, and he quotes
from its <soundcard.h>:
> * WARNING! WARNING!
> * This is an OSS (Linux) audio emulator.
> * Use the Native NetBSD API for developing new code, and this
> * only for compiling Linux programs.
There's also an ossaudio manpage on OpenBSD that explains things
further. Presumably NetBSD and OpenBSD have a different standard
audio interface. That's the great thing about standards, there are so
many to choose from ... ;-)
This probably all warrants a footnote or two, but I don't understand
things well enough right now to write it! --GPW
.. seealso::
`Open Sound System Programmer's Guide <http://www.opensound.com/pguide/oss.pdf>`_
the official documentation for the OSS C API
The module defines a large number of constants supplied by the OSS device
driver; see ``<sys/soundcard.h>`` on either Linux or FreeBSD for a listing.
ossaudiodev (|py2stdlib-ossaudiodev|) defines the following variables and functions:
OSSAudioError~
This exception is raised on certain errors. The argument is a string describing
what went wrong.
(If ossaudiodev (|py2stdlib-ossaudiodev|) receives an error from a system call such as
open, write, or ioctl, it raises IOError.
Errors detected directly by ossaudiodev (|py2stdlib-ossaudiodev|) result in OSSAudioError.)
(For backwards compatibility, the exception class is also available as
``ossaudiodev.error``.)
open([device, ]mode)~
Open an audio device and return an OSS audio device object. This object
supports many file-like methods, such as read, write, and
fileno (although there are subtle differences between conventional Unix
read/write semantics and those of OSS audio devices). It also supports a number
of audio-specific methods; see below for the complete list of methods.
{device} is the audio device filename to use. If it is not specified, this
module first looks in the environment variable AUDIODEV for a device
to use. If not found, it falls back to /dev/dsp.
{mode} is one of ``'r'`` for read-only (record) access, ``'w'`` for
write-only (playback) access and ``'rw'`` for both. Since many sound cards
only allow one process to have the recorder or player open at a time, it is a
good idea to open the device only for the activity needed. Further, some
sound cards are half-duplex: they can be opened for reading or writing, but
not both at once.
Note the unusual calling syntax: the {first} argument is optional, and the
second is required. This is a historical artifact for compatibility with the
older linuxaudiodev module which ossaudiodev (|py2stdlib-ossaudiodev|) supersedes.
.. XXX it might also be motivated
by my unfounded-but-still-possibly-true belief that the default
audio device varies unpredictably across operating systems. -GW
openmixer([device])~
Open a mixer device and return an OSS mixer device object. {device} is the
mixer device filename to use. If it is not specified, this module first looks
in the environment variable MIXERDEV for a device to use. If not
found, it falls back to /dev/mixer.
Audio Device Objects
--------------------
Before you can write to or read from an audio device, you must call three
methods in the correct order:
#. setfmt to set the output format
#. channels to set the number of channels
#. speed to set the sample rate
Alternately, you can use the setparameters method to set all three audio
parameters at once. This is more convenient, but may not be as flexible in all
cases.
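The three-step setup might be sketched as follows. ``sine_samples`` and
``play_tone`` are hypothetical helpers, the tone parameters are arbitrary, and
``play_tone`` is only defined here, not called, because it needs a working OSS
device. The import is guarded since the module exists only on some platforms.

```python
import math
import struct

try:
    import ossaudiodev
except ImportError:          # platform without OSS support
    ossaudiodev = None

def sine_samples(freq_hz, seconds, rate):
    """Return signed 16-bit little-endian mono PCM data for a sine tone."""
    count = int(seconds * rate)
    return b"".join(
        struct.pack("<h", int(32767 * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(count))

def play_tone(freq_hz=440, seconds=1.0, rate=44100):
    """Play a tone on the default device (AUDIODEV or /dev/dsp)."""
    dsp = ossaudiodev.open("w")
    try:
        dsp.setfmt(ossaudiodev.AFMT_S16_LE)   # 1. sample format
        dsp.channels(1)                       # 2. number of channels
        rate = dsp.speed(rate)                # 3. rate the driver actually set
        dsp.write(sine_samples(freq_hz, seconds, rate))
    finally:
        dsp.close()
```

Using the rate returned by speed, rather than the one requested, keeps the
pitch correct when the device substitutes a nearby rate. A fuller version
would also check the return values of setfmt and channels.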
The audio device objects returned by .open define the following methods
and (read-only) attributes:
oss_audio_device.close()~
Explicitly close the audio device. When you are done writing to or reading from
an audio device, you should explicitly close it. A closed device cannot be used
again.
oss_audio_device.fileno()~
Return the file descriptor associated with the device.
oss_audio_device.read(size)~
Read {size} bytes from the audio input and return them as a Python string.
Unlike most Unix device drivers, OSS audio devices in blocking mode (the
default) will block read until the entire requested amount of data is
available.
oss_audio_device.write(data)~
Write the Python string {data} to the audio device and return the number of
bytes written. If the audio device is in blocking mode (the default), the
entire string is always written (again, this is different from usual Unix device
semantics). If the device is in non-blocking mode, some data may not be written
---see writeall.
oss_audio_device.writeall(data)~
Write the entire Python string {data} to the audio device: waits until the audio
device is able to accept data, writes as much data as it will accept, and
repeats until {data} has been completely written. If the device is in blocking
mode (the default), this has the same effect as write; writeall
is only useful in non-blocking mode. Has no return value, since the amount of
data written is always equal to the amount of data supplied.
The following methods each map to exactly one ioctl system call. The
correspondence is obvious: for example, setfmt corresponds to the
``SNDCTL_DSP_SETFMT`` ioctl, and sync to ``SNDCTL_DSP_SYNC`` (this can
be useful when consulting the OSS documentation). If the underlying
ioctl fails, they all raise IOError.
oss_audio_device.nonblock()~
Put the device into non-blocking mode. Once in non-blocking mode, there is no
way to return it to blocking mode.
oss_audio_device.getfmts()~
Return a bitmask of the audio output formats supported by the soundcard. Some
of the formats supported by OSS are:
+-------------------------+---------------------------------------------+
| Format | Description |
+=========================+=============================================+
| AFMT_MU_LAW | a logarithmic encoding (used by Sun ``.au`` |
| | files and /dev/audio) |
+-------------------------+---------------------------------------------+
| AFMT_A_LAW | a logarithmic encoding |
+-------------------------+---------------------------------------------+
| AFMT_IMA_ADPCM | a 4:1 compressed format defined by the |
| | Interactive Multimedia Association |
+-------------------------+---------------------------------------------+
| AFMT_U8 | Unsigned, 8-bit audio |
+-------------------------+---------------------------------------------+
| AFMT_S16_LE | Signed, 16-bit audio, little-endian byte |
| | order (as used by Intel processors) |
+-------------------------+---------------------------------------------+
| AFMT_S16_BE | Signed, 16-bit audio, big-endian byte order |
| | (as used by 68k, PowerPC, Sparc) |
+-------------------------+---------------------------------------------+
| AFMT_S8 | Signed, 8 bit audio |
+-------------------------+---------------------------------------------+
| AFMT_U16_LE | Unsigned, 16-bit little-endian audio |
+-------------------------+---------------------------------------------+
| AFMT_U16_BE | Unsigned, 16-bit big-endian audio |
+-------------------------+---------------------------------------------+
Consult the OSS documentation for a full list of audio formats, and note that
most devices support only a subset of these formats. Some older devices only
support AFMT_U8; the most common format used today is
AFMT_S16_LE.
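Each AFMT_* constant is itself a bit value, so the mask returned by getfmts can
be tested with a plain ``&``. The sketch below assumes that, and uses
hypothetical helper names; ``device_formats`` needs an open audio device and
is not called here.

```python
try:
    import ossaudiodev
except ImportError:          # platform without OSS support
    ossaudiodev = None

FORMAT_NAMES = ("AFMT_MU_LAW", "AFMT_A_LAW", "AFMT_IMA_ADPCM", "AFMT_U8",
                "AFMT_S16_LE", "AFMT_S16_BE", "AFMT_S8", "AFMT_U16_LE",
                "AFMT_U16_BE")

def supported_formats(mask, candidates):
    """Return the sorted names in candidates (name -> bit value) set in mask."""
    return sorted(name for name, bit in candidates.items() if mask & bit)

def device_formats(dsp):
    """Names of the formats listed above that an open audio device reports."""
    candidates = dict((name, getattr(ossaudiodev, name))
                      for name in FORMAT_NAMES if hasattr(ossaudiodev, name))
    return supported_formats(dsp.getfmts(), candidates)
```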
oss_audio_device.setfmt(format)~
Try to set the current audio format to {format}---see getfmts for a
list. Returns the audio format that the device was set to, which may not be the
requested format. May also be used to return the current audio format---do this
by passing an "audio format" of AFMT_QUERY.
oss_audio_device.channels(nchannels)~
Set the number of output channels to {nchannels}. A value of 1 indicates
monophonic sound, 2 stereophonic. Some devices may have more than 2 channels,
and some high-end devices may not support mono. Returns the number of channels
the device was set to.
oss_audio_device.speed(samplerate)~
Try to set the audio sampling rate to {samplerate} samples per second. Returns
the rate actually set. Most sound devices don't support arbitrary sampling
rates. Common rates are:
+-------+-------------------------------------------+
| Rate | Description |
+=======+===========================================+
| 8000 | default rate for /dev/audio |
+-------+-------------------------------------------+
| 11025 | speech recording |
+-------+-------------------------------------------+
| 22050 | |
+-------+-------------------------------------------+
| 44100 | CD quality audio (at 16 bits/sample and 2 |
| | channels) |
+-------+-------------------------------------------+
| 96000 | DVD quality audio (at 24 bits/sample) |
+-------+-------------------------------------------+
oss_audio_device.sync()~
Wait until the sound device has played every byte in its buffer. (This happens
implicitly when the device is closed.) The OSS documentation recommends closing
and re-opening the device rather than using sync.
oss_audio_device.reset()~
Immediately stop playing or recording and return the device to a state where it
can accept commands. The OSS documentation recommends closing and re-opening
the device after calling reset.
oss_audio_device.post()~
Tell the driver that there is likely to be a pause in the output, making it
possible for the device to handle the pause more intelligently. You might use
this after playing a spot sound effect, before waiting for user input, or before
doing disk I/O.
The following convenience methods combine several ioctls, or one ioctl and some
simple calculations.
oss_audio_device.setparameters(format, nchannels, samplerate [, strict=False])~
Set the key audio sampling parameters---sample format, number of channels, and
sampling rate---in one method call. {format}, {nchannels}, and {samplerate}
should be as specified in the setfmt, channels, and
speed methods. If {strict} is true, setparameters checks to
see if each parameter was actually set to the requested value, and raises
OSSAudioError if not. Returns a tuple ({format}, {nchannels},
{samplerate}) indicating the parameter values that were actually set by the
device driver (i.e., the same as the return values of setfmt,
channels, and speed).
For example, >
(fmt, channels, rate) = dsp.setparameters(fmt, channels, rate)
<
is equivalent to >
fmt = dsp.setfmt(fmt)
channels = dsp.channels(channels)
rate = dsp.speed(rate)
<
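When {strict} is left false, the caller has to compare the returned tuple with
the request itself. One way to do that is sketched below; ``configure`` is a
hypothetical helper, and {dsp} is assumed to be any object with a
setparameters method such as an open audio device.

```python
def configure(dsp, fmt, nchannels, samplerate):
    """Set all three parameters and report any value the driver changed.

    Returns (actual, changed), where changed maps a parameter name to a
    (requested, actual) pair for each parameter that was not honoured.
    """
    requested = (fmt, nchannels, samplerate)
    actual = tuple(dsp.setparameters(fmt, nchannels, samplerate))
    names = ("format", "channels", "rate")
    changed = dict((name, (want, got))
                   for name, want, got in zip(names, requested, actual)
                   if want != got)
    return actual, changed
```

An empty ``changed`` dictionary means the device accepted the request exactly,
which is the condition that ``strict=True`` enforces by raising OSSAudioError.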
oss_audio_device.bufsize()~
Returns the size of the hardware buffer, in samples.
oss_audio_device.obufcount()~
Returns the number of samples that are in the hardware buffer yet to be played.
oss_audio_device.obuffree()~
Returns the number of samples that could be queued into the hardware buffer to
be played without blocking.
Audio device objects also support several read-only attributes:
oss_audio_device.closed~
Boolean indicating whether the device has been closed.
oss_audio_device.name~
String containing the name of the device file.
oss_audio_device.mode~
The I/O mode for the file, either ``"r"``, ``"rw"``, or ``"w"``.
Mixer Device Objects
--------------------
The mixer object provides two file-like methods:
oss_mixer_device.close()~
This method closes the open mixer device file. Any further attempts to use the
mixer after this file is closed will raise an IOError.
oss_mixer_device.fileno()~
Returns the file handle number of the open mixer device file.
The remaining methods are specific to audio mixing:
oss_mixer_device.controls()~
This method returns a bitmask specifying the available mixer controls ("Control"
being a specific mixable "channel", such as SOUND_MIXER_PCM or
SOUND_MIXER_SYNTH). This bitmask indicates a subset of all available
mixer controls---the SOUND_MIXER_\* constants defined at module level.
To determine if, for example, the current mixer object supports a PCM mixer, use
the following Python code: >
mixer = ossaudiodev.openmixer()
if mixer.controls() & (1 << ossaudiodev.SOUND_MIXER_PCM):
    # PCM is supported
    ... code ...
<
For most purposes, the SOUND_MIXER_VOLUME (master volume) and
SOUND_MIXER_PCM controls should suffice---but code that uses the mixer
should be flexible when it comes to choosing mixer controls. On the Gravis
Ultrasound, for example, SOUND_MIXER_VOLUME does not exist.
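Unlike the AFMT_* format constants, the SOUND_MIXER_* constants are bit
positions, so each one is shifted before testing. A small helper that decodes
such a mask might look like this; the name ``controls_in_mask`` is
hypothetical.

```python
def controls_in_mask(mask, control_numbers):
    """Return the sorted names whose bit (1 << number) is set in mask.

    control_numbers maps a label to a SOUND_MIXER_* value, for example
    {"PCM": ossaudiodev.SOUND_MIXER_PCM}.
    """
    return sorted(name for name, number in control_numbers.items()
                  if mask & (1 << number))
```

The same helper applies to the bitmasks returned by stereocontrols,
reccontrols and get_recsrc.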
oss_mixer_device.stereocontrols()~
Returns a bitmask indicating stereo mixer controls. If a bit is set, the
corresponding control is stereo; if it is unset, the control is either
monophonic or not supported by the mixer (use in combination with
controls to determine which).
See the code example for the controls function for an example of getting
data from a bitmask.
oss_mixer_device.reccontrols()~
Returns a bitmask specifying the mixer controls that may be used to record. See
the code example for controls for an example of reading from a bitmask.
oss_mixer_device.get(control)~
Returns the volume of a given mixer control. The returned volume is a 2-tuple
``(left_volume,right_volume)``. Volumes are specified as numbers from 0
(silent) to 100 (full volume). If the control is monophonic, a 2-tuple is still
returned, but both volumes are the same.
Raises OSSAudioError if an invalid control is specified, or
IOError if an unsupported control is specified.
oss_mixer_device.set(control, (left, right))~
Sets the volume for a given mixer control to ``(left,right)``. ``left`` and
``right`` must be ints and between 0 (silent) and 100 (full volume). On
success, the new volume is returned as a 2-tuple. Note that this may not be
exactly the same as the volume specified, because of the limited resolution of
some soundcards' mixers.
Raises OSSAudioError if an invalid mixer control was specified, or if the
specified volumes were out-of-range.
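Because set rejects out-of-range volumes, relative adjustments should clamp
their result. A minimal sketch, where ``change_volume`` is a hypothetical
helper and {mixer} is assumed to be an open mixer device object:

```python
def change_volume(mixer, control, delta):
    """Raise or lower both channels of control by delta, clamped to 0..100."""
    left, right = mixer.get(control)

    def clamp(value):
        return max(0, min(100, value + delta))

    # set returns the volume actually applied, which may differ slightly.
    return mixer.set(control, (clamp(left), clamp(right)))
```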
oss_mixer_device.get_recsrc()~
This method returns a bitmask indicating which control(s) are currently being
used as a recording source.
oss_mixer_device.set_recsrc(bitmask)~
Call this function to specify a recording source. Returns a bitmask indicating
the new recording source (or sources) if successful; raises IOError if an
invalid source was specified. To set the current recording source to the
microphone input: >
mixer.set_recsrc(1 << ossaudiodev.SOUND_MIXER_MIC)
<
==============================================================================
*py2stdlib-parser*
parser~
:synopsis: Access parse trees for Python source code.
.. Copyright 1995 Virginia Polytechnic Institute and State University and Fred
L. Drake, Jr. This copyright notice must be distributed on all copies, but
this document otherwise may be distributed as part of the Python
distribution. No fee may be charged for this document in any representation,
either on paper or electronically. This restriction does not affect other
elements in a distributed package in any way.
.. index:: single: parsing; Python source code
The parser (|py2stdlib-parser|) module provides an interface to Python's internal parser and
byte-code compiler. The primary purpose for this interface is to allow Python
code to edit the parse tree of a Python expression and create executable code
from this. This is better than trying to parse and modify an arbitrary Python
code fragment as a string because parsing is performed in a manner identical to
the code forming the application. It is also faster.
.. note::
From Python 2.5 onward, it's much more convenient to cut in at the Abstract
Syntax Tree (AST) generation and compilation stage, using the ast (|py2stdlib-ast|)
module.
The parser (|py2stdlib-parser|) module exports the names documented here also with "st"
replaced by "ast"; this is a legacy from the time when there was no other
AST and has nothing to do with the AST found in Python 2.5. This is also the
reason for the functions' keyword arguments being called {ast}, not {st}.
The "ast" functions will be removed in Python 3.0.
There are a few things to note about this module which are important to making
use of the data structures created. This is not a tutorial on editing the parse
trees for Python code, but some examples of using the parser (|py2stdlib-parser|) module are
presented.
Most importantly, a good understanding of the Python grammar processed by the
internal parser is required. For full information on the language syntax, refer
to the Python Language Reference. The parser
itself is created from a grammar specification defined in the file
Grammar/Grammar in the standard Python distribution. The parse trees
stored in the ST objects created by this module are the actual output from the
internal parser when created by the expr or suite functions,
described below. The ST objects created by sequence2st faithfully
simulate those structures. Be aware that the values of the sequences which are
considered "correct" will vary from one version of Python to another as the
formal grammar for the language is revised. However, transporting code from one
Python version to another as source text will always allow correct parse trees
to be created in the target version, with the only restriction being that
migrating to an older version of the interpreter will not support more recent
language constructs. The parse trees are not typically compatible from one
version to another, whereas source code has always been forward-compatible.
Each element of the sequences returned by st2list or st2tuple
has a simple form. Sequences representing non-terminal elements in the grammar
always have a length greater than one. The first element is an integer which
identifies a production in the grammar. These integers are given symbolic names
in the C header file Include/graminit.h and the Python module
symbol (|py2stdlib-symbol|). Each additional element of the sequence represents a component
of the production as recognized in the input string: these are always sequences
which have the same form as the parent. An important aspect of this structure
which should be noted is that keywords used to identify the parent node type,
such as the keyword if in an if_stmt, are included in the
node tree without any special treatment. For example, the if keyword
is represented by the tuple ``(1, 'if')``, where ``1`` is the numeric value
associated with all NAME tokens, including variable and function names
defined by the user. In an alternate form returned when line number information
is requested, the same token might be represented as ``(1, 'if', 12)``, where
the ``12`` represents the line number at which the terminal symbol was found.
Terminal elements are represented in much the same way, but without any child
elements and the addition of the source text which was identified. The example
of the if keyword above is representative. The various types of
terminal symbols are defined in the C header file Include/token.h and
the Python module token (|py2stdlib-token|).
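The structure described above can be walked with a short recursive function.
The sketch below is a hypothetical helper that collects the terminal elements
of a tree in the form returned by st2tuple. It relies only on the rule that a
terminal carries its source text as the second element, while the additional
elements of a non-terminal are always sequences.

```python
def terminals(tree):
    """Yield (token_number, text) for each terminal of an st2tuple-style tree."""
    if isinstance(tree[1], str):
        # Terminal: (number, text) or (number, text, line_number).
        yield tree[0], tree[1]
    else:
        # Non-terminal: (production_number, child, child, ...).
        for child in tree[1:]:
            for leaf in terminals(child):
                yield leaf
```

Joining the text of the yielded pairs recovers the tokens of the original
source in order, without whitespace or comments.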
The ST objects are not required to support the functionality of this module,
but are provided for three purposes: to allow an application to amortize the
cost of processing complex parse trees, to provide a parse tree representation
which conserves memory space when compared to the Python list or tuple
representation, and to ease the creation of additional modules in C which
manipulate parse trees. A simple "wrapper" class may be created in Python to
hide the use of ST objects.
The parser (|py2stdlib-parser|) module defines functions for a few distinct purposes. The
most important purposes are to create ST objects and to convert ST objects to
other representations such as parse trees and compiled code objects, but there
are also functions which serve to query the type of parse tree represented by an
ST object.
.. seealso::
Module symbol (|py2stdlib-symbol|)
Useful constants representing internal nodes of the parse tree.
Module token (|py2stdlib-token|)
Useful constants representing leaf nodes of the parse tree and functions for
testing node values.
Creating ST Objects
-------------------
ST objects may be created from source code or from a parse tree. When creating
an ST object from source, different functions are used to create the ``'eval'``
and ``'exec'`` forms.
expr(source)~
The expr function parses the parameter {source} as if it were an input
to ``compile(source, 'file.py', 'eval')``. If the parse succeeds, an ST object
is created to hold the internal parse tree representation, otherwise an
appropriate exception is thrown.
suite(source)~
The suite function parses the parameter {source} as if it were an input
to ``compile(source, 'file.py', 'exec')``. If the parse succeeds, an ST object
is created to hold the internal parse tree representation, otherwise an
appropriate exception is thrown.
sequence2st(sequence)~
This function accepts a parse tree represented as a sequence and builds an
internal representation if possible. If it can validate that the tree conforms
to the Python grammar and all nodes are valid node types in the host version of
Python, an ST object is created from the internal representation and returned
to the caller. If there is a problem creating the internal representation, or
if the tree cannot be validated, a ParserError exception is thrown. An
ST object created this way should not be assumed to compile correctly; normal
exceptions thrown by compilation may still be initiated when the ST object is
passed to compilest. This may indicate problems not related to syntax
(such as a MemoryError exception), but may also be due to constructs such
as the result of parsing ``del f(0)``, which escapes the Python parser but is
checked by the bytecode compiler.
Sequences representing terminal tokens may be represented as either two-element
lists of the form ``(1, 'name')`` or as three-element lists of the form ``(1,
'name', 56)``. If the third element is present, it is assumed to be a valid
line number. The line number may be specified for any subset of the terminal
symbols in the input tree.
tuple2st(sequence)~
This is the same function as sequence2st. This entry point is
maintained for backward compatibility.
Converting ST Objects
---------------------
ST objects, regardless of the input used to create them, may be converted to
parse trees represented as list- or tuple- trees, or may be compiled into
executable code objects. Parse trees may be extracted with or without line
numbering information.
st2list(ast[, line_info])~
This function accepts an ST object from the caller in {ast} and returns a
Python list representing the equivalent parse tree. The resulting list
representation can be used for inspection or the creation of a new parse tree in
list form. This function does not fail so long as memory is available to build
the list representation. If the parse tree will only be used for inspection,
st2tuple should be used instead to reduce memory consumption and
fragmentation. When the list representation is required, this function is
significantly faster than retrieving a tuple representation and converting that
to nested lists.
If {line_info} is true, line number information will be included for all
terminal tokens as a third element of the list representing the token. Note
that the line number provided specifies the line on which the token {ends}.
This information is omitted if the flag is false or omitted.
st2tuple(ast[, line_info])~
This function accepts an ST object from the caller in {ast} and returns a
Python tuple representing the equivalent parse tree. Other than returning a
tuple instead of a list, this function is identical to st2list.
If {line_info} is true, line number information will be included for all
terminal tokens as a third element of the tuple representing the token. This
information is omitted if the flag is false or omitted.
compilest(ast[, filename='<syntax-tree>'])~
.. index:: builtin: eval
The Python byte compiler can be invoked on an ST object to produce code objects
which can be used as part of an exec statement or a call to the
built-in eval function. This function provides the interface to the
compiler, passing the internal parse tree from {ast} to the parser, using the
source file name specified by the {filename} parameter. The default value
supplied for {filename} indicates that the source was an ST object.
Compiling an ST object may result in exceptions related to compilation; an
example would be a SyntaxError caused by the parse tree for ``del f(0)``:
this statement is considered legal within the formal grammar for Python but is
not a legal language construct. The SyntaxError raised for this
condition is actually generated by the Python byte-compiler normally, which is
why it can be raised at this point by the parser (|py2stdlib-parser|) module. Most causes of
compilation failure can be diagnosed programmatically by inspection of the parse
tree.
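The same class of failure can be observed without the parser module. The
following is a minimal sketch using only the built-in compile function; the
helper name `compiles` and the file name '<demo>' are illustrative and not part
of any library.

```python
def compiles(source, mode='exec'):
    """Return (True, None) if `source` compiles, else (False, message)."""
    try:
        compile(source, '<demo>', mode)
    except SyntaxError as exc:
        return False, str(exc)
    return True, None


# 'del f(0)' is rejected before any code runs; 'del x' is accepted.
ok, message = compiles('del f(0)')
also_ok, _ = compiles('del x')
```

The exact wording of the message differs between Python versions, so code
that reacts to the failure should test the exception type rather than its text.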
Queries on ST Objects
---------------------
Two functions are provided which allow an application to determine if an ST was
created as an expression or a suite. Neither of these functions can be used to
determine if an ST was created from source code via expr or
suite or from a parse tree via sequence2st.
isexpr(ast)~
When {ast} represents an ``'eval'`` form, this function returns true, otherwise
it returns false. This is useful, since code objects normally cannot be queried
for this information using existing built-in functions. Note that the code
objects created by compilest cannot be queried like this either, and
are identical to those created by the built-in compile function.
issuite(ast)~
This function mirrors isexpr in that it reports whether an ST object
represents an ``'exec'`` form, commonly known as a "suite." It is not safe to
assume that this function is equivalent to ``not isexpr(ast)``, as additional
syntactic fragments may be supported in the future.
Exceptions and Error Handling
-----------------------------
The parser module defines a single exception, but may also pass other built-in
exceptions from other portions of the Python runtime environment. See each
function for information about the exceptions it can raise.
ParserError~
Exception raised when a failure occurs within the parser module. This is
generally produced for validation failures rather than the built-in
SyntaxError thrown during normal parsing. The exception argument is
either a string describing the reason for the failure or a tuple containing a
sequence causing the failure from a parse tree passed to sequence2st
and an explanatory string. Calls to sequence2st need to be able to
handle either type of exception, while calls to other functions in the module
will only need to be aware of the simple string values.
Note that the functions compilest, expr, and suite may
throw exceptions which are normally thrown by the parsing and compilation
process. These include the built-in exceptions MemoryError,
OverflowError, SyntaxError, and SystemError. In these
cases, these exceptions carry all the meaning normally associated with them.
Refer to the descriptions of each function for detailed information.
ST Objects
----------
Ordered and equality comparisons are supported between ST objects. Pickling of
ST objects (using the pickle (|py2stdlib-pickle|) module) is also supported.
STType~
The type of the objects returned by expr, suite and
sequence2st.
ST objects have the following methods:
ST.compile([filename])~
Same as ``compilest(st, filename)``.
ST.isexpr()~
Same as ``isexpr(st)``.
ST.issuite()~
Same as ``issuite(st)``.
ST.tolist([line_info])~
Same as ``st2list(st, line_info)``.
ST.totuple([line_info])~
Same as ``st2tuple(st, line_info)``.
Examples
--------
The parser module allows operations to be performed on the parse tree of Python
source code before the bytecode is generated, and provides for inspection of the
parse tree for information gathering purposes. Two examples are presented. The
simple example demonstrates emulation of the compile built-in function
and the complex example shows the use of a parse tree for information discovery.
Emulation of compile
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
While many useful operations may take place between parsing and bytecode
generation, the simplest operation is to do nothing. For this purpose, using
the parser (|py2stdlib-parser|) module to produce an intermediate data structure is equivalent
to the code :: >
>>> code = compile('a + 5', 'file.py', 'eval')
>>> a = 5
>>> eval(code)
10
<
The equivalent operation using the parser (|py2stdlib-parser|) module is somewhat longer, and
allows the intermediate internal parse tree to be retained as an ST object:: >
>>> import parser
>>> st = parser.expr('a + 5')
>>> code = st.compile('file.py')
>>> a = 5
>>> eval(code)
10
<
An application which needs both ST and code objects can package this code into
readily available functions:: >
import parser
def load_suite(source_string):
    st = parser.suite(source_string)
    return st, st.compile()
def load_expression(source_string):
    st = parser.expr(source_string)
    return st, st.compile()
<
Information Discovery
Some applications benefit from direct access to the parse tree. The remainder
of this section demonstrates how the parse tree provides access to module
documentation defined in docstrings without requiring that the code being
examined be loaded into a running interpreter via import. This can
be very useful for performing analyses of untrusted code.
Generally, the example will demonstrate how the parse tree may be traversed to
distill interesting information. Two functions and a set of classes are
developed which provide programmatic access to high level function and class
definitions provided by a module. The classes extract information from the
parse tree and provide access to the information at a useful semantic level, one
function provides a simple low-level pattern matching capability, and the other
function defines a high-level interface to the classes by handling file
operations on behalf of the caller. All source files mentioned here which are
not part of the Python installation are located in the Demo/parser/
directory of the distribution.
The dynamic nature of Python allows the programmer a great deal of flexibility,
but most modules need only a limited measure of this when defining classes,
functions, and methods. In this example, the only definitions that will be
considered are those which are defined at the top level of their context, e.g.,
a function defined by a def statement at column zero of a module, but
not a function defined within a branch of an if ... else
construct, though there are some good reasons for doing so in some situations.
Nesting of definitions will be handled by the code developed in the example.
To construct the upper-level extraction methods, we need to know what the parse
tree structure looks like and how much of it we actually need to be concerned
about. Python uses a moderately deep parse tree so there are a large number of
intermediate nodes. It is important to read and understand the formal grammar
used by Python. This is specified in the file Grammar/Grammar in the
distribution. Consider the simplest case of interest when searching for
docstrings: a module consisting of a docstring and nothing else. (See file
docstring.py.) :: >
"""Some documentation.
"""
<
Using the interpreter to take a look at the parse tree, we find a bewildering
mass of numbers and parentheses, with the documentation buried deep in nested
tuples. :: >
>>> import parser
>>> import pprint
>>> st = parser.suite(open('docstring.py').read())
>>> tup = st.totuple()
>>> pprint.pprint(tup)
(257,
(264,
(265,
(266,
(267,
(307,
(287,
(288,
(289,
(290,
(292,
(293,
(294,
(295,
(296,
(297,
(298,
(299,
(300, (3, '"""Some documentation.\n"""'))))))))))))))))),
(4, ''))),
(4, ''),
(0, ''))
<
The first element of each node in the tree is a number giving the node type;
these numbers map directly to terminal and non-terminal symbols in the grammar.
Unfortunately, they are represented as integers in the internal representation,
and the Python structures generated do not change that. However, the
symbol (|py2stdlib-symbol|) and token (|py2stdlib-token|) modules provide symbolic names for the node types
and dictionaries which map from the integers to the symbolic names for the node
types.
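As a rough illustration of those name dictionaries, the sketch below rewrites a
tuple-form parse tree so that known node types appear as names. The function
name `annotate` and the table `NODE_NAMES` are hypothetical. The symbol module
is imported only if it is available, because it ships with Python 2.7 but is
absent from recent Python 3 releases; without it only token names are
translated.

```python
import token

# Start with the terminal names, then add non-terminal names when possible.
NODE_NAMES = dict(token.tok_name)
try:
    import symbol  # ships with Python 2.7; absent from recent Python 3
    NODE_NAMES.update(symbol.sym_name)
except ImportError:
    pass


def annotate(tree):
    """Return a copy of a tuple-form parse tree with known node types named."""
    head = NODE_NAMES.get(tree[0], tree[0])
    children = tuple(
        annotate(child) if isinstance(child, tuple) else child
        for child in tree[1:]
    )
    return (head,) + children


leaf = annotate((token.STRING, '"""Some documentation."""'))
```

Applied to the output of st.totuple(), this turns the wall of integers shown
above into something that can be compared against Grammar/Grammar by eye.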
In the output presented above, the outermost tuple contains four elements: the
integer ``257`` and three additional tuples. Node type ``257`` has the symbolic
name file_input. Each of these inner tuples contains an integer as the
first element; these integers, ``264``, ``4``, and ``0``, represent the node
types stmt, NEWLINE, and ENDMARKER, respectively.
Note that these values may change depending on the version of Python you are
using; consult symbol.py and token.py for details of the
mapping. It should be fairly clear that the outermost node is related primarily
to the input source rather than the contents of the file, and may be disregarded
for the moment. The stmt node is much more interesting. In
particular, all docstrings are found in subtrees which are formed exactly as
this node is formed, with the only difference being the string itself. The
association between the docstring in a similar tree and the defined entity
(class, function, or module) which it describes is given by the position of the
docstring subtree within the tree defining the described structure.
By replacing the actual docstring with something to signify a variable component
of the tree, we allow a simple pattern matching approach to check any given
subtree for equivalence to the general pattern for docstrings. Since the
example demonstrates information extraction, we can safely require that the tree
be in tuple form rather than list form, allowing a simple variable
representation to be ``['variable_name']``. A simple recursive function can
implement the pattern matching, returning a Boolean and a dictionary of variable
name to value mappings. (See file example.py.) :: >
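from types import ListType, TupleType
def match(pattern, data, vars=None):
    if vars is None:
        vars = {}
    # A one-element list in the pattern is a variable: capture the data.
    if type(pattern) is ListType:
        vars[pattern[0]] = data
        return 1, vars
    # Anything that is not a tuple is a leaf and must compare equal.
    if type(pattern) is not TupleType:
        return (pattern == data), vars
    if len(data) != len(pattern):
        return 0, vars
    # Recurse pairwise into the children, stopping at the first mismatch.
    for pattern, data in map(None, pattern, data):
        same, vars = match(pattern, data, vars)
        if not same:
            break
    return same, vars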
<
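The listing above relies on Python 2 idioms (the types module constants and
``map(None, ...)``). The following is a sketch of the same matching idea that
should run unchanged on Python 2.7 and Python 3. It is not the code from
example.py: the parameter name `found` and the extra guard against non-tuple
data are additions made here for illustration.

```python
def match(pattern, data, found=None):
    """Match nested-tuple `data` against `pattern`.

    A one-element list such as ['name'] in the pattern is a variable that
    captures whatever sits at the same position in the data.
    """
    if found is None:
        found = {}
    if isinstance(pattern, list):
        found[pattern[0]] = data
        return True, found
    if not isinstance(pattern, tuple):
        return pattern == data, found
    if not isinstance(data, tuple) or len(data) != len(pattern):
        return False, found
    for sub_pattern, sub_data in zip(pattern, data):
        same, found = match(sub_pattern, sub_data, found)
        if not same:
            return False, found
    return True, found


ok, captured = match((1, (2, ['x'])), (1, (2, 'hello')))
```

Because zip is only reached when both tuples have the same length, it pairs
the children exactly as ``map(None, ...)`` does in the original.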
Using this simple representation for syntactic variables and the symbolic node
types, the pattern for the candidate docstring subtrees becomes fairly readable.
(See file example.py.) :: >
import symbol
import token
DOCSTRING_STMT_PATTERN = (
symbol.stmt,
(symbol.simple_stmt,
(symbol.small_stmt,
(symbol.expr_stmt,
(symbol.testlist,
(symbol.test,
(symbol.and_test,
(symbol.not_test,
(symbol.comparison,
(symbol.expr,
(symbol.xor_expr,
(symbol.and_expr,
(symbol.shift_expr,
(symbol.arith_expr,
(symbol.term,
(symbol.factor,
(symbol.power,
(symbol.atom,
(token.STRING, ['docstring'])
)))))))))))))))),
(token.NEWLINE, '')
))
<
Using the match function with this pattern, extracting the module
docstring from the parse tree created previously is easy:: >
>>> found, vars = match(DOCSTRING_STMT_PATTERN, tup[1])
>>> found
1
>>> vars
{'docstring': '"""Some documentation.\n"""'}
<
Once specific data can be extracted from a location where it is expected, the
question of where information can be expected needs to be answered. When
dealing with docstrings, the answer is fairly simple: the docstring is the first
stmt node in a code block (file_input or suite node
types). A module consists of a single file_input node, and class and
function definitions each contain exactly one suite node. Classes and
functions are readily identified as subtrees of code block nodes which start
with ``(stmt, (compound_stmt, (classdef, ...`` or ``(stmt, (compound_stmt,
(funcdef, ...``. Note that these subtrees cannot be matched by match
since it does not support multiple sibling nodes to match without regard to
number. A more elaborate matching function could be used to overcome this
limitation, but this is sufficient for the example.
Given the ability to determine whether a statement might be a docstring and
extract the actual string from the statement, some work needs to be performed to
walk the parse tree for an entire module and extract information about the names
defined in each context of the module and associate any docstrings with the
names. The code to perform this work is not complicated, but bears some
explanation.
The public interface to the classes is straightforward and should probably be
somewhat more flexible. Each "major" block of the module is described by an
object providing several methods for inquiry and a constructor which accepts at
least the subtree of the complete parse tree which it represents. The
ModuleInfo constructor accepts an optional {name} parameter since it
cannot otherwise determine the name of the module.
The public classes include ClassInfo, FunctionInfo, and
ModuleInfo. All objects provide the methods get_name,
get_docstring, get_class_names, and get_class_info. The
ClassInfo objects support get_method_names and
get_method_info while the other classes provide
get_function_names and get_function_info.
Within each of the forms of code block that the public classes represent, most
of the required information is in the same form and is accessed in the same way,
with classes having the distinction that functions defined at the top level are
referred to as "methods." Since the difference in nomenclature reflects a real
semantic distinction from functions defined outside of a class, the
implementation needs to maintain the distinction. Hence, most of the
functionality of the public classes can be implemented in a common base class,
SuiteInfoBase, with the accessors for function and method information
provided elsewhere. Note that there is only one class which represents function
and method information; this parallels the use of the def statement
to define both types of elements.
Most of the accessor functions are declared in SuiteInfoBase and do not
need to be overridden by subclasses. More importantly, the extraction of most
information from a parse tree is handled through a method called by the
SuiteInfoBase constructor. The example code for most of the classes is
clear when read alongside the formal grammar, but the method which recursively
creates new information objects requires further examination. Here is the
relevant part of the SuiteInfoBase definition from example.py:: >
class SuiteInfoBase:
    _docstring = ''
    _name = ''
    def __init__(self, tree = None):
        self._class_info = {}
        self._function_info = {}
        if tree:
            self._extract_info(tree)
    def _extract_info(self, tree):
        # extract docstring
        if len(tree) == 2:
            found, vars = match(DOCSTRING_STMT_PATTERN[1], tree[1])
        else:
            found, vars = match(DOCSTRING_STMT_PATTERN, tree[3])
        if found:
            self._docstring = eval(vars['docstring'])
        # discover inner definitions
        for node in tree[1:]:
            found, vars = match(COMPOUND_STMT_PATTERN, node)
            if found:
                cstmt = vars['compound']
                if cstmt[0] == symbol.funcdef:
                    name = cstmt[2][1]
                    self._function_info[name] = FunctionInfo(cstmt)
                elif cstmt[0] == symbol.classdef:
                    name = cstmt[2][1]
                    self._class_info[name] = ClassInfo(cstmt)
<
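The excerpt uses COMPOUND_STMT_PATTERN without showing its definition. A
definition consistent with the ``vars['compound']`` lookup and with the
``(stmt, (compound_stmt, ...`` shape described earlier might look like the
sketch below. This is an assumption, not a quotation from example.py, and the
string labels in the fallback branch are stand-ins used only so the sketch
also loads where the symbol module is unavailable.

```python
try:
    import symbol  # available in Python 2.7
    STMT, COMPOUND_STMT = symbol.stmt, symbol.compound_stmt
except ImportError:
    # Stand-in labels for illustration only; real trees use integers.
    STMT, COMPOUND_STMT = 'stmt', 'compound_stmt'

# ['compound'] is a pattern variable: it captures the classdef or funcdef
# subtree found inside the compound_stmt node.
COMPOUND_STMT_PATTERN = (
    STMT,
    (COMPOUND_STMT, ['compound']),
)
```

With a pattern of this shape, match succeeds for any statement whose only
child is a compound_stmt with a single child, which is the case for class and
function definitions.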
After initializing some internal state, the constructor calls the
_extract_info method. This method performs the bulk of the information
extraction which takes place in the entire example. The extraction has two
distinct phases: the location of the docstring for the parse tree passed in, and
the discovery of additional definitions within the code block represented by the
parse tree.
The initial if test determines whether the nested suite is of the
"short form" or the "long form." The short form is used when the code block is
on the same line as the definition of the code block, as in :: >
def square(x): "Square an argument."; return x ** 2
<
while the long form uses an indented block and allows nested definitions:: >
def make_power(exp):
    "Make a function that raises an argument to the exponent `exp`."
    def raiser(x, y=exp):
        return x ** y
    return raiser
<
When the short form is used, the code block may contain a docstring as the
first, and possibly only, small_stmt element. The extraction of such a
docstring is slightly different and requires only a portion of the complete
pattern used in the more common case. As implemented, the docstring will only
be found if there is only one small_stmt node in the
simple_stmt node. Since most functions and methods which use the short
form do not provide a docstring, this may be considered sufficient. The
extraction of the docstring proceeds using the match function as
described above, and the value of the docstring is stored as an attribute of the
SuiteInfoBase object.
After docstring extraction, a simple definition discovery algorithm operates on
the stmt nodes of the suite node. The special case of the
short form is not tested; since there are no stmt nodes in the short
form, the algorithm will silently skip the single simple_stmt node and
correctly not discover any nested definitions.
Each statement in the code block is categorized as a class definition, function
or method definition, or something else. For the definition statements, the
name of the element defined is extracted and a representation object appropriate
to the definition is created with the defining subtree passed as an argument to
the constructor. The representation objects are stored in instance variables
and may be retrieved by name using the appropriate accessor methods.
The public classes provide any accessors required which are more specific than
those provided by the SuiteInfoBase class, but the real extraction
algorithm remains common to all forms of code blocks. A high-level function can
be used to extract the complete set of information from a source file. (See
file example.py.) :: >
def get_docs(fileName):
    import os
    import parser
    source = open(fileName).read()
    basename = os.path.basename(os.path.splitext(fileName)[0])
    st = parser.suite(source)
    return ModuleInfo(st.totuple(), basename)
<
This provides an easy-to-use interface to the documentation of a module. If
information is required which is not extracted by the code of this example, the
code may be extended at clearly defined points to provide additional
capabilities.
==============================================================================
*py2stdlib-pdb*
pdb~
:synopsis: The Python debugger for interactive interpreters.
The module pdb (|py2stdlib-pdb|) defines an interactive source code debugger for Python
programs. It supports setting (conditional) breakpoints and single stepping at
the source line level, inspection of stack frames, source code listing, and
evaluation of arbitrary Python code in the context of any stack frame. It also
supports post-mortem debugging and can be called under program control.
The debugger is extensible --- it is actually defined as the class Pdb.
This is currently undocumented but easily understood by reading the source. The
extension interface uses the modules bdb (|py2stdlib-bdb|) and cmd (|py2stdlib-cmd|).
The debugger's prompt is ``(Pdb)``. Typical usage to run a program under control
of the debugger is:: >
>>> import pdb
>>> import mymodule
>>> pdb.run('mymodule.test()')
> <string>(0)?()
(Pdb) continue
> <string>(1)?()
(Pdb) continue
NameError: 'spam'
> <string>(1)?()
(Pdb)
<
pdb.py can also be invoked as a script to debug other scripts. For
example:: >
python -m pdb myscript.py
<
When invoked as a script, pdb will automatically enter post-mortem debugging if
the program being debugged exits abnormally. After post-mortem debugging (or
after normal exit of the program), pdb will restart the program. Automatic
restarting preserves pdb's state (such as breakpoints) and in most cases is more
useful than quitting the debugger upon the program's exit.
.. versionadded:: 2.4
Restarting post-mortem behavior added.
The typical usage to break into the debugger from a running program is to
insert :: >
import pdb; pdb.set_trace()
<
at the location you want to break into the debugger. You can then step through
the code following this statement, and continue running without the debugger using
the ``c`` command.
The typical usage to inspect a crashed program is:: >
>>> import pdb
>>> import mymodule
>>> mymodule.test()
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "./mymodule.py", line 4, in test
test2()
File "./mymodule.py", line 3, in test2
print spam
NameError: spam
>>> pdb.pm()
> ./mymodule.py(3)test2()
-> print spam
(Pdb)
<
The module defines the following functions; each enters the debugger in a
slightly different way:
run(statement[, globals[, locals]])~
Execute the {statement} (given as a string) under debugger control. The
debugger prompt appears before any code is executed; you can set breakpoints and
type ``continue``, or you can step through the statement using ``step`` or
``next`` (all these commands are explained below). The optional {globals} and
{locals} arguments specify the environment in which the code is executed; by
default the dictionary of the module __main__ (|py2stdlib-__main__|) is used. (See the
explanation of the exec statement or the eval built-in
function.)
runeval(expression[, globals[, locals]])~
Evaluate the {expression} (given as a string) under debugger control. When
runeval returns, it returns the value of the expression. Otherwise this
function is similar to run.
runcall(function[, argument, ...])~
Call the {function} (a function or method object, not a string) with the given
arguments. When runcall returns, it returns whatever the function call
returned. The debugger prompt appears as soon as the function is entered.
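A non-interactive way to see this behaviour is to script the session. The
sketch below assumes that a Pdb instance given both {stdin} and {stdout}
reads its commands from the supplied stream; the function `add` and the
variable names are illustrative.

```python
import pdb

try:
    from StringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO         # Python 3


def add(a, b):
    return a + b


# Scripted input: a single "continue" typed at the first prompt.
commands = StringIO("continue\n")
transcript = StringIO()

# Passing stdout as well makes Pdb read from the given stdin object.
debugger = pdb.Pdb(stdin=commands, stdout=transcript)
result = debugger.runcall(add, 2, 3)
```

After the call, `result` holds the return value of `add` and `transcript`
holds everything the debugger printed, including the prompt shown when the
function was entered.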
set_trace()~
Enter the debugger at the calling stack frame. This is useful to hard-code a
breakpoint at a given point in a program, even if the code is not otherwise
being debugged (e.g. when an assertion fails).
post_mortem([traceback])~
Enter post-mortem debugging of the given {traceback} object. If no
{traceback} is given, it uses the traceback of the exception that is currently
being handled (an exception must be being handled if the default is to be
used).
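The sketch below shows where such a traceback object comes from and what
post-mortem inspection looks like. So that it can run unattended, it scripts
the session through a Pdb instance and calls its reset and interaction
methods, on the assumption that this is how post_mortem itself is
implemented. In everyday use, ``pdb.post_mortem(traceback_obj)`` at the same
point would open an interactive prompt instead. The function `divide` is
illustrative.

```python
import pdb
import sys

try:
    from StringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO         # Python 3


def divide(numerator, denominator):
    return numerator / denominator


transcript = StringIO()
try:
    divide(1, 0)
except ZeroDivisionError:
    traceback_obj = sys.exc_info()[2]
    # Scripted session: print a local of the failing frame, then quit.
    debugger = pdb.Pdb(stdin=StringIO("p denominator\nq\n"), stdout=transcript)
    debugger.reset()
    debugger.interaction(None, traceback_obj)

output = transcript.getvalue()
```

The debugger starts in the frame where the exception was raised, which is why
the local variable `denominator` is visible to the ``p`` command.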
pm()~
Enter post-mortem debugging of the traceback found in
sys.last_traceback.
The ``run*`` functions and set_trace are aliases for instantiating the
Pdb class and calling the method of the same name. If you want to
access further features, you have to do this yourself:
Pdb(completekey='tab', stdin=None, stdout=None, skip=None)~
Pdb is the debugger class.
The {completekey}, {stdin} and {stdout} arguments are passed to the
underlying cmd.Cmd class; see the description there.
The {skip} argument, if given, must be an iterable of glob-style module name
patterns. The debugger will not step into frames that originate in a module
that matches one of these patterns. [1]_
Example call to enable tracing with {skip}:: >
import pdb; pdb.Pdb(skip=['django.*']).set_trace()
<
.. versionadded:: 2.7
The {skip} argument.
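To get a feel for which module names a {skip} pattern covers, the following
sketch applies glob matching with the fnmatch module. It assumes the debugger
compares the module's ``__name__`` against each pattern in this way; the
helper name is illustrative.

```python
import fnmatch


def is_skipped_module(module_name, patterns):
    """True if `module_name` matches any glob-style pattern in `patterns`."""
    return any(fnmatch.fnmatch(module_name, pattern) for pattern in patterns)


in_package = is_skipped_module('django.db.models', ['django.*'])
top_level = is_skipped_module('django', ['django.*'])
```

Under this assumption the pattern ``django.*`` covers submodules but not the
top-level ``django`` package itself; listing both ``'django'`` and
``'django.*'`` would cover the whole package.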
run(statement[, globals[, locals]])~
runeval(expression[, globals[, locals]])~
runcall(function[, argument, ...])~
set_trace()~
See the documentation for the functions explained above.
Debugger Commands
=================
The debugger recognizes the following commands. Most commands can be
abbreviated to one or two letters; e.g. ``h(elp)`` means that either ``h`` or
``help`` can be used to enter the help command (but not ``he`` or ``hel``, nor
``H`` or ``Help`` or ``HELP``). Arguments to commands must be separated by
whitespace (spaces or tabs). Optional arguments are enclosed in square brackets
(``[]``) in the command syntax; the square brackets must not be typed.
Alternatives in the command syntax are separated by a vertical bar (``|``).
Entering a blank line repeats the last command entered. Exception: if the last
command was a ``list`` command, the next 11 lines are listed.
Commands that the debugger doesn't recognize are assumed to be Python statements
and are executed in the context of the program being debugged. Python
statements can also be prefixed with an exclamation point (``!``). This is a
powerful way to inspect the program being debugged; it is even possible to
change a variable or call a function. When an exception occurs in such a
statement, the exception name is printed but the debugger's state is not
changed.
Multiple commands may be entered on a single line, separated by ``;;``. (A
single ``;`` is not used as it is the separator for multiple commands in a line
that is passed to the Python parser.) No intelligence is applied to separating
the commands; the input is split at the first ``;;`` pair, even if it is in the
middle of a quoted string.
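The consequence of this naive splitting is easy to reproduce. The function
below is a simplified model of the rule just described, not the debugger's own
code; its name is illustrative.

```python
def split_commands(line):
    """Split `line` at the first ';;' only, ignoring any quoting."""
    marker = line.find(';;')
    if marker < 0:
        return line, ''
    return line[:marker].rstrip(), line[marker + 2:].lstrip()


normal = split_commands('step ;; print x')
broken = split_commands('print "a;;b"')
```

Here `normal` is the intended pair of commands, while `broken` shows a string
literal cut in half, so a ``;;`` inside quotes has to be avoided.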
The debugger supports aliases. Aliases can take parameters, which gives them a
certain level of adaptability to the context under examination.
If a file .pdbrc exists in the user's home directory or in the current
directory, it is read in and executed as if it had been typed at the debugger
prompt. This is particularly useful for aliases. If both files exist, the one
in the home directory is read first and aliases defined there can be overridden
by the local file.
h(elp) [{command}]
Without argument, print the list of available commands. With a {command} as
argument, print help about that command. ``help pdb`` displays the full
documentation file; if the environment variable PAGER is defined, the
file is piped through that command instead. Since the {command} argument must
be an identifier, ``help exec`` must be entered to get help on the ``!``
command.
w(here)
Print a stack trace, with the most recent frame at the bottom. An arrow
indicates the current frame, which determines the context of most commands.
d(own)
Move the current frame one level down in the stack trace (to a newer frame).
u(p)
Move the current frame one level up in the stack trace (to an older frame).
b(reak) [[{filename}:]{lineno} | {function}[, {condition}]]
With a {lineno} argument, set a break there in the current file. With a
{function} argument, set a break at the first executable statement within that
function. The line number may be prefixed with a filename and a colon, to
specify a breakpoint in another file (probably one that hasn't been loaded yet).
The file is searched on ``sys.path``. Note that each breakpoint is assigned a
number to which all the other breakpoint commands refer.
If a second argument is present, it is an expression which must evaluate to true
before the breakpoint is honored.
Without argument, list all breaks, including for each breakpoint, the number of
times that breakpoint has been hit, the current ignore count, and the associated
condition if any.
tbreak [[{filename}:]{lineno} | {function}[, {condition}]]
Temporary breakpoint, which is removed automatically when it is first hit. The
arguments are the same as break.
cl(ear) [{bpnumber} [{bpnumber ...}]]
With a space separated list of breakpoint numbers, clear those breakpoints.
Without argument, clear all breaks (but first ask confirmation).
disable [{bpnumber} [{bpnumber ...}]]
Disables the breakpoints given as a space separated list of breakpoint numbers.
Disabling a breakpoint means it cannot cause the program to stop execution, but
unlike clearing a breakpoint, it remains in the list of breakpoints and can be
(re-)enabled.
enable [{bpnumber} [{bpnumber ...}]]
Enables the breakpoints specified.
ignore {bpnumber} [{count}]
Sets the ignore count for the given breakpoint number. If count is omitted, the
ignore count is set to 0. A breakpoint becomes active when the ignore count is
zero. When non-zero, the count is decremented each time the breakpoint is
reached and the breakpoint is not disabled and any associated condition
evaluates to true.
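The interplay between the enabled flag, the condition, and the ignore count
can be modelled in a few lines. The class below is a toy model written for
this explanation; it is not the debugger's breakpoint class and all names are
hypothetical.

```python
class BreakpointModel(object):
    """Toy model of the enable/condition/ignore rules described above."""

    def __init__(self, ignore=0, enabled=True):
        self.ignore = ignore
        self.enabled = enabled

    def reached(self, condition_is_true=True):
        """Return True if execution should stop at this breakpoint."""
        if not self.enabled or not condition_is_true:
            return False
        if self.ignore > 0:
            # Counted down only when the breakpoint would otherwise stop.
            self.ignore -= 1
            return False
        return True


bp = BreakpointModel(ignore=2)
stops = [bp.reached() for _ in range(3)]
```

With an ignore count of 2, the first two visits are skipped and the third one
stops; visits where the condition is false leave the count untouched.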
condition {bpnumber} [{condition}]
Condition is an expression which must evaluate to true before the breakpoint is
honored. If condition is absent, any existing condition is removed; i.e., the
breakpoint is made unconditional.
commands [{bpnumber}]
Specify a list of commands for breakpoint number {bpnumber}. The commands
themselves appear on the following lines. Type a line containing just 'end' to
terminate the commands. An example:: >
(Pdb) commands 1
(com) print some_variable
(com) end
(Pdb)
<
To remove all commands from a breakpoint, type commands and follow it
immediately with end; that is, give no commands.
With no {bpnumber} argument, commands refers to the last breakpoint set.
You can use breakpoint commands to start your program up again. Simply use the
continue command, or step, or any other command that resumes execution.
Specifying any command resuming execution (currently continue, step, next,
return, jump, quit and their abbreviations) terminates the command list (as if
that command was immediately followed by end). This is because any time you
resume execution (even with a simple next or step), you may encounter another
breakpoint--which could have its own command list, leading to ambiguities about
which list to execute.
If you use the 'silent' command in the command list, the usual message about
stopping at a breakpoint is not printed. This may be desirable for breakpoints
that are to print a specific message and then continue. If none of the other
commands print anything, you see no sign that the breakpoint was reached.
.. versionadded:: 2.5
s(tep)
Execute the current line, stop at the first possible occasion (either in a
function that is called or on the next line in the current function).
n(ext)
Continue execution until the next line in the current function is reached or it
returns. (The difference between ``next`` and ``step`` is that ``step`` stops
inside a called function, while ``next`` executes called functions at (nearly)
full speed, only stopping at the next line in the current function.)
unt(il)
Continue execution until a line with a number greater than the current one is
reached, or until the current frame returns.
.. versionadded:: 2.6
r(eturn)
Continue execution until the current function returns.
c(ont(inue))
Continue execution, only stop when a breakpoint is encountered.
j(ump) {lineno}
Set the next line that will be executed. Only available in the bottom-most
frame. This lets you jump back and execute code again, or jump forward to skip
code that you don't want to run.
It should be noted that not all jumps are allowed --- for instance it is not
possible to jump into the middle of a for loop or out of a
finally clause.
l(ist) [{first}[, {last}]]
List source code for the current file. Without arguments, list 11 lines around
the current line or continue the previous listing. With one argument, list 11
lines around that line. With two arguments, list the given range; if the
second argument is less than the first, it is interpreted as a count.
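The two-argument rule can be written out as a small function. This is a
sketch that assumes the count is added to the first line number; the function
name is illustrative.

```python
def listing_range(first, last):
    """Return the (first, last) source lines shown by ``list first, last``."""
    if last < first:
        # The second argument is treated as a count from the first line.
        last = first + last
    return first, last


explicit = listing_range(10, 20)
counted = listing_range(100, 5)
```

Under that assumption ``list 100, 5`` shows lines 100 through 105 rather than
producing an empty listing.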
a(rgs)
Print the argument list of the current function.
p {expression}
Evaluate the {expression} in the current context and print its value.
.. note:: >
``print`` can also be used, but is not a debugger command --- this executes the
Python print statement.
<
pp {expression}
Like the ``p`` command, except the value of the expression is pretty-printed
using the pprint (|py2stdlib-pprint|) module.
alias [{name} [{command}]]
Creates an alias called {name} that executes {command}. The command must {not}
be enclosed in quotes. Replaceable parameters can be indicated by ``%1``,
``%2``, and so on, while ``%*`` is replaced by all the parameters. If no
command is given, the current alias for {name} is shown. If no arguments are
given, all aliases are listed.
Aliases may be nested and can contain anything that can be legally typed at the
pdb prompt. Note that internal pdb commands {can} be overridden by aliases.
Such a command is then hidden until the alias is removed. Aliasing is
recursively applied to the first word of the command line; all other words in
the line are left alone.
As an example, here are two useful aliases (especially when placed in the
.pdbrc file):: >
#Print instance variables (usage "pi classInst")
alias pi for k in %1.__dict__.keys(): print "%1.",k,"=",%1.__dict__[k]
#Print instance variables in self
alias ps pi self
<
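The parameter substitution can be pictured as plain text replacement. The
function below is a simplified sketch of that idea, not the debugger's own
implementation, and its name is hypothetical.

```python
def expand_alias(template, args):
    """Substitute %1, %2, ... and %* in an alias body."""
    expanded = template
    for position, value in enumerate(args, 1):
        expanded = expanded.replace('%' + str(position), value)
    return expanded.replace('%*', ' '.join(args))


body = 'for k in %1.__dict__.keys(): print "%1.",k,"=",%1.__dict__[k]'
expanded = expand_alias(body, ['obj'])
```

Every occurrence of ``%1`` is replaced, including the one inside the string
literal, which is what makes the ``pi`` alias above print the instance name.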
unalias {name}
Deletes the specified alias.
[!]{statement}
Execute the (one-line) {statement} in the context of the current stack frame.
The exclamation point can be omitted unless the first word of the statement
resembles a debugger command. To set a global variable, you can prefix the
assignment command with a ``global`` command on the same line, e.g.:: >
(Pdb) global list_options; list_options = ['-l']
(Pdb)
<
run [{args} ...]
Restart the debugged Python program. If an argument is supplied, it is split
with "shlex" and the result is used as the new sys.argv. History, breakpoints,
actions and debugger options are preserved. "restart" is an alias for "run".
.. versionadded:: 2.6
q(uit)
Quit from the debugger. The program being executed is aborted.
.. rubric:: Footnotes
.. [1] Whether a frame is considered to originate in a certain module
is determined by the ``__name__`` in the frame globals.
==============================================================================
*py2stdlib-pickle*
pickle~
:synopsis: Convert Python objects to streams of bytes and back.
The pickle (|py2stdlib-pickle|) module implements a fundamental, but powerful algorithm for
serializing and de-serializing a Python object structure. "Pickling" is the
process whereby a Python object hierarchy is converted into a byte stream, and
"unpickling" is the inverse operation, whereby a byte stream is converted back
into an object hierarchy. Pickling (and unpickling) is alternatively known as
"serialization", "marshalling," [#]_ or "flattening"; however, to avoid
confusion, the terms used here are "pickling" and "unpickling".
This documentation describes both the pickle (|py2stdlib-pickle|) module and the
cPickle (|py2stdlib-cpickle|) module.
Relationship to other Python modules
------------------------------------
The pickle (|py2stdlib-pickle|) module has an optimized cousin called the cPickle (|py2stdlib-cpickle|)
module. As its name implies, cPickle (|py2stdlib-cpickle|) is written in C, so it can be up to
1000 times faster than pickle (|py2stdlib-pickle|). However it does not support subclassing
of the Pickler and Unpickler classes, because in cPickle (|py2stdlib-cpickle|)
these are functions, not classes. Most applications have no need for this
functionality, and can benefit from the improved performance of cPickle (|py2stdlib-cpickle|).
Other than that, the interfaces of the two modules are nearly identical; the
common interface is described in this manual and differences are pointed out
where necessary. In the following discussions, we use the term "pickle" to
collectively describe the pickle (|py2stdlib-pickle|) and cPickle (|py2stdlib-cpickle|) modules.
The data streams the two modules produce are guaranteed to be interchangeable.
Python has a more primitive serialization module called marshal (|py2stdlib-marshal|), but in
general pickle (|py2stdlib-pickle|) should always be the preferred way to serialize Python
objects. marshal (|py2stdlib-marshal|) exists primarily to support Python's .pyc
files.
The pickle (|py2stdlib-pickle|) module differs from marshal (|py2stdlib-marshal|) in several significant ways:
* The pickle (|py2stdlib-pickle|) module keeps track of the objects it has already serialized,
so that later references to the same object won't be serialized again.
marshal (|py2stdlib-marshal|) doesn't do this.
This has implications both for recursive objects and object sharing. Recursive
objects are objects that contain references to themselves. These are not
handled by marshal, and in fact, attempting to marshal recursive objects will
crash your Python interpreter. Object sharing happens when there are multiple
references to the same object in different places in the object hierarchy being
serialized. pickle (|py2stdlib-pickle|) stores such objects only once, and ensures that all
other references point to the master copy. Shared objects remain shared, which
can be very important for mutable objects.
* marshal (|py2stdlib-marshal|) cannot be used to serialize user-defined classes and their
instances. pickle (|py2stdlib-pickle|) can save and restore class instances transparently;
however, the class definition must be importable and live in the same module as
when the object was stored.
* The marshal (|py2stdlib-marshal|) serialization format is not guaranteed to be portable
across Python versions. Because its primary job in life is to support
.pyc files, the Python implementers reserve the right to change the
serialization format in non-backwards compatible ways should the need arise.
The pickle (|py2stdlib-pickle|) serialization format is guaranteed to be backwards compatible
across Python releases.
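As a minimal sketch of the object-sharing and recursion points above, the
snippet below pickles a container that holds two references to one list, and a
list that contains itself. The variable names are illustrative only; the code
is written so that it should run unchanged on Python 2.7 and Python 3.

```python
import pickle

# Two dictionary values refer to the same list object.
shared = [1, 2, 3]
container = {'first': shared, 'second': shared}
restored = pickle.loads(pickle.dumps(container))

# The list was stored once, so both values are still one object.
still_shared = restored['first'] is restored['second']

# A list that contains a reference to itself.
recursive = [1, 2]
recursive.append(recursive)
recursive_copy = pickle.loads(pickle.dumps(recursive))

# The restored list refers back to itself, not to the original list.
points_to_itself = recursive_copy[2] is recursive_copy
```

Because sharing is preserved, appending to ``restored['first']`` is also
visible through ``restored['second']``.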
.. warning::
The pickle (|py2stdlib-pickle|) module is not intended to be secure against erroneous or
maliciously constructed data. Never unpickle data received from an untrusted
or unauthenticated source.
Note that serialization is a more primitive notion than persistence; although
pickle (|py2stdlib-pickle|) reads and writes file objects, it does not handle the issue of
naming persistent objects, nor the (even more complicated) issue of concurrent
access to persistent objects. The pickle (|py2stdlib-pickle|) module can transform a complex
object into a byte stream and it can transform the byte stream into an object
with the same internal structure. Perhaps the most obvious thing to do with
these byte streams is to write them onto a file, but it is also conceivable to
send them across a network or store them in a database. The module
shelve (|py2stdlib-shelve|) provides a simple interface to pickle and unpickle objects on
DBM-style database files.
Data stream format
------------------
The data format used by pickle (|py2stdlib-pickle|) is Python-specific. This has the
advantage that there are no restrictions imposed by external standards such as
XDR (which can't represent pointer sharing); however it means that non-Python
programs may not be able to reconstruct pickled Python objects.
By default, the pickle (|py2stdlib-pickle|) data format uses a printable ASCII representation.
This is slightly more voluminous than a binary representation. The big
advantage of using printable ASCII (and of some other characteristics of
pickle (|py2stdlib-pickle|)'s representation) is that for debugging or recovery purposes it is
possible for a human to read the pickled file with a standard text editor.
There are currently 3 different protocols which can be used for pickling.
* Protocol version 0 is the original ASCII protocol and is backwards compatible
with earlier versions of Python.
* Protocol version 1 is the old binary format which is also compatible with
earlier versions of Python.
* Protocol version 2 was introduced in Python 2.3. It provides much more
efficient pickling of new-style classes.
Refer to PEP 307 for more information.
If a {protocol} is not specified, protocol 0 is used. If {protocol} is specified
as a negative value or HIGHEST_PROTOCOL, the highest protocol version
available will be used.
.. versionchanged:: 2.3
Introduced the {protocol} parameter.
A binary format, which is slightly more efficient, can be chosen by specifying a
{protocol} version >= 1.
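A short sketch of selecting a protocol follows. The sample data is made up, and
the sizes are printed rather than stated here because they depend on the Python
version in use.

```python
import pickle

data = {'numbers': list(range(10)), 'name': 'example'}

# Protocol 0: the original ASCII format.
ascii_pickle = pickle.dumps(data, 0)
# Protocol 2: the newer binary format.
binary_pickle = pickle.dumps(data, 2)
# A negative value selects the highest protocol available.
highest_pickle = pickle.dumps(data, -1)

# Compare the encoded sizes of the three streams.
sizes = (len(ascii_pickle), len(binary_pickle), len(highest_pickle))
print(sizes)

# loads() works out the format on its own; no protocol is passed in.
results = [pickle.loads(p)
           for p in (ascii_pickle, binary_pickle, highest_pickle)]
```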
Usage
-----
To serialize an object hierarchy, you first create a pickler, then you call the
pickler's dump method. To de-serialize a data stream, you first create
an unpickler, then you call the unpickler's load method. The
pickle (|py2stdlib-pickle|) module provides the following constant:
HIGHEST_PROTOCOL~
The highest protocol version available. This value can be passed as a
{protocol} value.
.. versionadded:: 2.3
.. note::
Be sure to always open pickle files created with protocols >= 1 in binary mode.
For the old ASCII-based pickle protocol 0 you can use either text mode or binary
mode as long as you stay consistent.
A pickle file written with protocol 0 in binary mode will contain lone linefeeds
as line terminators and therefore will look "funny" when viewed in Notepad or
other editors which do not support this format.
The pickle (|py2stdlib-pickle|) module provides the following functions to make the pickling
process more convenient:
dump(obj, file[, protocol])~
Write a pickled representation of {obj} to the open file object {file}. This is
equivalent to ``Pickler(file, protocol).dump(obj)``.
If the {protocol} parameter is omitted, protocol 0 is used. If {protocol} is
specified as a negative value or HIGHEST_PROTOCOL, the highest protocol
version will be used.
.. versionchanged:: 2.3
Introduced the {protocol} parameter.
{file} must have a write method that accepts a single string argument.
It can thus be a file object opened for writing, a StringIO (|py2stdlib-stringio|) object, or
any other custom object that meets this interface.
load(file)~
Read a string from the open file object {file} and interpret it as a pickle data
stream, reconstructing and returning the original object hierarchy. This is
equivalent to ``Unpickler(file).load()``.
{file} must have two methods, a read method that takes an integer
argument, and a readline (|py2stdlib-readline|) method that requires no arguments. Both
methods should return a string. Thus {file} can be a file object opened for
reading, a StringIO (|py2stdlib-stringio|) object, or any other custom object that meets this
interface.
This function automatically determines whether the data stream was written in
binary mode or not.
dumps(obj[, protocol])~
Return the pickled representation of the object as a string, instead of writing
it to a file.
If the {protocol} parameter is omitted, protocol 0 is used. If {protocol} is
specified as a negative value or HIGHEST_PROTOCOL, the highest protocol
version will be used.
.. versionchanged:: 2.3
The {protocol} parameter was added.
loads(string)~
Read a pickled object hierarchy from a string. Characters in the string past
the pickled object's representation are ignored.
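The four convenience functions can be sketched together as follows. This is
only an illustration: ``record`` is invented sample data, and ``io.BytesIO`` is
used here as a stand-in for a file opened in binary mode.

```python
import io
import pickle

record = {'id': 7, 'tags': ['a', 'b']}

# dumps()/loads(): pickle to and from a string held in memory.
as_string = pickle.dumps(record, 2)
from_string = pickle.loads(as_string)

# Data after the end of the pickle is ignored by loads().
with_trailer = pickle.loads(as_string + b'trailing data')

# dump()/load(): the same round trip through a file-like object.
buf = io.BytesIO()
pickle.dump(record, buf, 2)
buf.seek(0)
from_file = pickle.load(buf)
```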
The pickle (|py2stdlib-pickle|) module also defines three exceptions:
PickleError~
A common base class for the other exceptions defined below. This inherits from
Exception.
PicklingError~
This exception is raised when an unpicklable object is passed to the
dump method.
UnpicklingError~
This exception is raised when there is a problem unpickling an object. Note that
other exceptions may also be raised during unpickling, including (but not
necessarily limited to) AttributeError, EOFError,
ImportError, and IndexError.
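A hedged sketch of handling these exceptions is shown below. ``safe_loads`` is
a hypothetical helper, not part of the module, and the tuple of exceptions it
catches simply mirrors the list above; real code may need to catch others.

```python
import pickle

# PicklingError and UnpicklingError share the PickleError base class.
pickling_is_pickle_error = issubclass(pickle.PicklingError,
                                      pickle.PickleError)
unpickling_is_pickle_error = issubclass(pickle.UnpicklingError,
                                        pickle.PickleError)

def safe_loads(data):
    """Hypothetical helper: return (object, None) or (None, error)."""
    try:
        return pickle.loads(data), None
    except (pickle.UnpicklingError, AttributeError, EOFError,
            ImportError, IndexError) as exc:
        return None, exc

good, good_error = safe_loads(pickle.dumps([1, 2, 3]))
# An empty string is a truncated stream; loading it raises EOFError.
bad, bad_error = safe_loads(b'')
```

Catching these exceptions guards against damaged data only; it does not make
unpickling untrusted data safe.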
The pickle (|py2stdlib-pickle|) module also exports two callables [#]_, Pickler and
Unpickler:
Pickler(file[, protocol])~
This takes a file-like object to which it will write a pickle data stream.
If the {protocol} parameter is omitted, protocol 0 is used. If {protocol} is
specified as a negative value or HIGHEST_PROTOCOL, the highest
protocol version will be used.
.. versionchanged:: 2.3
Introduced the {protocol} parameter.
{file} must have a write method that accepts a single string argument.
It can thus be an open file object, a StringIO (|py2stdlib-stringio|) object, or any other
custom object that meets this interface.
Pickler objects define one (or two) public methods:
dump(obj)~
Write a pickled representation of {obj} to the open file object given in the
constructor. Either the binary or ASCII format will be used, depending on the
value of the {protocol} argument passed to the constructor.
clear_memo()~
Clears the pickler's "memo". The memo is the data structure that remembers
which objects the pickler has already seen, so that shared or recursive objects
are pickled by reference and not by value. This method is useful when re-using
picklers.
.. note:: >
Prior to Python 2.3, clear_memo was only available on the picklers
created by cPickle (|py2stdlib-cpickle|). In the pickle (|py2stdlib-pickle|) module, picklers have an
instance variable called memo which is a Python dictionary. So to clear
the memo for a pickle (|py2stdlib-pickle|) module pickler, you could do the following::
mypickler.memo.clear()
Code that does not need to support older versions of Python should simply use
clear_memo.
<
It is possible to make multiple calls to the dump method of the same
Pickler instance. These must then be matched to the same number of
calls to the load method of the corresponding Unpickler
instance. If the same object is pickled by multiple dump calls, the
load calls will all yield references to the same object. [#]_
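The pairing of dump and load calls can be sketched like this. The names are
illustrative, and ``io.BytesIO`` stands in for a file opened in binary mode.

```python
import io
import pickle

buf = io.BytesIO()
pickler = pickle.Pickler(buf, 2)

shared = ['x', 'y']
pickler.dump(shared)                 # first dump() call
pickler.dump({'again': shared})      # second dump() finds it in the memo

buf.seek(0)
unpickler = pickle.Unpickler(buf)
first = unpickler.load()             # matches the first dump()
second = unpickler.load()            # matches the second dump()

# Both loads refer to one restored list, not to two copies.
same_object = second['again'] is first
```

Calling ``pickler.clear_memo()`` between the two dump calls would instead make
the second pickle carry its own copy of the list.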
Unpickler objects are defined as:
Unpickler(file)~
This takes a file-like object from which it will read a pickle data stream.
This class automatically determines whether the data stream was written in
binary mode or not, so it does not need a flag as in the Pickler
factory.
{file} must have two methods, a read method that takes an integer
argument, and a readline (|py2stdlib-readline|) method that requires no arguments. Both
methods should return a string. Thus {file} can be a file object opened for
reading, a StringIO (|py2stdlib-stringio|) object, or any other custom object that meets this
interface.
Unpickler objects have one (or two) public methods:
load()~
Read a pickled object representation from the open file object given in
the constructor, and return the reconstituted object hierarchy specified
therein.
This method automatically determines whether the data stream was written
in binary mode or not.
noload()~
This is just like load except that it doesn't actually create any
objects. This is useful primarily for finding what's called "persistent
ids" that may be referenced in a pickle data stream. See the section "The
pickle protocol" below for more details.
{Note:} the noload method is currently only available on
Unpickler objects created with the cPickle (|py2stdlib-cpickle|) module.
pickle (|py2stdlib-pickle|) module Unpicklers do not have the noload
method.
What can be pickled and unpickled?
----------------------------------
The following types can be pickled:
* ``None``, ``True``, and ``False``
* integers, long integers, floating point numbers, complex numbers
* normal and Unicode strings
* tuples, lists, sets, and dictionaries containing only picklable objects
* functions defined at the top level of a module
* built-in functions defined at the top level of a module
* classes that are defined at the top level of a module
* instances of such classes whose __dict__ or __setstate__ is
picklable (see the section "The pickle protocol" for details)
Attempts to pickle unpicklable objects will raise the PicklingError
exception; when this happens, an unspecified number of bytes may have already
been written to the underlying file. Trying to pickle a highly recursive data
structure may exceed the maximum recursion depth; a RuntimeError will be
raised in this case. You can carefully raise this limit with
sys.setrecursionlimit.
Note that functions (built-in and user-defined) are pickled by "fully qualified"
name reference, not by value. This means that only the function name is
pickled, along with the name of the module the function is defined in. Neither the
function's code, nor any of its function attributes are pickled. Thus the
defining module must be importable in the unpickling environment, and the module
must contain the named object, otherwise an exception will be raised. [#]_
Similarly, classes are pickled by named reference, so the same restrictions in
the unpickling environment apply. Note that none of the class's code or data is
pickled, so in the following example the class attribute ``attr`` is not
restored in the unpickling environment:: >
   class Foo:
       attr = 'a class attr'
   picklestring = pickle.dumps(Foo)
<
These restrictions are why picklable functions and classes must be defined in
the top level of a module.
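To make "pickled by name reference" concrete, the sketch below pickles a
function from the standard math module and inspects the stream. The choice of
``math.sqrt`` is arbitrary.

```python
import math
import pickle

# Protocol 0 keeps the stream readable for inspection.
stream = pickle.dumps(math.sqrt, 0)

# The stream holds the module and function names, not the code.
mentions_module = b'math' in stream
mentions_name = b'sqrt' in stream

# Unpickling imports the module and looks the name up again.
restored = pickle.loads(stream)
same_function = restored is math.sqrt
```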
Similarly, when class instances are pickled, their class's code and data are not
pickled along with them. Only the instance data are pickled. This is done on
purpose, so you can fix bugs in a class or add methods to the class and still
load objects that were created with an earlier version of the class. If you
plan to have long-lived objects that will see many versions of a class, it may
be worthwhile to put a version number in the objects so that suitable
conversions can be made by the class's __setstate__ method.
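One possible way to carry a version number is sketched below. The ``Point``
class, its ``VERSION`` attribute, and the older ``coords`` layout are all
invented for illustration.

```python
import pickle

class Point(object):
    VERSION = 2          # hypothetical current layout: separate x and y

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __getstate__(self):
        # Store the layout version next to the instance data.
        state = self.__dict__.copy()
        state['version'] = self.VERSION
        return state

    def __setstate__(self, state):
        # State saved without a version is treated as layout 1.
        version = state.pop('version', 1)
        if version < 2:
            # Hypothetical layout 1 kept both values in a 'coords' tuple.
            state['x'], state['y'] = state.pop('coords')
        self.__dict__.update(state)

current = pickle.loads(pickle.dumps(Point(1, 2), 2))

# Simulate state written by the older layout.
legacy = Point.__new__(Point)
legacy.__setstate__({'coords': (3, 4)})
```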
The pickle protocol
-------------------
This section describes the "pickling protocol" that defines the interface
between the pickler/unpickler and the objects that are being serialized. This
protocol provides a standard way for you to define, customize, and control how
your objects are serialized and de-serialized. The description in this section
doesn't cover specific customizations that you can employ to make the unpickling
environment slightly safer from untrusted pickle data streams; see the section
"Subclassing Unpicklers" for more details.
Pickling and unpickling normal class instances
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
object.__getinitargs__()~
When a pickled class instance is unpickled, its __init__ method is
normally {not} invoked. If it is desirable that the __init__ method
be called on unpickling, an old-style class can define a method
__getinitargs__, which should return a {tuple} containing the
arguments to be passed to the class constructor (__init__ for
example). The __getinitargs__ method is called at pickle time; the
tuple it returns is incorporated in the pickle for the instance.
object.__getnewargs__()~
New-style types can provide a __getnewargs__ method that is used for
protocol 2. Implementing this method is needed if the type establishes some
internal invariants when the instance is created, or if the memory allocation
is affected by the values passed to the __new__ method for the type
(as it is for tuples and strings). Instances of a new-style class
``C`` are created using :: >
obj = C.__new__(C, *args)
<
where {args} is the result of calling __getnewargs__ on the original
object; if there is no __getnewargs__, an empty tuple is assumed.
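A minimal sketch of __getnewargs__ follows, using a hypothetical float subclass
whose __new__ method requires arguments. Without the method, unpickling would
call ``Celsius.__new__(Celsius)`` with no arguments and fail.

```python
import pickle

class Celsius(float):
    """Hypothetical float subclass that needs arguments to __new__."""

    def __new__(cls, degrees, label):
        self = super(Celsius, cls).__new__(cls, degrees)
        self.label = label
        return self

    def __getnewargs__(self):
        # Passed to Celsius.__new__(Celsius, *args) when unpickling.
        return (float(self), self.label)

original = Celsius(21.5, 'room')
restored = pickle.loads(pickle.dumps(original, 2))
```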
object.__getstate__()~
Classes can further influence how their instances are pickled; if the class
defines the method __getstate__, it is called and the return state is
pickled as the contents for the instance, instead of the contents of the
instance's dictionary. If there is no __getstate__ method, the
instance's __dict__ is pickled.
object.__setstate__()~
Upon unpickling, if the class also defines the method __setstate__,
it is called with the unpickled state. [#]_ If there is no
__setstate__ method, the pickled state must be a dictionary and its
items are assigned to the new instance's dictionary. If a class defines both
__getstate__ and __setstate__, the state object needn't be a
dictionary and these methods can do what they want. [#]_
.. note:: >
For new-style classes, if __getstate__ returns a false
value, the __setstate__ method will not be called.
<
.. note::
At unpickling time, some methods like __getattr__,
__getattribute__, or __setattr__ may be called upon the
instance. In case those methods rely on some internal invariant being
true, the type should implement either __getinitargs__ or
__getnewargs__ to establish such an invariant; otherwise, neither
__new__ nor __init__ will be called.
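As a small sketch of __getstate__ and __setstate__ working together, the
hypothetical ``Counter`` class below holds a lock, which cannot be pickled. The
lock is left out of the saved state and a fresh one is created on unpickling.

```python
import pickle
import threading

class Counter(object):
    def __init__(self):
        self.count = 0
        self.lock = threading.Lock()     # not picklable

    def __getstate__(self):
        # Copy the instance dictionary and drop the lock from the copy.
        state = self.__dict__.copy()
        del state['lock']
        return state

    def __setstate__(self, state):
        # Restore the saved attributes, then create a new lock.
        self.__dict__.update(state)
        self.lock = threading.Lock()

original = Counter()
original.count = 3
restored = pickle.loads(pickle.dumps(original, 2))
```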
Pickling and unpickling extension types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
object.__reduce__()~
When the Pickler encounters an object of a type it knows nothing
about --- such as an extension type --- it looks in two places for a hint of
how to pickle it. One alternative is for the object to implement a
__reduce__ method. If provided, at pickling time __reduce__
will be called with no arguments, and it must return either a string or a
tuple.
If a string is returned, it names a global variable whose contents are
pickled as normal. The string returned by __reduce__ should be the
object's local name relative to its module; the pickle module searches the
module namespace to determine the object's module.
When a tuple is returned, it must be between two and five elements long.
Optional elements can either be omitted, or ``None`` can be provided as their
value. The contents of this tuple are pickled as normal and used to
reconstruct the object at unpickling time. The semantics of each element
are:
* A callable object that will be called to create the initial version of the
object. The next element of the tuple will provide arguments for this
callable, and later elements provide additional state information that will
subsequently be used to fully reconstruct the pickled data.
In the unpickling environment this object must be either a class, a
callable registered as a "safe constructor" (see below), or it must have an
attribute __safe_for_unpickling__ with a true value. Otherwise, an
UnpicklingError will be raised in the unpickling environment. Note
that as usual, the callable itself is pickled by name.
* A tuple of arguments for the callable object.
.. versionchanged:: 2.5
Formerly, this argument could also be ``None``.
* Optionally, the object's state, which will be passed to the object's
__setstate__ method as described in the section "Pickling and unpickling
normal class instances". If the object has no __setstate__ method, then, as
above, the value must be a dictionary and it will be added to the object's
__dict__.
* Optionally, an iterator (and not a sequence) yielding successive list
items. These list items will be pickled, and appended to the object using
either ``obj.append(item)`` or ``obj.extend(list_of_items)``. This is
primarily used for list subclasses, but may be used by other classes as
long as they have append and extend methods with the
appropriate signature. (Whether append or extend is used
depends on which pickle protocol version is used as well as the number of
items to append, so both must be supported.)
* Optionally, an iterator (not a sequence) yielding successive dictionary
items, which should be tuples of the form ``(key, value)``. These items
will be pickled and stored to the object using ``obj[key] = value``. This
is primarily used for dictionary subclasses, but may be used by other
classes as long as they implement __setitem__.
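The first three tuple elements can be sketched as follows. ``Interval`` is a
made-up class; it defines no __setstate__ method, so the state dictionary is
added to the new object's __dict__.

```python
import pickle

class Interval(object):
    def __init__(self, low, high):
        self.low = low
        self.high = high
        self.note = None

    def __reduce__(self):
        # (callable, argument tuple, optional state)
        return (Interval, (self.low, self.high), {'note': self.note})

original = Interval(1, 5)
original.note = 'closed'
restored = pickle.loads(pickle.dumps(original, 2))
```

On unpickling, ``Interval(1, 5)`` is called first and the state is applied
afterwards, so ``note`` ends up with the pickled value rather than ``None``.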
object.__reduce_ex__(protocol)~
It is sometimes useful to know the protocol version when implementing
__reduce__. This can be done by implementing a method named
__reduce_ex__ instead of __reduce__. __reduce_ex__,
when it exists, is called in preference over __reduce__ (you may
still provide __reduce__ for backwards compatibility). The
__reduce_ex__ method will be called with a single integer argument,
the protocol version.
The object class implements both __reduce__ and
__reduce_ex__; however, if a subclass overrides __reduce__
but not __reduce_ex__, the __reduce_ex__ implementation
detects this and calls __reduce__.
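To illustrate that __reduce_ex__ receives the protocol version, the
hypothetical class below records the number it was given in the pickle.

```python
import pickle

class Tagged(object):
    def __init__(self, value, pickled_with=None):
        self.value = value
        self.pickled_with = pickled_with

    def __reduce_ex__(self, protocol):
        # Pass the protocol number on to the constructor call.
        return (Tagged, (self.value, protocol))

from_ascii = pickle.loads(pickle.dumps(Tagged('a'), 0))
from_binary = pickle.loads(pickle.dumps(Tagged('a'), 2))
```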
An alternative to implementing a __reduce__ method on the object to be
pickled, is to register the callable with the copy_reg (|py2stdlib-copy_reg|) module. This
module provides a way for programs to register "reduction functions" and
constructors for user-defined types. Reduction functions have the same
semantics and interface as the __reduce__ method described above, except
that they are called with a single argument, the object to be pickled.
The registered constructor is deemed a "safe constructor" for purposes of
unpickling as described above.
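Registering a reduction function might look like the sketch below. The
``Temperature`` class and ``reduce_temperature`` function are invented for
illustration. The import fallback is an assumption added so that the snippet
also runs on Python 3, where the module is named copyreg.

```python
import pickle

try:
    import copy_reg                   # Python 2 name, as in this manual
except ImportError:
    import copyreg as copy_reg        # same module under its Python 3 name

class Temperature(object):
    def __init__(self, degrees):
        self.degrees = degrees

calls = []    # records each use of the reduction function

def reduce_temperature(obj):
    # Same return value as a __reduce__ method, but the object to be
    # pickled is passed in as the single argument.
    calls.append(obj.degrees)
    return (Temperature, (obj.degrees,))

copy_reg.pickle(Temperature, reduce_temperature)

restored = pickle.loads(pickle.dumps(Temperature(21.5), 2))
```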
Pickling and unpickling external objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For the benefit of object persistence, the pickle (|py2stdlib-pickle|) module supports the
notion of a reference to an object outside the pickled data stream. Such
objects are referenced by a "persistent id", which is just an arbitrary string
of printable ASCII characters. The resolution of such names is not defined by
the pickle (|py2stdlib-pickle|) module; it will delegate this resolution to user defined
functions on the pickler and unpickler. [#]_
To define external persistent id resolution, you need to set the
persistent_id attribute of the pickler object and the
persistent_load attribute of the unpickler object.
To pickle objects that have an external persistent id, the pickler must have a
custom persistent_id method that takes an object as an argument and
returns either ``None`` or the persistent id for that object. When ``None`` is
returned, the pickler simply pickles the object as normal. When a persistent id
string is returned, the pickler will pickle that string, along with a marker so
that the unpickler will recognize the string as a persistent id.
To unpickle external objects, the unpickler must have a custom
persistent_load function that takes a persistent id string and returns
the referenced object.
Here's a silly example that {might} shed more light:: >
   import pickle
   from cStringIO import StringIO
   src = StringIO()
   p = pickle.Pickler(src)
   def persistent_id(obj):
       if hasattr(obj, 'x'):
           return 'the value %d' % obj.x
       else:
           return None
   p.persistent_id = persistent_id
   class Integer:
       def __init__(self, x):
           self.x = x
       def __str__(self):
           return 'My name is integer %d' % self.x
   i = Integer(7)
   print i
   p.dump(i)
   datastream = src.getvalue()
   print repr(datastream)
   dst = StringIO(datastream)
   up = pickle.Unpickler(dst)
   class FancyInteger(Integer):
       def __str__(self):
           return 'I am the integer %d' % self.x
   def persistent_load(persid):
       if persid.startswith('the value '):
           value = int(persid.split()[2])
           return FancyInteger(value)
       else:
           raise pickle.UnpicklingError, 'Invalid persistent id'
   up.persistent_load = persistent_load
   j = up.load()
   print j
<
In the cPickle (|py2stdlib-cpickle|) module, the unpickler's persistent_load attribute
can also be set to a Python list, in which case, when the unpickler reaches a
persistent id, the persistent id string will simply be appended to this list.
This functionality exists so that a pickle data stream can be "sniffed" for
object references without actually instantiating all the objects in a pickle.
[#]_ Setting persistent_load to a list is usually used in conjunction
with the noload method on the Unpickler.
.. BAW: Both pickle and cPickle support something called inst_persistent_id()
which appears to give unknown types a second shot at producing a persistent
id. Since Jim Fulton can't remember why it was added or what it's for, I'm
leaving it undocumented.
Subclassing Unpicklers
----------------------
By default, unpickling will import any class that it finds in the pickle data.
You can control exactly what gets unpickled and what gets called by customizing
your unpickler. Unfortunately, exactly how you do this is different depending
on whether you're using pickle (|py2stdlib-pickle|) or cPickle (|py2stdlib-cpickle|). [#]_
In the pickle (|py2stdlib-pickle|) module, you need to derive a subclass from
Unpickler, overriding the load_global method.
load_global should read two lines from the pickle data stream where the
first line will be the name of the module containing the class and the second line
will be the name of the instance's class. It then looks up the class, possibly
importing the module and digging out the attribute, then it appends what it
finds to the unpickler's stack. Later on, this class will be assigned to the
__class__ attribute of an empty class, as a way of magically creating an
instance without calling its class's __init__. Your job (should you
choose to accept it) would be to have load_global push onto the
unpickler's stack a known safe version of any class you deem safe to unpickle.
It is up to you to produce such a class. Or you could raise an error if you
want to disallow all unpickling of instances. If this sounds like a hack,
you're right. Refer to the source code to make this work.
Things are a little cleaner with cPickle (|py2stdlib-cpickle|), but not by much. To control
what gets unpickled, you can set the unpickler's find_global attribute
to a function or ``None``. If it is ``None`` then any attempts to unpickle
instances will raise an UnpicklingError. If it is a function, then it
should accept a module name and a class name, and return the corresponding class
object. It is responsible for looking up the class and performing any necessary
imports, and it may raise an error to prevent instances of the class from being
unpickled.
The moral of the story is that you should be really careful about the source of
the strings your application unpickles.
Example
-------
For the simplest code, use the dump and load functions. Note
that a self-referencing list is pickled and restored correctly. :: >
import pickle
data1 = {'a': [1, 2.0, 3, 4+6j],
'b': ('string', u'Unicode string'),
'c': None}
selfref_list = [1, 2, 3]
selfref_list.append(selfref_list)
output = open('data.pkl', 'wb')
# Pickle dictionary using protocol 0.
pickle.dump(data1, output)
# Pickle the list using the highest protocol available.
pickle.dump(selfref_list, output, -1)
output.close()
<
The following example reads the resulting pickled data. When reading a
pickle-containing file, you should open the file in binary mode because you
can't be sure if the ASCII or binary format was used. :: >
import pprint, pickle
pkl_file = open('data.pkl', 'rb')
data1 = pickle.load(pkl_file)
pprint.pprint(data1)
data2 = pickle.load(pkl_file)
pprint.pprint(data2)
pkl_file.close()
<
Here's a larger example that shows how to modify pickling behavior for a class.
The TextReader class opens a text file, and returns the line number and
line contents each time its readline (|py2stdlib-readline|) method is called. If a
TextReader instance is pickled, all attributes {except} the file object
member are saved. When the instance is unpickled, the file is reopened, and
reading resumes from the last location. The __setstate__ and
__getstate__ methods are used to implement this behavior. :: >
   #!/usr/local/bin/python

   class TextReader:
       """Print and number lines in a text file."""
       def __init__(self, file):
           self.file = file
           self.fh = open(file)
           self.lineno = 0
       def readline(self):
           self.lineno = self.lineno + 1
           line = self.fh.readline()
           if not line:
               return None
           if line.endswith("\n"):
               line = line[:-1]
           return "%d: %s" % (self.lineno, line)
       def __getstate__(self):
           odict = self.__dict__.copy() # copy the dict since we change it
           del odict['fh']              # remove filehandle entry
           return odict
       def __setstate__(self, dict):
           fh = open(dict['file'])      # reopen file
           count = dict['lineno']       # read from file...
           while count:                 # until line count is restored
               fh.readline()
               count = count - 1
           self.__dict__.update(dict)   # update attributes
           self.fh = fh                 # save the file object
<
A sample usage might be something like this:: >
   >>> import TextReader
   >>> obj = TextReader.TextReader("TextReader.py")
   >>> obj.readline()
   '1: #!/usr/local/bin/python'
   >>> obj.readline()
   '2: '
   >>> obj.readline()
   '3: class TextReader:'
   >>> import pickle
   >>> pickle.dump(obj, open('save.p', 'wb'))
<
If you want to see that pickle (|py2stdlib-pickle|) works across Python processes, start
another Python session before continuing. What follows can happen from either
the same process or a new process. :: >
>>> import pickle
>>> reader = pickle.load(open('save.p', 'rb'))
>>> reader.readline()
'4: """Print and number lines in a text file."""'
<
.. seealso::
Module copy_reg (|py2stdlib-copy_reg|)
Pickle interface constructor registration for extension types.
Module shelve (|py2stdlib-shelve|)
Indexed databases of objects; uses pickle (|py2stdlib-pickle|).
Module copy (|py2stdlib-copy|)
Shallow and deep object copying.
Module marshal (|py2stdlib-marshal|)
High-performance serialization of built-in types.
==============================================================================
*py2stdlib-pickletools*
pickletools~
:synopsis: Contains extensive comments about the pickle protocols and pickle-machine
opcodes, as well as some useful functions.
.. versionadded:: 2.3
This module contains various constants relating to the intimate details of the
pickle (|py2stdlib-pickle|) module, some lengthy comments about the implementation, and a few
useful functions for analyzing pickled data. The contents of this module are
useful for Python core developers who are working on the pickle (|py2stdlib-pickle|) and
cPickle (|py2stdlib-cpickle|) implementations; ordinary users of the pickle (|py2stdlib-pickle|) module
probably won't find the pickletools (|py2stdlib-pickletools|) module relevant.
dis(pickle[, out=None, memo=None, indentlevel=4])~
Outputs a symbolic disassembly of the pickle to the file-like object {out},
defaulting to ``sys.stdout``. {pickle} can be a string or a file-like object.
{memo} can be a Python dictionary that will be used as the pickle's memo; it can
be used to perform disassemblies across multiple pickles created by the same
pickler. Successive levels, indicated by ``MARK`` opcodes in the stream, are
indented by {indentlevel} spaces.
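As a minimal sketch of the dis call described above, the following disassembles
a small pickle into an in-memory buffer rather than ``sys.stdout``. The sample
object and the choice of protocol 2 are assumptions made for illustration only.

```python
import pickle
import pickletools

try:                       # Python 2
    from StringIO import StringIO
except ImportError:        # Python 3
    from io import StringIO

# A small sample object, pickled with protocol 2.
data = pickle.dumps({"spam": [1, 2]}, protocol=2)

# Collect the symbolic disassembly in memory instead of printing it.
out = StringIO()
pickletools.dis(data, out=out, indentlevel=2)
listing = out.getvalue()
print(listing)
```

Each line of the listing shows a byte position, the opcode and its decoded
argument, if any.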
genops(pickle)~
Provides an iterator over all of the opcodes in a pickle, returning a
sequence of ``(opcode, arg, pos)`` triples. {opcode} is an instance of an
OpcodeInfo class; {arg} is the decoded value, as a Python object, of
the opcode's argument; {pos} is the position at which this opcode is located.
{pickle} can be a string or a file-like object.
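The triples produced by genops can be walked with an ordinary loop. This sketch
assumes a small tuple pickled with protocol 2 and uses the ``name`` attribute of
each opcode object to label it.

```python
import pickle
import pickletools

data = pickle.dumps((1, 2), protocol=2)

names = []
for opcode, arg, pos in pickletools.genops(data):
    # opcode describes the instruction, arg is its decoded argument (or None),
    # pos is the offset of the opcode within the pickle.
    names.append(opcode.name)
    print("%4d  %-10s %r" % (pos, opcode.name, arg))
```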
optimize(picklestring)~
Returns a new equivalent pickle string after eliminating unused ``PUT``
opcodes. The optimized pickle is shorter, takes less transmission time,
requires less storage space, and unpickles more efficiently.
.. versionadded:: 2.6
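A short sketch of optimize, assuming a list pickled with protocol 2: the
optimized string should be no longer than the original and should unpickle to
an equal object.

```python
import pickle
import pickletools

original = pickle.dumps([1, 2, 3], protocol=2)
optimized = pickletools.optimize(original)

# The two pickles differ in size but describe the same object.
print("%d bytes before, %d bytes after" % (len(original), len(optimized)))
restored = pickle.loads(optimized)
```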
==============================================================================
*py2stdlib-pipes*
pipes~
:platform: Unix
:synopsis: A Python interface to Unix shell pipelines.
The pipes (|py2stdlib-pipes|) module defines a class to abstract the concept of a {pipeline},
a sequence of converters from one file to another.
Because the module uses /bin/sh command lines, a POSIX or compatible
shell for os.system and os.popen is required.
The pipes (|py2stdlib-pipes|) module defines the following class:
Template()~
An abstraction of a pipeline.
Example:: >
>>> import pipes
>>> t=pipes.Template()
>>> t.append('tr a-z A-Z', '--')
>>> f=t.open('/tmp/1', 'w')
>>> f.write('hello world')
>>> f.close()
>>> open('/tmp/1').read()
'HELLO WORLD'
<
Template Objects
----------------
Template objects have the following methods:
Template.reset()~
Restore a pipeline template to its initial state.
Template.clone()~
Return a new, equivalent, pipeline template.
Template.debug(flag)~
If {flag} is true, turn debugging on. Otherwise, turn debugging off. When
debugging is on, commands to be executed are printed, and the shell is given
the ``set -x`` command to be more verbose.
Template.append(cmd, kind)~
Append a new action at the end. The {cmd} variable must be a valid Bourne shell
command. The {kind} variable consists of two letters.
The first letter can be either of ``'-'`` (which means the command reads its
standard input), ``'f'`` (which means the command reads a given file on the
command line) or ``'.'`` (which means the command reads no input, and hence
must be first.)
Similarly, the second letter can be either of ``'-'`` (which means the command
writes to standard output), ``'f'`` (which means the command writes a file on
the command line) or ``'.'`` (which means the command does not write anything,
and hence must be last.)
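The two-letter {kind} code can be spelled out with a small lookup. The helper
below is hypothetical and is not part of the pipes module; it only restates the
meanings listed above.

```python
# Hypothetical lookup tables; the pipes module itself exposes no such helper.
INPUT_KINDS = {
    "-": "reads standard input",
    "f": "reads a file given on the command line",
    ".": "reads no input (must be first)",
}
OUTPUT_KINDS = {
    "-": "writes to standard output",
    "f": "writes a file given on the command line",
    ".": "writes nothing (must be last)",
}


def describe_kind(kind):
    """Return (input meaning, output meaning) for a two-letter kind code."""
    if len(kind) != 2 or kind[0] not in INPUT_KINDS or kind[1] not in OUTPUT_KINDS:
        raise ValueError("kind must be two letters drawn from '-', 'f' and '.'")
    return INPUT_KINDS[kind[0]], OUTPUT_KINDS[kind[1]]


print(describe_kind("--"))
```

For example, ``'--'`` describes a filter such as ``tr a-z A-Z`` that reads
standard input and writes standard output.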
Template.prepend(cmd, kind)~
Add a new action at the beginning. See append for explanations of the
arguments.
Template.open(file, mode)~
Return a file-like object, open to {file}, but read from or written to by the
pipeline. Note that only one of ``'r'``, ``'w'`` may be given.
Template.copy(infile, outfile)~
Copy {infile} to {outfile} through the pipe.
==============================================================================
*py2stdlib-pkgutil*
pkgutil~
:synopsis: Utilities to support extension of packages.
.. versionadded:: 2.3
This module provides functions to manipulate packages:
extend_path(path, name)~
Extend the search path for the modules which comprise a package. Intended use is
to place the following code in a package's __init__.py:: >
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
<
This will add to the package's ``__path__`` all subdirectories of directories on
``sys.path`` named after the package. This is useful if one wants to distribute
different parts of a single logical package as multiple directories.
It also looks for \*.pkg files beginning where ``*`` matches the {name}
argument. This feature is similar to \*.pth files (see the site (|py2stdlib-site|)
module for more information), except that it doesn't special-case lines starting
with ``import``. A \*.pkg file is trusted at face value: apart from
checking for duplicates, all entries found in a \*.pkg file are added to
the path, regardless of whether they exist on the filesystem. (This is a
feature.)
If the input path is not a list (as is the case for frozen packages) it is
returned unchanged. The input path is not modified; an extended copy is
returned. Items are only appended to the copy at the end.
It is assumed that ``sys.path`` is a sequence. Items of ``sys.path`` that are
not (Unicode or 8-bit) strings referring to existing directories are ignored.
Unicode items on ``sys.path`` that cause errors when used as filenames may cause
this function to raise an exception (in line with os.path.isdir
behavior).
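The behaviour described above can be observed outside a real package. This
sketch assumes a throwaway directory added to ``sys.path`` and a made-up
package name; the placeholder list stands in for a package's ``__path__``.

```python
import os
import shutil
import sys
import tempfile
from pkgutil import extend_path

PKG = "extend_path_demo_pkg"  # hypothetical package name

# Build a throwaway directory holding one portion of the package.
root = tempfile.mkdtemp()
portion = os.path.join(root, PKG)
os.mkdir(portion)
open(os.path.join(portion, "__init__.py"), "w").close()

sys.path.append(root)
try:
    original = ["/placeholder/first/portion"]  # stands in for __path__
    extended = extend_path(original, PKG)      # an extended copy
    unchanged = extend_path("not-a-list", PKG) # non-list input comes back as is
finally:
    sys.path.remove(root)
    shutil.rmtree(root)

print(extended)
```

The original list is left alone, and the new entry is appended after the
existing one.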
get_data(package, resource)~
Get a resource from a package.
This is a wrapper for the PEP 302 loader get_data API. The package
argument should be the name of a package, in standard module format
(foo.bar). The resource argument should be in the form of a relative
filename, using ``/`` as the path separator. The parent directory name
``..`` is not allowed, and nor is a rooted name (starting with a ``/``).
The function returns a binary string that is the contents of the
specified resource.
For packages located in the filesystem, which have already been imported,
this is the rough equivalent of:: >
d = os.path.dirname(sys.modules[package].__file__)
data = open(os.path.join(d, resource), 'rb').read()
<
If the package cannot be located or loaded, or it uses a PEP 302 loader
which does not support get_data, then None is returned.
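A minimal sketch of get_data, assuming a temporary package created on the fly;
the package name and the resource file are made up for the example.

```python
import os
import pkgutil
import shutil
import sys
import tempfile

PKG = "get_data_demo_pkg"  # hypothetical package name

# Create a throwaway package containing one resource file.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, PKG)
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "greeting.txt"), "wb") as f:
    f.write(b"hello\n")

sys.path.insert(0, root)
try:
    data = pkgutil.get_data(PKG, "greeting.txt")
    # A package that cannot be located yields None rather than an exception.
    missing = pkgutil.get_data("no_such_package_for_demo", "greeting.txt")
finally:
    sys.path.remove(root)
    sys.modules.pop(PKG, None)
    shutil.rmtree(root)

print(repr(data))
```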
==============================================================================
*py2stdlib-platform*
platform~
:synopsis: Retrieves as much platform identifying data as possible.
.. versionadded:: 2.3
.. note::
Specific platforms listed alphabetically, with Linux included in the Unix
section.
Cross Platform
--------------
architecture(executable=sys.executable, bits='', linkage='')~
Queries the given executable (defaults to the Python interpreter binary) for
various architecture information.
Returns a tuple ``(bits, linkage)`` which contains information about the bit
architecture and the linkage format used for the executable. Both values are
returned as strings.
Values that cannot be determined are returned as given by the parameter presets.
If bits is given as ``''``, the sizeof(pointer) (or
sizeof(long) on Python version < 1.5.2) is used as indicator for the
supported pointer size.
The function relies on the system's file command to do the actual work.
This is available on most if not all Unix platforms and some non-Unix platforms
and then only if the executable points to the Python interpreter. Reasonable
defaults are used when the above needs are not met.
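A short sketch of the call with its defaults, which queries the running
interpreter. The actual values depend on the machine, so none are shown here.

```python
import platform

# Both elements are strings; either may be empty if it cannot be determined.
bits, linkage = platform.architecture()
print("bits=%r linkage=%r" % (bits, linkage))
```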
machine()~
Returns the machine type, e.g. ``'i386'``. An empty string is returned if the
value cannot be determined.
node()~
Returns the computer's network name (may not be fully qualified!). An empty
string is returned if the value cannot be determined.
platform(aliased=0, terse=0)~
Returns a single string identifying the underlying platform with as much useful
information as possible.
The output is intended to be {human readable} rather than machine parseable. It
may look different on different platforms and this is intended.
If {aliased} is true, the function will use aliases for various platforms that
report system names which differ from their common names, for example SunOS will
be reported as Solaris. The system_alias function is used to implement
this.
Setting {terse} to true causes the function to return only the absolute minimum
information needed to identify the platform.
processor()~
Returns the (real) processor name, e.g. ``'amdk6'``.
An empty string is returned if the value cannot be determined. Note that many
platforms do not provide this information or simply return the same value as for
machine. NetBSD does this.
python_build()~
Returns a tuple ``(buildno, builddate)`` stating the Python build number and
date as strings.
python_compiler()~
Returns a string identifying the compiler used for compiling Python.
python_branch()~
Returns a string identifying the Python implementation SCM branch.
.. versionadded:: 2.6
python_implementation()~
Returns a string identifying the Python implementation. Possible return values
are: 'CPython', 'IronPython', 'Jython'.
.. versionadded:: 2.6
python_revision()~
Returns a string identifying the Python implementation SCM revision.
.. versionadded:: 2.6
python_version()~
Returns the Python version as string ``'major.minor.patchlevel'``.
Note that unlike the Python ``sys.version``, the returned value will always
include the patchlevel (it defaults to 0).
python_version_tuple()~
Returns the Python version as tuple ``(major, minor, patchlevel)`` of strings.
Note that unlike the Python ``sys.version``, the returned value will always
include the patchlevel (it defaults to ``'0'``).
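The string and tuple forms carry the same information, as this sketch
illustrates; joining the tuple with dots gives back the version string.

```python
import platform

version = platform.python_version()
parts = platform.python_version_tuple()

# The tuple elements are strings, not integers.
print("version string: %s" % version)
print("version tuple:  %r" % (parts,))
rejoined = ".".join(parts)
```

Convert the elements with ``int()`` before comparing versions numerically.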
release()~
Returns the system's release, e.g. ``'2.2.0'`` or ``'NT'``. An empty string is
returned if the value cannot be determined.
system()~
Returns the system/OS name, e.g. ``'Linux'``, ``'Windows'``, or ``'Java'``. An
empty string is returned if the value cannot be determined.
system_alias(system, release, version)~
Returns ``(system, release, version)`` aliased to common marketing names used
for some systems. It also does some reordering of the information in some cases
where it would otherwise cause confusion.
version()~
Returns the system's release version, e.g. ``'#3 on degas'``. An empty string is
returned if the value cannot be determined.
uname()~
Fairly portable uname interface. Returns a tuple of strings ``(system, node,
release, version, machine, processor)`` identifying the underlying platform.
Note that unlike the os.uname function this also returns possible
processor information as additional tuple entry.
Entries which cannot be determined are set to ``''``.
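As a sketch, the six entries can be unpacked directly; the first and fifth
match what system() and machine() return on their own.

```python
import platform

system, node, release, version, machine, processor = platform.uname()

# Entries that cannot be determined are empty strings.
for label, value in [("system", system), ("node", node), ("release", release),
                     ("version", version), ("machine", machine),
                     ("processor", processor)]:
    print("%-10s %r" % (label, value))
```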
Java Platform
-------------
java_ver(release='', vendor='', vminfo=('','',''), osinfo=('','',''))~
Version interface for Jython.
Returns a tuple ``(release, vendor, vminfo, osinfo)`` with {vminfo} being a
tuple ``(vm_name, vm_release, vm_vendor)`` and {osinfo} being a tuple
``(os_name, os_version, os_arch)``. Values which cannot be determined are set to
the defaults given as parameters (which all default to ``''``).
Windows Platform
----------------
win32_ver(release='', version='', csd='', ptype='')~
Get additional version information from the Windows Registry and return a tuple
``(release, version, csd, ptype)`` referring to OS release, version number, CSD
level and OS type (multi/single processor).
As a hint: {ptype} is ``'Uniprocessor Free'`` on single processor NT machines
and ``'Multiprocessor Free'`` on multi processor machines. The {'Free'} refers
to the OS version being free of debugging code. It could also state {'Checked'}
which means the OS version uses debugging code, i.e. code that checks arguments,
ranges, etc.
.. note:: >
This function works best with Mark Hammond's
win32all package installed, but also on Python 2.3 and
later (support for this was added in Python 2.6). It obviously
only runs on Win32 compatible platforms.
<
Win95/98 specific
popen(cmd, mode='r', bufsize=None)~
Portable popen interface. Find a working popen implementation
preferring win32pipe.popen. On Windows NT, win32pipe.popen
should work; on Windows 9x it hangs due to bugs in the MS C library.
Mac OS Platform
---------------
mac_ver(release='', versioninfo=('','',''), machine='')~
Get Mac OS version information and return it as tuple ``(release, versioninfo,
machine)`` with {versioninfo} being a tuple ``(version, dev_stage,
non_release_version)``.
Entries which cannot be determined are set to ``''``. All tuple entries are
strings.
Documentation for the underlying gestalt API is available online at
http://www.rgaros.nl/gestalt/.
Unix Platforms
--------------
dist(distname='', version='', id='', supported_dists=('SuSE','debian','redhat','mandrake',...))~
This is an old version of the functionality now provided by
linux_distribution. For new code, please use
linux_distribution.
The only difference between the two is that ``dist()`` always
returns the short name of the distribution taken from the
``supported_dists`` parameter.
.. deprecated:: 2.6
linux_distribution(distname='', version='', id='', supported_dists=('SuSE','debian','redhat','mandrake',...), full_distribution_name=1)~
Tries to determine the name of the Linux OS distribution.
``supported_dists`` may be given to define the set of Linux distributions to
look for. It defaults to a list of currently supported Linux distributions
identified by their release file name.
If ``full_distribution_name`` is true (default), the full distribution read
from the OS is returned. Otherwise the short name taken from
``supported_dists`` is used.
Returns a tuple ``(distname,version,id)`` which defaults to the args given as
parameters. ``id`` is the item in parentheses after the version number. It
is usually the version codename.
.. versionadded:: 2.6
libc_ver(executable=sys.executable, lib='', version='', chunksize=2048)~
Tries to determine the libc version against which the file executable (defaults
to the Python interpreter) is linked. Returns a tuple of strings ``(lib,
version)`` which default to the given parameters in case the lookup fails.
Note that this function has intimate knowledge of how different libc versions
add symbols to the executable and is probably only usable for executables
compiled using gcc.
The file is read and scanned in chunks of {chunksize} bytes.
==============================================================================
*py2stdlib-plistlib*
plistlib~
:synopsis: Generate and parse Mac OS X plist files.
.. (harvested from docstrings in the original file)
.. versionchanged:: 2.6
This module was previously only available in the Mac-specific library; it is
now available for all platforms.
.. index::
pair: plist; file
single: property list
This module provides an interface for reading and writing the "property list"
XML files used mainly by Mac OS X.
The property list (``.plist``) file format is a simple XML pickle supporting
basic object types, like dictionaries, lists, numbers and strings. Usually the
top level object is a dictionary.
Values can be strings, integers, floats, booleans, tuples, lists, dictionaries
(but only with string keys), Data or datetime.datetime
objects. String values (including dictionary keys) may be unicode strings --
they will be written out as UTF-8.
The ``<data>`` plist type is supported through the Data class. This is
a thin wrapper around a Python string. Use Data if your strings
contain control characters.
.. seealso::
`PList manual page <http://developer.apple.com/documentation/Darwin/Reference/ManPages/man5/plist.5.html>`_
Apple's documentation of the file format.
This module defines the following functions:
readPlist(pathOrFile)~
Read a plist file. {pathOrFile} may either be a file name or a (readable)
file object. Return the unpacked root object (which usually is a
dictionary).
The XML data is parsed using the Expat parser from xml.parsers.expat (|py2stdlib-xml.parsers.expat|)
-- see its documentation for possible exceptions on ill-formed XML.
Unknown elements will simply be ignored by the plist parser.
writePlist(rootObject, pathOrFile)~
Write {rootObject} to a plist file. {pathOrFile} may either be a file name
or a (writable) file object.
A TypeError will be raised if the object is of an unsupported type or
a container that contains objects of unsupported types.
readPlistFromString(data)~
Read a plist from a string. Return the root object.
writePlistToString(rootObject)~
Return {rootObject} as a plist-formatted string.
readPlistFromResource(path[, restype='plst'[, resid=0]])~
Read a plist from the resource with type {restype} from the resource fork of
{path}. Availability: Mac OS X.
.. note:: >
In Python 3.x, this function has been removed.
<
writePlistToResource(rootObject, path[, restype='plst'[, resid=0]])~
Write {rootObject} as a resource with type {restype} to the resource fork of
{path}. Availability: Mac OS X.
.. note:: >
In Python 3.x, this function has been removed.
<
The following class is available:
Data(data)~
Return a "data" wrapper object around the string {data}. This is used in
functions converting from/to plists to represent the ``<data>`` type
available in plists.
It has one attribute, data, that can be used to retrieve the Python
string stored in it.
Examples
--------
Generating a plist:: >
import datetime
import time
from plistlib import Data, writePlist
pl = dict(
aString="Doodah",
aList=["A", "B", 12, 32.1, [1, 2, 3]],
aFloat = 0.1,
anInt = 728,
aDict=dict(
anotherString="<hello & hi there!>",
aUnicodeValue=u'M\xe4ssig, Ma\xdf',
aTrueValue=True,
aFalseValue=False,
),
someData = Data("<binary gunk>"),
someMoreData = Data("<lots of binary gunk>" * 10),
aDate = datetime.datetime.fromtimestamp(time.mktime(time.gmtime())),
)
# unicode keys are possible, but a little awkward to use:
pl[u'\xc5benraa'] = "That was a unicode key."
writePlist(pl, fileName)
<
Parsing a plist::
pl = readPlist(pathOrFile)
print pl["aKey"]
==============================================================================
*py2stdlib-popen2*
popen2~
:synopsis: Subprocesses with accessible standard I/O streams.
:deprecated:
.. deprecated:: 2.6
This module is obsolete. Use the subprocess (|py2stdlib-subprocess|) module. Check
especially the subprocess-replacements section.
This module allows you to spawn processes and connect to their
input/output/error pipes and obtain their return codes under Unix and Windows.
The subprocess (|py2stdlib-subprocess|) module provides more powerful facilities for spawning new
processes and retrieving their results. Using the subprocess (|py2stdlib-subprocess|) module is
preferable to using the popen2 (|py2stdlib-popen2|) module.
The primary interface offered by this module is a trio of factory functions.
For each of these, if {bufsize} is specified, it specifies the buffer size for
the I/O pipes. {mode}, if provided, should be the string ``'b'`` or ``'t'``; on
Windows this is needed to determine whether the file objects should be opened in
binary or text mode. The default value for {mode} is ``'t'``.
On Unix, {cmd} may be a sequence, in which case arguments will be passed
directly to the program without shell intervention (as with os.spawnv).
If {cmd} is a string it will be passed to the shell (as with os.system).
The only way to retrieve the return codes for the child processes is by using
the poll or wait methods on the Popen3 and
Popen4 classes; these are only available on Unix. This information is
not available when using the popen2 (|py2stdlib-popen2|), popen3, and popen4
functions, or the equivalent functions in the os (|py2stdlib-os|) module. (Note that the
tuples returned by the os (|py2stdlib-os|) module's functions are in a different order
from the ones returned by the popen2 (|py2stdlib-popen2|) module.)
popen2(cmd[, bufsize[, mode]])~
Executes {cmd} as a sub-process. Returns the file objects ``(child_stdout,
child_stdin)``.
popen3(cmd[, bufsize[, mode]])~
Executes {cmd} as a sub-process. Returns the file objects ``(child_stdout,
child_stdin, child_stderr)``.
popen4(cmd[, bufsize[, mode]])~
Executes {cmd} as a sub-process. Returns the file objects
``(child_stdout_and_stderr, child_stdin)``.
.. versionadded:: 2.0
On Unix, a class defining the objects returned by the factory functions is also
available. These are not used for the Windows implementation, and are not
available on that platform.
Popen3(cmd[, capturestderr[, bufsize]])~
This class represents a child process. Normally, Popen3 instances are
created using the popen2 (|py2stdlib-popen2|) and popen3 factory functions described
above.
If not using one of the helper functions to create Popen3 objects, the
parameter {cmd} is the shell command to execute in a sub-process. The
{capturestderr} flag, if true, specifies that the object should capture standard
error output of the child process. The default is false. If the {bufsize}
parameter is specified, it specifies the size of the I/O buffers to/from the
child process.
Popen4(cmd[, bufsize])~
Similar to Popen3, but always captures standard error into the same
file object as standard output. These are typically created using
popen4.
.. versionadded:: 2.0
Popen3 and Popen4 Objects
-------------------------
Instances of the Popen3 and Popen4 classes have the following
methods:
Popen3.poll()~
Returns ``-1`` if child process hasn't completed yet, or its status code
(see wait) otherwise.
Popen3.wait()~
Waits for and returns the status code of the child process. The status code
encodes both the return code of the process and information about whether it
exited using the exit system call or died due to a signal. Functions
to help interpret the status code are defined in the os (|py2stdlib-os|) module; see
section os-process for the W\* family of functions.
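The W\* functions can be combined as in the sketch below (Unix only). It
assumes ``os.system`` as a stand-in source of a status code, since on Unix its
return value uses the same encoding; the helper name is hypothetical.

```python
import os


def describe_status(status):
    """Decode a wait-style status code into a (kind, value) pair."""
    if os.WIFEXITED(status):
        # Normal termination: extract the process's return code.
        return ("exited", os.WEXITSTATUS(status))
    if os.WIFSIGNALED(status):
        # Killed by a signal: extract the signal number.
        return ("signaled", os.WTERMSIG(status))
    return ("other", status)


# The shell exits with return code 3.
result = describe_status(os.system("exit 3"))
print(result)
```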
The following attributes are also available:
Popen3.fromchild~
A file object that provides output from the child process. For Popen4
instances, this will provide both the standard output and standard error
streams.
Popen3.tochild~
A file object that provides input to the child process.
Popen3.childerr~
A file object that provides error output from the child process, if
{capturestderr} was true for the constructor, otherwise ``None``. This will
always be ``None`` for Popen4 instances.
Popen3.pid~
The process ID of the child process.
Flow Control Issues
-------------------
Any time you are working with any form of inter-process communication, control
flow needs to be carefully thought out. This remains the case with the file
objects provided by this module (or the os (|py2stdlib-os|) module equivalents).
When reading output from a child process that writes a lot of data to standard
error while the parent is reading from the child's standard output, a deadlock
can occur. A similar situation can occur with other combinations of reads and
writes. The essential factors are that more than _PC_PIPE_BUF bytes
are being written by one process in a blocking fashion, while the other process
is reading from the first process, also in a blocking fashion.
.. Example explanation and suggested work-arounds substantially stolen
from Martin von Löwis:
http://mail.python.org/pipermail/python-dev/2000-September/009460.html
There are several ways to deal with this situation.
The simplest application change, in many cases, will be to follow this model in
the parent process:: >
import popen2
r, w, e = popen2.popen3('python slave.py')
e.readlines()
r.readlines()
r.close()
e.close()
w.close()
<
with code like this in the child::
import os
import sys
# note that each of these print statements
# writes a single long string
print >>sys.stderr, 400 * 'this is a test\n'
os.close(sys.stderr.fileno())
print >>sys.stdout, 400 * 'this is another test\n'
In particular, note that ``sys.stderr`` must be closed after writing all data,
or readlines won't return. Also note that os.close must be
used, as ``sys.stderr.close()`` won't close ``stderr`` (otherwise assigning to
``sys.stderr`` will silently close it, so no further errors can be printed).
Applications which need to support a more general approach should integrate I/O
over pipes with their select (|py2stdlib-select|) loops, or use separate threads to read each
of the individual files provided by whichever popen\* function or
Popen\* class was used.
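Since this module is obsolete in favour of subprocess (|py2stdlib-subprocess|), the same
parent and child exchange can be sketched with ``subprocess.Popen``. Its
``communicate`` method collects standard output and standard error together,
so the parent does not block on one pipe while the child fills the other. The
inline child program is an assumption standing in for ``slave.py``.

```python
import subprocess
import sys

# Child program: writes a large block to stderr, then another to stdout.
child_code = (
    "import sys\n"
    "sys.stderr.write(400 * 'this is a test\\n')\n"
    "sys.stdout.write(400 * 'this is another test\\n')\n"
)

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
# communicate() drains both pipes and then waits for the child to exit.
out, err = proc.communicate()

print("exit status: %d" % proc.returncode)
print("stdout lines: %d, stderr lines: %d" % (out.count(b"\n"), err.count(b"\n")))
```

With this approach the child does not need to close its standard error stream
explicitly.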
.. seealso::
Module subprocess (|py2stdlib-subprocess|)
Module for spawning and managing subprocesses.
==============================================================================
*py2stdlib-poplib*
poplib~
:synopsis: POP3 protocol client (requires sockets).
.. revised by ESR, January 2000
.. index:: pair: POP3; protocol
This module defines a class, POP3, which encapsulates a connection to a
POP3 server and implements the protocol as defined in RFC 1725. The
POP3 class supports both the minimal and optional command sets.
Additionally, this module provides a class POP3_SSL, which provides
support for connecting to POP3 servers that use SSL as an underlying protocol
layer.
Note that POP3, though widely supported, is obsolescent. The implementation
quality of POP3 servers varies widely, and too many are quite poor. If your
mailserver supports IMAP, you would be better off using the
imaplib.IMAP4 class, as IMAP servers tend to be better implemented.
A single class is provided by the poplib (|py2stdlib-poplib|) module:
POP3(host[, port[, timeout]])~
This class implements the actual POP3 protocol. The connection is created when
the instance is initialized. If {port} is omitted, the standard POP3 port (110)
is used. The optional {timeout} parameter specifies a timeout in seconds for the
connection attempt (if not specified, the global default timeout setting will
be used).
.. versionchanged:: 2.6
{timeout} was added.
POP3_SSL(host[, port[, keyfile[, certfile]]])~
This is a subclass of POP3 that connects to the server over an SSL
encrypted socket. If {port} is not specified, 995, the standard POP3-over-SSL
port, is used. {keyfile} and {certfile} are also optional - they can contain a
PEM formatted private key and certificate chain file for the SSL connection.
.. versionadded:: 2.4
One exception is defined as an attribute of the poplib (|py2stdlib-poplib|) module:
error_proto~
Exception raised on any errors from this module (errors from socket (|py2stdlib-socket|)
module are not caught). The reason for the exception is passed to the
constructor as a string.
.. seealso::
Module imaplib (|py2stdlib-imaplib|)
The standard Python IMAP module.
`Frequently Asked Questions About Fetchmail <http://www.catb.org/~esr/fetchmail/fetchmail-FAQ.html>`_
The FAQ for the fetchmail POP/IMAP client collects information on
POP3 server variations and RFC noncompliance that may be useful if you need to
write an application based on the POP protocol.
POP3 Objects
------------
All POP3 commands are represented by methods of the same name, in lower-case;
most return the response text sent by the server.
A POP3 instance has the following methods:
POP3.set_debuglevel(level)~
Set the instance's debugging level. This controls the amount of debugging
output printed. The default, ``0``, produces no debugging output. A value of
``1`` produces a moderate amount of debugging output, generally a single line
per request. A value of ``2`` or higher produces the maximum amount of
debugging output, logging each line sent and received on the control connection.
POP3.getwelcome()~
Returns the greeting string sent by the POP3 server.
POP3.user(username)~
Send user command; the response should indicate that a password is required.
POP3.pass_(password)~
Send password; the response includes message count and mailbox size. Note: the
mailbox on the server is locked until quit is called.
POP3.apop(user, secret)~
Use the more secure APOP authentication to log into the POP3 server.
POP3.rpop(user)~
Use RPOP authentication (similar to UNIX r-commands) to log into POP3 server.
POP3.stat()~
Get mailbox status. The result is a tuple of 2 integers: ``(message count,
mailbox size)``.
POP3.list([which])~
Request message list, result is in the form ``(response, ['mesg_num octets',
...], octets)``. If {which} is set, it is the message to list.
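The ``'mesg_num octets'`` entries are plain text and usually need splitting
before use. The helper below is hypothetical, and the sample response is made
up to match the documented shape rather than taken from a real server.

```python
def parse_listing(lines):
    """Turn ['mesg_num octets', ...] into [(mesg_num, octets), ...]."""
    result = []
    for line in lines:
        num, octets = line.split()
        result.append((int(num), int(octets)))
    return result


# Made-up sample in the form returned by POP3.list().
sample = ("+OK 2 messages", ["1 120", "2 3400"], 16)
response, listing, total = sample
sizes = parse_listing(listing)
print(sizes)
```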
POP3.retr(which)~
Retrieve whole message number {which}, and set its seen flag. Result is in form
``(response, ['line', ...], octets)``.
POP3.dele(which)~
Flag message number {which} for deletion. On most servers deletions are not
actually performed until QUIT (the major exception is Eudora QPOP, which
deliberately violates the RFCs by doing pending deletes on any disconnect).
POP3.rset()~
Remove any deletion marks for the mailbox.
POP3.noop()~
Do nothing. Might be used as a keep-alive.
POP3.quit()~
Signoff: commit changes, unlock mailbox, drop connection.
POP3.top(which, howmuch)~
Retrieves the message header plus {howmuch} lines of the message after the
header of message number {which}. Result is in form ``(response, ['line', ...],
octets)``.
The POP3 TOP command this method uses, unlike the RETR command, doesn't set the
message's seen flag; unfortunately, TOP is poorly specified in the RFCs and is
frequently broken in off-brand servers. Test this method by hand against the
POP3 servers you will use before trusting it.
POP3.uidl([which])~
Return message digest (unique id) list. If {which} is specified, the result contains
the unique id for that message in the form ``'response mesgnum uid'``, otherwise
the result is a list ``(response, ['mesgnum uid', ...], octets)``.
Instances of POP3_SSL have no additional methods. The interface of this
subclass is identical to its parent.
POP3 Example
------------
Here is a minimal example (without error checking) that opens a mailbox and
retrieves and prints all messages:: >
import getpass, poplib
M = poplib.POP3('localhost')
M.user(getpass.getuser())
M.pass_(getpass.getpass())
numMessages = len(M.list()[1])
for i in range(numMessages):
    for j in M.retr(i+1)[1]:
        print j
<
At the end of the module, there is a test section that contains a more extensive
example of usage.
==============================================================================
*py2stdlib-posix*
posix~
:platform: Unix
:synopsis: The most common POSIX system calls (normally used via module os).
This module provides access to operating system functionality that is
standardized by the C Standard and the POSIX standard (a thinly disguised Unix
interface).
.. index:: module: os
{Do not import this module directly.} Instead, import the module os (|py2stdlib-os|),
which provides a {portable} version of this interface. On Unix, the os (|py2stdlib-os|)
module provides a superset of the posix (|py2stdlib-posix|) interface. On non-Unix operating
systems the posix (|py2stdlib-posix|) module is not available, but a subset is always
available through the os (|py2stdlib-os|) interface. Once os (|py2stdlib-os|) is imported, there is
{no} performance penalty in using it instead of posix (|py2stdlib-posix|). In addition,
os (|py2stdlib-os|) provides some additional functionality, such as automatically calling
putenv when an entry in ``os.environ`` is changed.
Errors are reported as exceptions; the usual exceptions are given for type
errors, while errors reported by the system calls raise OSError.
Large File Support
------------------
.. index::
single: large files
single: file; large files
Several operating systems (including AIX, HP-UX, Irix and Solaris) provide
support for files that are larger than 2 GB from a C programming model where
int and long are 32-bit values. This is typically accomplished
by defining the relevant size and offset types as 64-bit values. Such files are
sometimes referred to as large files.
Large file support is enabled in Python when the size of an off_t is
larger than a long and the long long type is available and is
at least as large as an off_t. Python longs are then used to represent
file sizes, offsets and other values that can exceed the range of a Python int.
It may be necessary to configure and compile Python with certain compiler flags
to enable this mode. For example, it is enabled by default with recent versions
of Irix, but with Solaris 2.6 and 2.7 you need to do something like:: >
CFLAGS="`getconf LFS_CFLAGS`" OPT="-g -O2 $CFLAGS" \
./configure
<
On large-file-capable Linux systems, this might work::
CFLAGS='-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' OPT="-g -O2 $CFLAGS" \
./configure
Notable Module Contents
-----------------------
In addition to many functions described in the os (|py2stdlib-os|) module documentation,
posix (|py2stdlib-posix|) defines the following data item:
environ~
A dictionary representing the string environment at the time the interpreter
was started. For example, ``environ['HOME']`` is the pathname of your home
directory, equivalent to ``getenv("HOME")`` in C.
Modifying this dictionary does not affect the string environment passed on by
execv, popen or system; if you need to change the
environment, pass ``environ`` to execve or add variable assignments and
export statements to the command string for system or popen.
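By contrast, a change made through ``os.environ`` is passed on to child
processes, because the os (|py2stdlib-os|) module calls putenv on assignment. The sketch
below assumes a made-up variable name and uses subprocess (|py2stdlib-subprocess|) to start
the child.

```python
import os
import subprocess
import sys

VAR = "PY2STDLIB_ENVIRON_DEMO"  # hypothetical variable name

# Assigning through os.environ also updates the process environment.
os.environ[VAR] = "hello"
try:
    child = "import os, sys; sys.stdout.write(os.environ.get(%r, 'unset'))" % VAR
    proc = subprocess.Popen([sys.executable, "-c", child],
                            stdout=subprocess.PIPE)
    seen_by_child = proc.communicate()[0]
finally:
    del os.environ[VAR]

print(repr(seen_by_child))
```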
.. note:: >
The os (|py2stdlib-os|) module provides an alternate implementation of ``environ`` which
updates the environment on modification. Note also that updating ``os.environ``
will render this dictionary obsolete. Use of the os (|py2stdlib-os|) module version of
this is recommended over direct access to the posix (|py2stdlib-posix|) module.
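As a rough illustration of the note above, the sketch below changes the
environment through ``os.environ`` and checks that a child process sees the
change. The variable name ``PY2STDLIB_DEMO`` is made up for this example.

```python
import os
import subprocess
import sys

# Hypothetical variable name, chosen only for this illustration.
os.environ['PY2STDLIB_DEMO'] = 'hello'

# The child process inherits the modified environment, because os.environ
# updates the real process environment when an entry is changed.
child_code = "import os; print(os.environ['PY2STDLIB_DEMO'])"
seen_by_child = subprocess.check_output([sys.executable, '-c', child_code])
print(seen_by_child.strip())
```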
==============================================================================
*py2stdlib-posixfile*
posixfile~
:platform: Unix
:synopsis: A file-like object with support for locking.
:deprecated:
.. index:: pair: POSIX; file object
.. deprecated:: 1.5
The locking operation that this module provides is done better and more portably
by the fcntl.lockf call.
.. index:: single: fcntl() (in module fcntl)
This module implements some additional functionality over the built-in file
objects. In particular, it implements file locking, control over the file
flags, and an easy interface to duplicate the file object. The module defines a
new file object, the posixfile object. It has all the standard file object
methods and adds the methods described below. This module only works for
certain flavors of Unix, since it uses fcntl.fcntl for file locking.
To instantiate a posixfile object, use the posixfile.open function. The
resulting object looks and feels roughly the same as a standard file object.
The posixfile (|py2stdlib-posixfile|) module defines the following constants:
SEEK_SET~
Offset is calculated from the start of the file.
SEEK_CUR~
Offset is calculated from the current position in the file.
SEEK_END~
Offset is calculated from the end of the file.
The posixfile (|py2stdlib-posixfile|) module defines the following functions:
open(filename[, mode[, bufsize]])~
Create a new posixfile object with the given filename and mode. The {filename},
{mode} and {bufsize} arguments are interpreted the same way as by the built-in
open function.
fileopen(fileobject)~
Create a new posixfile object with the given standard file object. The resulting
object has the same filename and mode as the original file object.
The posixfile object defines the following additional methods:
posixfile.lock(fmt, [len[, start[, whence]]])~
Lock the specified section of the file that the file object is referring to.
The format is explained below in a table. The {len} argument specifies the
length of the section that should be locked. The default is ``0``. {start}
specifies the starting offset of the section, where the default is ``0``. The
{whence} argument specifies where the offset is relative to. It accepts one of
the constants SEEK_SET, SEEK_CUR or SEEK_END. The
default is SEEK_SET. For more information about the arguments refer to
the fcntl(2) manual page on your system.
posixfile.flags([flags])~
Set the specified flags for the file that the file object is referring to. The
new flags are ORed with the old flags, unless specified otherwise. The format
is explained below in a table. Without the {flags} argument a string indicating
the current flags is returned (this is the same as the ``?`` modifier). For
more information about the flags refer to the fcntl(2) manual page on
your system.
posixfile.dup()~
Duplicate the file object and the underlying file pointer and file descriptor.
The resulting object behaves as if it were newly opened.
posixfile.dup2(fd)~
Duplicate the file object and the underlying file pointer and file descriptor.
The new object will have the given file descriptor. Otherwise the resulting
object behaves as if it were newly opened.
posixfile.file()~
Return the standard file object that the posixfile object is based on. This is
sometimes necessary for functions that insist on a standard file object.
All methods raise IOError when the request fails.
Format characters for the lock method have the following meaning:
+--------+-----------------------------------------------+
| Format | Meaning |
+========+===============================================+
| ``u`` | unlock the specified region |
+--------+-----------------------------------------------+
| ``r`` | request a read lock for the specified section |
+--------+-----------------------------------------------+
| ``w`` | request a write lock for the specified |
| | section |
+--------+-----------------------------------------------+
In addition the following modifiers can be added to the format:
+----------+--------------------------------+-------+
| Modifier | Meaning | Notes |
+==========+================================+=======+
| ``|`` | wait until the lock has been | |
| | granted | |
+----------+--------------------------------+-------+
| ``?`` | return the first lock | \(1) |
| | conflicting with the requested | |
| | lock, or ``None`` if there is | |
| | no conflict. | |
+----------+--------------------------------+-------+
Note:
(1)
The lock returned is in the format ``(mode, len, start, whence, pid)`` where
{mode} is a character representing the type of lock ('r' or 'w'). This modifier
prevents a request from being granted; it is for query purposes only.
Format characters for the flags method have the following meanings:
+--------+-----------------------------------------------+
| Format | Meaning |
+========+===============================================+
| ``a`` | append only flag |
+--------+-----------------------------------------------+
| ``c`` | close on exec flag |
+--------+-----------------------------------------------+
| ``n`` | no delay flag (also called non-blocking flag) |
+--------+-----------------------------------------------+
| ``s`` | synchronization flag |
+--------+-----------------------------------------------+
In addition the following modifiers can be added to the format:
+----------+---------------------------------+-------+
| Modifier | Meaning | Notes |
+==========+=================================+=======+
| ``!`` | turn the specified flags 'off', | \(1) |
| | instead of the default 'on' | |
+----------+---------------------------------+-------+
| ``=`` | replace the flags, instead of | \(1) |
| | the default 'OR' operation | |
+----------+---------------------------------+-------+
| ``?`` | return a string in which the | \(2) |
| | characters represent the flags | |
| | that are set. | |
+----------+---------------------------------+-------+
Notes:
(1)
The ``!`` and ``=`` modifiers are mutually exclusive.
(2)
This string represents the flags after they may have been altered by the same
call.
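Examples:: >
import posixfile
# Open the file through posixfile to get the extra locking methods.
f = posixfile.open('/tmp/test', 'w')
# Request a write lock; the '|' modifier waits until it has been granted.
f.lock('w|')
...
# Release the lock, then close the file.
f.lock('u')
f.close()
<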
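Since the locking offered by this module is done better by fcntl.lockf, a
roughly equivalent sequence without posixfile might look like the sketch below.
It assumes a Unix platform and uses a temporary file in place of ``/tmp/test``;
it is an illustration, not a drop-in replacement for every posixfile feature.

```python
import fcntl
import os
import tempfile

# A scratch file, so the sketch does not touch any real path.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'w') as f:
    # Exclusive (write) lock on the whole file; this blocks until the lock
    # is granted, roughly what lock('w|') does.
    fcntl.lockf(f, fcntl.LOCK_EX)
    f.write('protected data\n')
    f.flush()
    # Release the lock, roughly what lock('u') does.
    fcntl.lockf(f, fcntl.LOCK_UN)

with open(path) as f:
    contents = f.read()
os.remove(path)
```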
==============================================================================
*py2stdlib-pprint*
pprint~
:synopsis: Data pretty printer.
The pprint (|py2stdlib-pprint|) module provides a capability to "pretty-print" arbitrary
Python data structures in a form which can be used as input to the interpreter.
If the formatted structures include objects which are not fundamental Python
types, the representation may not be loadable. This may be the case if objects
such as files, sockets, classes, or instances are included, as well as many
other built-in objects which are not representable as Python constants.
The formatted representation keeps objects on a single line if it can, and
breaks them onto multiple lines if they don't fit within the allowed width.
Construct PrettyPrinter objects explicitly if you need to adjust the
width constraint.
.. versionchanged:: 2.5
Dictionaries are sorted by key before the display is computed; before 2.5, a
dictionary was sorted only if its display required more than one line, although
that wasn't documented.
.. versionchanged:: 2.6
Added support for set and frozenset.
The pprint (|py2stdlib-pprint|) module defines one class:
.. First the implementation class:
PrettyPrinter(...)~
Construct a PrettyPrinter instance. This constructor understands
several keyword parameters. An output stream may be set using the {stream}
keyword; the only method used on the stream object is the file protocol's
write method. If not specified, the PrettyPrinter adopts
``sys.stdout``. Three additional parameters may be used to control the
formatted representation. The keywords are {indent}, {depth}, and {width}. The
amount of indentation added for each recursive level is specified by {indent};
the default is one. Other values can cause output to look a little odd, but can
make nesting easier to spot. The number of levels which may be printed is
controlled by {depth}; if the data structure being printed is too deep, the next
contained level is replaced by ``...``. By default, there is no constraint on
the depth of the objects being formatted. The desired output width is
constrained using the {width} parameter; the default is 80 characters. If a
structure cannot be formatted within the constrained width, a best effort will
be made.
>>> import pprint
>>> stuff = ['spam', 'eggs', 'lumberjack', 'knights', 'ni']
>>> stuff.insert(0, stuff[:])
>>> pp = pprint.PrettyPrinter(indent=4)
>>> pp.pprint(stuff)
[ ['spam', 'eggs', 'lumberjack', 'knights', 'ni'],
'spam',
'eggs',
'lumberjack',
'knights',
'ni']
>>> tup = ('spam', ('eggs', ('lumberjack', ('knights', ('ni', ('dead',
... ('parrot', ('fresh fruit',))))))))
>>> pp = pprint.PrettyPrinter(depth=6)
>>> pp.pprint(tup)
('spam', ('eggs', ('lumberjack', ('knights', ('ni', ('dead', (...)))))))
The PrettyPrinter class supports several derivative functions:
.. Now the derivative functions:
pformat(object[, indent[, width[, depth]]])~
Return the formatted representation of {object} as a string. {indent}, {width}
and {depth} will be passed to the PrettyPrinter constructor as
formatting parameters.
.. versionchanged:: 2.4
The parameters {indent}, {width} and {depth} were added.
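A minimal sketch of pformat, using a small made-up dictionary. The narrow
{width} is chosen only to force the result onto several lines.

```python
import pprint

data = {'b': [1, 2, 3], 'a': ('x', 'y')}

# width=20 is narrower than the one-line repr, so the result is wrapped.
text = pprint.pformat(data, width=20)
print(text)

# Only fundamental types are involved, so the string can be evaluated back
# into an equal object.
rebuilt = eval(text)
print(rebuilt == data)
```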
pprint(object[, stream[, indent[, width[, depth]]]])~
Prints the formatted representation of {object} on {stream}, followed by a
newline. If {stream} is omitted, ``sys.stdout`` is used. This may be used in
the interactive interpreter instead of a print statement for
inspecting values. {indent}, {width} and {depth} will be passed to the
PrettyPrinter constructor as formatting parameters.
>>> import pprint
>>> stuff = ['spam', 'eggs', 'lumberjack', 'knights', 'ni']
>>> stuff.insert(0, stuff)
>>> pprint.pprint(stuff)
[<Recursion on list with id=...>,
'spam',
'eggs',
'lumberjack',
'knights',
'ni']
.. versionchanged:: 2.4
The parameters {indent}, {width} and {depth} were added.
isreadable(object)~
.. index:: builtin: eval
Determine if the formatted representation of {object} is "readable," or can be
used to reconstruct the value using eval. This always returns ``False``
for recursive objects.
>>> pprint.isreadable(stuff)
False
isrecursive(object)~
Determine if {object} requires a recursive representation.
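The two predicates can be compared side by side. The sketch below uses two
small made-up lists, one of which contains itself.

```python
import pprint

plain = ['spam', ('eggs', 1)]
looped = ['spam', 'eggs']
looped.append(looped)   # the list now contains itself

# A plain structure is readable and not recursive; a self-referencing one
# is recursive and therefore not readable.
print(pprint.isrecursive(plain))
print(pprint.isrecursive(looped))
print(pprint.isreadable(plain))
print(pprint.isreadable(looped))
```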
One more support function is also defined:
saferepr(object)~
Return a string representation of {object}, protected against recursive data
structures. If the representation of {object} exposes a recursive entry, the
recursive reference will be represented as ``<Recursion on typename with
id=number>``. The representation is not otherwise formatted.
>>> pprint.saferepr(stuff)
"[<Recursion on list with id=...>, 'spam', 'eggs', 'lumberjack', 'knights', 'ni']"
PrettyPrinter Objects
---------------------
PrettyPrinter instances have the following methods:
PrettyPrinter.pformat(object)~
Return the formatted representation of {object}. This takes into account the
options passed to the PrettyPrinter constructor.
PrettyPrinter.pprint(object)~
Print the formatted representation of {object} on the configured stream,
followed by a newline.
The following methods provide the implementations for the corresponding
functions of the same names. Using these methods on an instance is slightly
more efficient since new PrettyPrinter objects don't need to be
created.
PrettyPrinter.isreadable(object)~
.. index:: builtin: eval
Determine if the formatted representation of the object is "readable," or can be
used to reconstruct the value using eval. Note that this returns
``False`` for recursive objects. If the {depth} parameter of the
PrettyPrinter is set and the object is deeper than allowed, this
returns ``False``.
PrettyPrinter.isrecursive(object)~
Determine if the object requires a recursive representation.
This method is provided as a hook to allow subclasses to modify the way objects
are converted to strings. The default implementation uses the internals of the
saferepr implementation.
PrettyPrinter.format(object, context, maxlevels, level)~
Returns three values: the formatted version of {object} as a string, a flag
indicating whether the result is readable, and a flag indicating whether
recursion was detected. The first argument is the object to be presented. The
second is a dictionary which contains the id of objects that are part of
the current presentation context (direct and indirect containers for {object}
that are affecting the presentation) as the keys; if an object needs to be
presented which is already represented in {context}, the third return value
should be ``True``. Recursive calls to the format method should add
additional entries for containers to this dictionary. The third argument,
{maxlevels}, gives the requested limit to recursion; this will be ``0`` if there
is no requested limit. This argument should be passed unmodified to recursive
calls. The fourth argument, {level}, gives the current level; recursive calls
should be passed a value less than that of the current call.
.. versionadded:: 2.3
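A sketch of how a subclass might use this hook. ``HexPrinter`` is a
hypothetical class written for this illustration; it renders integers in
hexadecimal and defers to the default implementation for everything else.
Whether items nested inside a container that fits on one line are also routed
through the overridden method may depend on the Python version, so the sketch
only formats a top-level value.

```python
import pprint

class HexPrinter(pprint.PrettyPrinter):
    """Hypothetical subclass that shows integers in hexadecimal."""

    def format(self, obj, context, maxlevels, level):
        if isinstance(obj, int) and not isinstance(obj, bool):
            # Return (formatted string, is readable, recursion detected).
            return hex(obj), True, False
        # Defer to the default implementation for everything else.
        return pprint.PrettyPrinter.format(self, obj, context, maxlevels, level)

printer = HexPrinter()
print(printer.pformat(255))
print(printer.pformat('spam'))
```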
pprint Example
--------------
This example demonstrates several uses of the pprint (|py2stdlib-pprint|) function and its parameters.
>>> import pprint
>>> tup = ('spam', ('eggs', ('lumberjack', ('knights', ('ni', ('dead',
... ('parrot', ('fresh fruit',))))))))
>>> stuff = ['a' * 10, tup, ['a' * 30, 'b' * 30], ['c' * 20, 'd' * 20]]
>>> pprint.pprint(stuff)
['aaaaaaaaaa',
('spam',
('eggs',
('lumberjack',
('knights', ('ni', ('dead', ('parrot', ('fresh fruit',)))))))),
['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'],
['cccccccccccccccccccc', 'dddddddddddddddddddd']]
>>> pprint.pprint(stuff, depth=3)
['aaaaaaaaaa',
('spam', ('eggs', (...))),
['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'],
['cccccccccccccccccccc', 'dddddddddddddddddddd']]
>>> pprint.pprint(stuff, width=60)
['aaaaaaaaaa',
('spam',
('eggs',
('lumberjack',
('knights',
('ni', ('dead', ('parrot', ('fresh fruit',)))))))),
['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa',
'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'],
['cccccccccccccccccccc', 'dddddddddddddddddddd']]
==============================================================================
*py2stdlib-profile*
profile~
:synopsis: Python source profiler.
.. index:: single: InfoSeek Corporation
Copyright © 1994, by InfoSeek Corporation, all rights reserved.
Written by James Roskind. [#]_
Permission to use, copy, modify, and distribute this Python software and its
associated documentation for any purpose (subject to the restriction in the
following sentence) without fee is hereby granted, provided that the above
copyright notice appears in all copies, and that both that copyright notice and
this permission notice appear in supporting documentation, and that the name of
InfoSeek not be used in advertising or publicity pertaining to distribution of
the software without specific, written prior permission. This permission is
explicitly restricted to the copying and modification of the software to remain
in Python, compiled Python, or other languages (such as C) wherein the modified
or derived code is exclusively imported into a Python module.
INFOSEEK CORPORATION DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT
SHALL INFOSEEK CORPORATION BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL
DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING
OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
Introduction to the profilers
=============================
.. index::
single: deterministic profiling
single: profiling, deterministic
A profiler is a program that describes the run time performance
of a program, providing a variety of statistics. This documentation
describes the profiler functionality provided in the modules
cProfile (|py2stdlib-cprofile|), profile (|py2stdlib-profile|) and pstats (|py2stdlib-pstats|). This profiler
provides deterministic profiling of Python programs. It also
provides a series of report generation tools to allow users to rapidly
examine the results of a profile operation.
The Python standard library provides three different profilers:
#. cProfile (|py2stdlib-cprofile|) is recommended for most users; it's a C extension
with reasonable overhead that makes it suitable for profiling long-running
programs. Based on lsprof, contributed by Brett Rosen and Ted Czotter.
.. versionadded:: 2.5
#. profile (|py2stdlib-profile|), a pure Python module whose interface is imitated by
cProfile (|py2stdlib-cprofile|). Adds significant overhead to profiled programs.
If you're trying to extend
the profiler in some way, the task might be easier with this module.
Copyright © 1994, by InfoSeek Corporation.
.. versionchanged:: 2.4
Now also reports the time spent in calls to built-in functions and methods.
#. hotshot (|py2stdlib-hotshot|) was an experimental C module that focused on minimizing
the overhead of profiling, at the expense of longer data
post-processing times. It is no longer maintained and may be
dropped in a future version of Python.
.. versionchanged:: 2.5
The results should be more meaningful than in the past: the timing core
contained a critical bug.
The profile (|py2stdlib-profile|) and cProfile (|py2stdlib-cprofile|) modules export the same interface, so
they are mostly interchangeable; cProfile (|py2stdlib-cprofile|) has a much lower overhead but
is newer and might not be available on all systems.
cProfile (|py2stdlib-cprofile|) is really a compatibility layer on top of the internal
_lsprof module. The hotshot (|py2stdlib-hotshot|) module is reserved for specialized
usage.
Instant User's Manual
=====================
This section is provided for users that "don't want to read the manual." It
provides a very brief overview, and allows a user to rapidly perform profiling
on an existing application.
To profile an application with a main entry point of foo, you would add
the following to your module:: >
import cProfile
cProfile.run('foo()')
<
(Use profile (|py2stdlib-profile|) instead of cProfile (|py2stdlib-cprofile|) if the latter is not available on
your system.)
The above action would cause foo to be run, and a series of informative
lines (the profile) to be printed. The above approach is most useful when
working with the interpreter. If you would like to save the results of a
profile into a file for later examination, you can supply a file name as the
second argument to the run function:: >
import cProfile
cProfile.run('foo()', 'fooprof')
<
The file cProfile.py can also be invoked as a script to profile another
script. For example:: >
python -m cProfile myscript.py
<
cProfile.py accepts two optional arguments on the command line:: >
cProfile.py [-o output_file] [-s sort_order]
<
``-s`` only applies to standard output (that is, when ``-o`` is not supplied).
Look in the Stats documentation for valid sort values.
When you wish to review the profile, you should use the methods in the
pstats (|py2stdlib-pstats|) module. Typically you would load the statistics data as follows:: >
import pstats
p = pstats.Stats('fooprof')
<
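The steps so far can be put together in one self-contained sketch. Here ``foo``
is a stand-in for the application's entry point and a temporary file plays the
role of ``fooprof``. The sketch uses the runcall and dump_stats methods of
cProfile.Profile rather than ``cProfile.run('foo()', 'fooprof')`` so that it
does not depend on ``foo`` being defined in ``__main__``.

```python
import cProfile
import os
import pstats
import tempfile

def foo():
    # Stand-in for the application's main entry point.
    return sum(i * i for i in range(10000))

# Temporary file playing the role of 'fooprof'.
fd, prof_path = tempfile.mkstemp()
os.close(fd)

profiler = cProfile.Profile()
profiler.runcall(foo)            # profile a single call to foo()
profiler.dump_stats(prof_path)   # save the raw statistics to the file

p = pstats.Stats(prof_path)      # load them back for analysis
p.strip_dirs().sort_stats('cumulative').print_stats(5)
os.remove(prof_path)
```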
The class Stats (the above code just created an instance of this class)
has a variety of methods for manipulating and printing the data that was just
read into ``p``. When you ran cProfile.run above, what was printed was
the result of three method calls:: >
p.strip_dirs().sort_stats(-1).print_stats()
<
The first method removed the extraneous path from all the module names. The
second method sorted all the entries according to the standard module/line/name
string that is printed. The third method printed out all the statistics. You
might try the following sort calls:: >
p.sort_stats('name')
p.print_stats()
<
The first call will actually sort the list by function name, and the second call
will print out the statistics. The following are some interesting calls to
experiment with:: >
p.sort_stats('cumulative').print_stats(10)
<
This sorts the profile by cumulative time in a function, and then only prints
the ten most significant lines. If you want to understand what algorithms are
taking time, the above line is what you would use.
If you were looking to see what functions were looping a lot, and taking a lot
of time, you would do:: >
p.sort_stats('time').print_stats(10)
<
to sort according to time spent within each function, and then print the
statistics for the top ten functions.
You might also try:: >
p.sort_stats('file').print_stats('__init__')
<
This will sort all the statistics by file name, and then print out statistics
for only the class init methods (since they are spelled with ``__init__`` in
them). As one final example, you could try:: >
p.sort_stats('time', 'cum').print_stats(.5, 'init')
<
This line sorts statistics with a primary key of time, and a secondary key of
cumulative time, and then prints out some of the statistics. To be specific, the
list is first culled down to 50% (re: ``.5``) of its original size, then only
lines containing ``init`` are maintained, and that sub-sub-list is printed.
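The sort and restriction calls can be tried without a profile file by handing
a cProfile.Profile object directly to Stats and sending the report to a string
buffer. The function names below are made up for this sketch, and an integer
restriction is used instead of ``.5`` to keep the output predictable.

```python
import cProfile
import pstats
try:
    from StringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO         # Python 3

def init_cache():
    return [n * n for n in range(2000)]

def init_index():
    return sorted(range(2000), reverse=True)

def main():
    init_cache()
    init_index()

profiler = cProfile.Profile()
profiler.runcall(main)

report = StringIO()
p = pstats.Stats(profiler, stream=report)
# Primary key: internal time; secondary key: cumulative time.
# The integer restriction keeps only the first three lines of the list.
p.strip_dirs().sort_stats('time', 'cumulative').print_stats(3)
text = report.getvalue()
print(text)
```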
If you wondered what functions called the above functions, you could now (``p``
is still sorted according to the last criteria) do:: >
p.print_callers(.5, 'init')
<
and you would get a list of callers for each of the listed functions.
If you want more functionality, you're going to have to read the manual, or
guess what the following functions do:: >
p.print_callees()
p.add('fooprof')
<
Invoked as a script, the pstats (|py2stdlib-pstats|) module is a statistics browser for
reading and examining profile dumps. It has a simple line-oriented interface
(implemented using cmd (|py2stdlib-cmd|)) and interactive help.
What Is Deterministic Profiling?
================================
Deterministic profiling is meant to reflect the fact that all {function call},
{function return}, and {exception} events are monitored, and precise
timings are made for the intervals between these events (during which time the
user's code is executing). In contrast, statistical profiling (which is
not done by this module) randomly samples the effective instruction pointer, and
deduces where time is being spent. The latter technique traditionally involves
less overhead (as the code does not need to be instrumented), but provides only
relative indications of where time is being spent.
In Python, since there is an interpreter active during execution, the presence
of instrumented code is not required to do deterministic profiling. Python
automatically provides a hook (optional callback) for each event. In
addition, the interpreted nature of Python tends to add so much overhead to
execution, that deterministic profiling tends to only add small processing
overhead in typical applications. The result is that deterministic profiling is
not that expensive, yet provides extensive run time statistics about the
execution of a Python program.
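The hook mentioned above is exposed as sys.setprofile. The sketch below is not
how the profilers themselves are implemented; it only counts ``call`` events
for a small made-up recursive function, to show that every call is observed
rather than sampled.

```python
import sys

call_counts = {}

def count_calls(frame, event, arg):
    # Invoked by the interpreter for every profiling event.
    if event == 'call':
        name = frame.f_code.co_name
        call_counts[name] = call_counts.get(name, 0) + 1

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

sys.setprofile(count_calls)
try:
    fib(5)
finally:
    sys.setprofile(None)   # always remove the hook again

print(call_counts['fib'])
```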
Call count statistics can be used to identify bugs in code (surprising counts),
and to identify possible inline-expansion points (high call counts). Internal
time statistics can be used to identify "hot loops" that should be carefully
optimized. Cumulative time statistics should be used to identify high level
errors in the selection of algorithms. Note that the unusual handling of
cumulative times in this profiler allows statistics for recursive
implementations of algorithms to be directly compared to iterative
implementations.
Reference Manual -- profile (|py2stdlib-profile|) and cProfile (|py2stdlib-cprofile|)
======================================================
==============================================================================
*py2stdlib-pstats*
pstats~
:synopsis: Statistics object for use with the profiler.
Stats(filename[, stream=sys.stdout[, ...]])~
This class constructor creates an instance of a "statistics object" from a
{filename} (or set of filenames). Stats objects are manipulated by
methods, in order to print useful reports. You may specify an alternate output
stream by giving the keyword argument, ``stream``.
The file selected by the above constructor must have been created by the
corresponding version of profile (|py2stdlib-profile|) or cProfile (|py2stdlib-cprofile|). To be specific,
there is {no} file compatibility guaranteed with future versions of this
profiler, and there is no compatibility with files produced by other profilers.
If several files are provided, all the statistics for identical functions will
be coalesced, so that an overall view of several processes can be considered in
a single report. If additional files need to be combined with data in an
existing Stats object, the add method can be used.
.. (such as the old system profiler).
.. versionchanged:: 2.5
The {stream} parameter was added.
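A sketch of coalescing several profile files into one Stats object. The
``work`` function and the temporary files are made up for this example. The
sketch reads the ``stats`` dictionary of the Stats object, which is an
implementation detail rather than a documented interface, only to confirm that
the two runs were merged.

```python
import cProfile
import os
import pstats
import tempfile

def work(n):
    return sorted(range(n, 0, -1))

# Produce two separate profile files, one per run.
paths = []
for n in (100, 200):
    fd, path = tempfile.mkstemp()
    os.close(fd)
    profiler = cProfile.Profile()
    profiler.runcall(work, n)
    profiler.dump_stats(path)
    paths.append(path)

# Statistics for identical functions are coalesced across the files.
# An alternative is pstats.Stats(paths[0]).add(paths[1]).
combined = pstats.Stats(paths[0], paths[1])

# Keys are (file, line, name); the second field of each value is the total
# call count (assumption: internal layout of Stats.stats).
work_calls = sum(stat[1] for func, stat in combined.stats.items()
                 if func[2] == 'work')
print(work_calls)

for path in paths:
    os.remove(path)
```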
The Stats Class
------------------------
Stats objects have the following methods:
Stats.strip_dirs()~
This method for the Stats class removes all leading path information
from file names. It is very useful in reducing the size of the printout to fit
within (close to) 80 columns. This method modifies the object, and the stripped
information is lost. After performing a strip operation, the object is
considered to have its entries in a "random" order, as it was just after object
initialization and loading. If strip_dirs causes two function names to
be indistinguishable (they are on the same line of the same filename, and have
the same function name), then the statistics for these two entries are
accumulated into a single entry.
Stats.add(filename[, ...])~
This method of the Stats class accumulates additional profiling
information into the current profiling object. Its arguments should refer to
filenames created by the corresponding version of profile.run or
cProfile.run. Statistics for identically named (re: file, line, name)
functions are automatically accumulated into single function statistics.
Stats.dump_stats(filename)~
Save the data loaded into the Stats object to a file named {filename}.
The file is created if it does not exist, and is overwritten if it already
exists. This is equivalent to the method of the same name on the
profile.Profile and cProfile.Profile classes.
.. versionadded:: 2.3
Stats.sort_stats(key[, ...])~
This method modifies the Stats object by sorting it according to the
supplied criteria. The argument is typically a string identifying the basis of
a sort (example: ``'time'`` or ``'name'``).
When more than one key is provided, then additional keys are used as secondary
criteria when there is equality in all keys selected before them. For example,
``sort_stats('name', 'file')`` will sort all the entries according to their
function name, and resolve all ties (identical function names) by sorting by
file name.
Abbreviations can be used for any key names, as long as the abbreviation is
unambiguous. The following are the keys currently defined:
+------------------+----------------------+
| Valid Arg | Meaning |
+==================+======================+
| ``'calls'`` | call count |
+------------------+----------------------+
| ``'cumulative'`` | cumulative time |
+------------------+----------------------+
| ``'file'`` | file name |
+------------------+----------------------+
| ``'module'`` | file name |
+------------------+----------------------+
| ``'pcalls'`` | primitive call count |
+------------------+----------------------+
| ``'line'`` | line number |
+------------------+----------------------+
| ``'name'`` | function name |
+------------------+----------------------+
| ``'nfl'`` | name/file/line |
+------------------+----------------------+
| ``'stdname'`` | standard name |
+------------------+----------------------+
| ``'time'`` | internal time |
+------------------+----------------------+
Note that all sorts on statistics are in descending order (placing most time
consuming items first), whereas name, file, and line number sorts are in
ascending order (alphabetical). The subtle distinction between ``'nfl'`` and
``'stdname'`` is that the standard name is a sort of the name as printed, which
means that the embedded line numbers get compared in an odd way. For example,
lines 3, 20, and 40 would (if the file names were the same) appear in the string
order 20, 3 and 40. In contrast, ``'nfl'`` does a numeric compare of the line
numbers. In fact, ``sort_stats('nfl')`` is the same as ``sort_stats('name',
'file', 'line')``.
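The difference between the two orderings comes down to comparing line numbers
as text or as numbers. The sketch below stands in for that comparison with
plain lists; it does not call into pstats.

```python
line_numbers = [3, 20, 40]

# 'stdname' compares the printed text, so digits are compared as characters.
as_strings = sorted(str(n) for n in line_numbers)

# 'nfl' compares the line numbers numerically.
as_numbers = sorted(line_numbers)

print(as_strings)   # ['20', '3', '40']
print(as_numbers)   # [3, 20, 40]
```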
For backward-compatibility reasons, the numeric arguments ``-1``, ``0``, ``1``,
and ``2`` are permitted. They are interpreted as ``'stdname'``, ``'calls'``,
``'time'``, and ``'cumulative'`` respectively. If this old style format
(numeric) is used, only one sort key (the numeric key) will be used, and
additional arguments will be silently ignored.
.. For compatibility with the old profiler,
Stats.reverse_order()~
This method for the Stats class reverses the ordering of the basic list
within the object. Note that by default ascending vs descending order is
properly selected based on the sort key of choice.
.. This method is provided primarily for compatibility with the old profiler.
Stats.print_stats([restriction, ...])~
This method for the Stats class prints out a report as described in the
profile.run definition.
The order of the printing is based on the last sort_stats operation done
on the object (subject to caveats in add and strip_dirs).
The arguments provided (if any) can be used to limit the list down to the
significant entries. Initially, the list is taken to be the complete set of
profiled functions. Each restriction is either an integer (to select a count of
lines), or a decimal fraction between 0.0 and 1.0 inclusive (to select a
percentage of lines), or a regular expression (to pattern match the standard
name that is printed; as of Python 1.5b1, this uses the Perl-style regular
expression syntax defined by the re (|py2stdlib-re|) module). If several restrictions are
provided, then they are applied sequentially. For example:: >
print_stats(.1, 'foo:')
<
would first limit the printing to the first 10% of the list, and then only print
functions that were part of filename ``.*foo:``. In contrast, the
command:: >
print_stats('foo:', .1)
<
would limit the list to all functions having file names ``.*foo:``, and
then proceed to only print the first 10% of them.
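A sketch of a regular expression restriction, with function names made up for
the example. Only entries whose standard name matches ``alpha_`` should remain
in the report.

```python
import cProfile
import pstats
try:
    from StringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO         # Python 3

def alpha_one():
    return sum(range(1000))

def alpha_two():
    return max(range(1000))

def beta():
    return min(range(1000))

def main():
    alpha_one()
    alpha_two()
    beta()

profiler = cProfile.Profile()
profiler.runcall(main)

report = StringIO()
stats = pstats.Stats(profiler, stream=report)
# The pattern is matched against the standard name, file:line(function).
stats.sort_stats('name').print_stats('alpha_')
text = report.getvalue()
print(text)
```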
Stats.print_callers([restriction, ...])~
This method for the Stats class prints a list of all functions that
called each function in the profiled database. The ordering is identical to
that provided by print_stats, and the definition of the restricting
argument is also identical. Each caller is reported on its own line. The
format differs slightly depending on the profiler that produced the stats:
* With profile (|py2stdlib-profile|), a number is shown in parentheses after each caller to
show how many times this specific call was made. For convenience, a second
non-parenthesized number repeats the cumulative time spent in the function
at the right.
* With cProfile (|py2stdlib-cprofile|), each caller is preceded by three numbers: the number of
times this specific call was made, and the total and cumulative times spent in
the current function while it was invoked by this specific caller.
Stats.print_callees([restriction, ...])~
This method for the Stats class prints a list of all functions that were
called by the indicated function. Aside from this reversal of direction of
calls (re: called vs was called by), the arguments and ordering are identical to
the print_callers method.
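A sketch of both reports for a tiny made-up call graph, in which ``outer``
calls ``helper`` three times. The exact column layout depends on the profiler
that produced the statistics, so the sketch only prints the text.

```python
import cProfile
import pstats
try:
    from StringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO         # Python 3

def helper():
    return sum(range(100))

def outer():
    for _ in range(3):
        helper()

profiler = cProfile.Profile()
profiler.runcall(outer)

report = StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.print_callers('helper')   # who called helper()?
stats.print_callees('outer')    # what did outer() call?
text = report.getvalue()
print(text)
```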
Limitations
===========
One limitation has to do with accuracy of timing information. There is a
fundamental problem with deterministic profilers involving accuracy. The most
obvious restriction is that the underlying "clock" is only ticking at a rate
(typically) of about .001 seconds. Hence no measurements will be more accurate
than the underlying clock. If enough measurements are taken, then the "error"
will tend to average out. Unfortunately, removing this first error induces a
second source of error.
The second problem is that it "takes a while" from when an event is dispatched
until the profiler's call to get the time actually {gets} the state of the
clock. Similarly, there is a certain lag when exiting the profiler event
handler from the time that the clock's value was obtained (and then squirreled
away), until the user's code is once again executing. As a result, functions
that are called many times, or call many functions, will typically accumulate
this error. The error that accumulates in this fashion is typically less than
the accuracy of the clock (less than one clock tick), but it {can} accumulate
and become very significant.
The problem is more important with profile (|py2stdlib-profile|) than with the lower-overhead
cProfile (|py2stdlib-cprofile|). For this reason, profile (|py2stdlib-profile|) provides a means of
calibrating itself for a given platform so that this error can be
probabilistically (on the average) removed. After the profiler is calibrated, it
will be more accurate (in a least square sense), but it will sometimes produce
negative numbers (when call counts are exceptionally low, and the gods of
probability work against you :-).) Do {not} be alarmed by negative numbers in
the profile. They should {only} appear if you have calibrated your profiler,
and the results are actually better than without calibration.
Calibration
===========
The profiler of the profile (|py2stdlib-profile|) module subtracts a constant from each event
handling time to compensate for the overhead of calling the time function, and
socking away the results. By default, the constant is 0. The following
procedure can be used to obtain a better constant for a given platform (see
discussion in section Limitations above). :: >
import profile
pr = profile.Profile()
for i in range(5):
    print pr.calibrate(10000)
<
The method executes the number of Python calls given by the argument, directly
and again under the profiler, measuring the time for both. It then computes the
hidden overhead per profiler event, and returns that as a float. For example,
on an 800 MHz Pentium running Windows 2000, and using Python's time.clock() as
the timer, the magical number is about 12.5e-6.
The object of this exercise is to get a fairly consistent result. If your
computer is {very} fast, or your timer function has poor resolution, you might
have to pass 100000, or even 1000000, to get consistent results.
When you have a consistent answer, there are three ways you can use it: [#]_ :: >
import profile
# 1. Apply computed bias to all Profile instances created hereafter.
profile.Profile.bias = your_computed_bias
# 2. Apply computed bias to a specific Profile instance.
pr = profile.Profile()
pr.bias = your_computed_bias
# 3. Specify computed bias in instance constructor.
pr = profile.Profile(bias=your_computed_bias)
<
If you have a choice, you are better off choosing a smaller constant, and then
your results will "less often" show up as negative in profile statistics.
Extensions --- Deriving Better Profilers
========================================
The Profile classes of both modules, profile (|py2stdlib-profile|) and cProfile (|py2stdlib-cprofile|),
were written so that derived classes could be developed to extend the profiler.
The details are not described here, as doing this successfully requires an
expert understanding of how the Profile class works internally. Study
the source code of the module carefully if you want to pursue this.
If all you want to do is change how current time is determined (for example, to
force use of wall-clock time or elapsed process time), pass the timing function
you want to the Profile class constructor:: >
pr = profile.Profile(your_time_func)
<
The resulting profiler will then call your_time_func.
profile.Profile
your_time_func should return a single number, or a list of numbers whose
sum is the current time (like what os.times returns). If the function
returns a single time number, or the list of returned numbers has length 2, then
you will get an especially fast version of the dispatch routine.
Be warned that you should calibrate the profiler class for the timer function
that you choose. For most machines, a timer that returns a lone integer value
will provide the best results in terms of low overhead during profiling.
(os.times is {pretty} bad, as it returns a tuple of floating point
values). If you want to substitute a better timer in the cleanest fashion,
derive a class and hardwire a replacement dispatch method that best handles your
timer call, along with the appropriate calibration constant.
cProfile.Profile
your_time_func should return a single number. If it returns plain
integers, you can also invoke the class constructor with a second argument
specifying the real duration of one unit of time. For example, if
your_integer_time_func returns times measured in thousandths of a second,
you would construct the Profile instance as follows:: >
pr = cProfile.Profile(your_integer_time_func, 0.001)
<
As the cProfile.Profile class cannot be calibrated, custom timer
functions should be used with care and should be as fast as possible. For the
best results with a custom timer, it might be necessary to hard-code it in the C
source of the internal _lsprof module.
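A minimal sketch of the integer-timer case for cProfile.Profile follows. The timer millisecond_timer and the function workload are hypothetical names introduced for illustration. A timer based on time.time() is used only because it is easy to read, not because it is a good choice for real measurements.

```python
import cProfile
import pstats
import time


def millisecond_timer():
    """Hypothetical integer timer: whole milliseconds since the epoch."""
    return int(time.time() * 1000)


def workload():
    """Hypothetical function to profile."""
    return sum(i * i for i in range(200000))


# The second argument states that one timer unit lasts 0.001 seconds.
profiler = cProfile.Profile(millisecond_timer, 0.001)
result = profiler.runcall(workload)

stats = pstats.Stats(profiler)
print("total time in seconds: %.3f" % stats.total_tt)
```

Because the timer only resolves whole milliseconds, short functions will show up with zero time, which is one reason to prefer the default timer unless there is a specific need.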
.. rubric:: Footnotes
.. [#] Updated and converted to LaTeX by Guido van Rossum. Further updated by Armin
Rigo to integrate the documentation for the new cProfile (|py2stdlib-cprofile|) module of Python
2.5.
.. [#] Prior to Python 2.2, it was necessary to edit the profiler source code to embed
the bias as a literal number. You still can, but that method is no longer
described, because it is no longer needed.
==============================================================================
*py2stdlib-pty*
pty~
:platform: Linux
:synopsis: Pseudo-Terminal Handling for Linux.
The pty (|py2stdlib-pty|) module defines operations for handling the pseudo-terminal
concept: starting another process and being able to write to and read from its
controlling terminal programmatically.
Because pseudo-terminal handling is highly platform dependent, there is code to
do it only for Linux. (The Linux code is supposed to work on other platforms,
but hasn't been tested yet.)
The pty (|py2stdlib-pty|) module defines the following functions:
fork()~
Fork. Connect the child's controlling terminal to a pseudo-terminal. Return
value is ``(pid, fd)``. Note that the child gets {pid} 0, and the {fd} is
{invalid}. The parent's return value is the {pid} of the child, and {fd} is a
file descriptor connected to the child's controlling terminal (and also to the
child's standard input and output).
openpty()~
Open a new pseudo-terminal pair, using os.openpty if possible, or
emulation code for generic Unix systems. Return a pair of file descriptors
``(master, slave)``, for the master and the slave end, respectively.
spawn(argv[, master_read[, stdin_read]])~
Spawn a process, and connect its controlling terminal with the current
process's standard io. This is often used to baffle programs which insist on
reading from the controlling terminal.
The functions {master_read} and {stdin_read} should be functions which read from
a file descriptor. The defaults try to read 1024 bytes each time they are
called.
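The relationship between the two descriptors returned by openpty can be seen in a short sketch such as the following. It assumes a Unix-like system where pseudo-terminals are available. The payload deliberately contains no newline, because the terminal layer may translate line endings.

```python
import os
import pty

master_fd, slave_fd = pty.openpty()
try:
    # Data written to the slave end is terminal output, which becomes
    # readable on the master end.
    os.write(slave_fd, b"ping")
    received = os.read(master_fd, 1024)
finally:
    os.close(slave_fd)
    os.close(master_fd)

print(received)
```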
==============================================================================
*py2stdlib-pwd*
pwd~
:platform: Unix
:synopsis: The password database (getpwnam() and friends).
This module provides access to the Unix user account and password database. It
is available on all Unix versions.
Password database entries are reported as a tuple-like object, whose attributes
correspond to the members of the ``passwd`` structure (Attribute field below,
see ``<pwd.h>``):
+-------+---------------+-----------------------------+
| Index | Attribute | Meaning |
+=======+===============+=============================+
| 0 | ``pw_name`` | Login name |
+-------+---------------+-----------------------------+
| 1 | ``pw_passwd`` | Optional encrypted password |
+-------+---------------+-----------------------------+
| 2 | ``pw_uid`` | Numerical user ID |
+-------+---------------+-----------------------------+
| 3 | ``pw_gid`` | Numerical group ID |
+-------+---------------+-----------------------------+
| 4 | ``pw_gecos`` | User name or comment field |
+-------+---------------+-----------------------------+
| 5 | ``pw_dir`` | User home directory |
+-------+---------------+-----------------------------+
| 6 | ``pw_shell`` | User command interpreter |
+-------+---------------+-----------------------------+
The uid and gid items are integers, all others are strings. KeyError is
raised if the entry asked for cannot be found.
.. note::
.. index:: module: crypt
In traditional Unix the field ``pw_passwd`` usually contains a password
encrypted with a DES derived algorithm (see module crypt (|py2stdlib-crypt|)). However most
modern unices use a so-called {shadow password} system. On those unices the
{pw_passwd} field only contains an asterisk (``'*'``) or the letter ``'x'``
where the encrypted password is stored in a file /etc/shadow which is
not world readable. Whether the {pw_passwd} field contains anything useful is
system-dependent. If available, the spwd (|py2stdlib-spwd|) module should be used where
access to the encrypted password is required.
It defines the following items:
getpwuid(uid)~
Return the password database entry for the given numeric user ID.
getpwnam(name)~
Return the password database entry for the given user name.
getpwall()~
Return a list of all available password database entries, in arbitrary order.
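The sketch below shows one way to look up an entry and read its fields by attribute name. The helper describe_user is a hypothetical name. It returns None when the lookup raises KeyError, which can happen for a user ID that has no entry in the database.

```python
import os
import pwd


def describe_user(uid):
    """Return selected passwd fields for uid, or None if there is no entry."""
    try:
        entry = pwd.getpwuid(uid)
    except KeyError:
        return None
    return {
        "name": entry.pw_name,    # same value as entry[0]
        "uid": entry.pw_uid,      # same value as entry[2]
        "home": entry.pw_dir,     # same value as entry[5]
        "shell": entry.pw_shell,  # same value as entry[6]
    }


info = describe_user(os.getuid())
print(info)
```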
.. seealso::
Module grp (|py2stdlib-grp|)
An interface to the group database, similar to this.
Module spwd (|py2stdlib-spwd|)
An interface to the shadow password database, similar to this.
==============================================================================
*py2stdlib-py_compile*
py_compile~
:synopsis: Generate byte-code files from Python source files.
.. documentation based on module docstrings
.. index:: pair: file; byte-code
The py_compile (|py2stdlib-py_compile|) module provides a function to generate a byte-code file
from a source file, and another function used when the module source file is
invoked as a script.
Though not often needed, this function can be useful when installing modules for
shared use, especially if some of the users may not have permission to write the
byte-code cache files in the directory containing the source code.
PyCompileError~
Exception raised when an error occurs while attempting to compile the file.
compile(file[, cfile[, dfile[, doraise]]])~
Compile a source file to byte-code and write out the byte-code cache file. The
source code is loaded from the file name {file}. The byte-code is written to
{cfile}, which defaults to {file} ``+`` ``'c'`` (``'o'`` if optimization is
enabled in the current interpreter). If {dfile} is specified, it is used as the
name of the source file in error messages instead of {file}. If {doraise} is
true, a PyCompileError is raised when an error is encountered while
compiling {file}. If {doraise} is false (the default), an error string is
written to ``sys.stderr``, but no exception is raised.
main([args])~
Compile several source files. The files named in {args} (or on the command
line, if {args} is not specified) are compiled and the resulting bytecode is
cached in the normal manner. This function does not search a directory
structure to locate source files; it only compiles files named explicitly.
When this module is run as a script, the main function is used to compile all the
files named on the command line. The exit status is nonzero if one of the files
could not be compiled.
.. versionchanged:: 2.6
Added the nonzero exit status when module is run as a script.
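The effect of {cfile} and {doraise} can be tried with a sketch like the one below. The file names good.py, bad.py and the output names are made up for the example, and everything is written to a temporary directory.

```python
import os
import py_compile
import tempfile

workdir = tempfile.mkdtemp()
good_source = os.path.join(workdir, "good.py")
bad_source = os.path.join(workdir, "bad.py")
with open(good_source, "w") as handle:
    handle.write("answer = 42\n")
with open(bad_source, "w") as handle:
    handle.write("def broken(:\n")  # deliberate syntax error

# Write the byte-code to an explicit location instead of the default one.
good_cache = os.path.join(workdir, "good.compiled")
py_compile.compile(good_source, cfile=good_cache, doraise=True)
compiled_ok = os.path.exists(good_cache)

# With doraise true, the syntax error surfaces as PyCompileError instead
# of a message on sys.stderr.
bad_cache = os.path.join(workdir, "bad.compiled")
try:
    py_compile.compile(bad_source, cfile=bad_cache, doraise=True)
    compile_failed = False
except py_compile.PyCompileError:
    compile_failed = True

print(compiled_ok, compile_failed)
```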
.. seealso::
Module compileall (|py2stdlib-compileall|)
Utilities to compile all Python source files in a directory tree.
==============================================================================
*py2stdlib-pyclbr*
pyclbr~
:synopsis: Supports information extraction for a Python class browser.
The pyclbr (|py2stdlib-pyclbr|) module can be used to determine some limited information
about the classes, methods and top-level functions defined in a module. The
information provided is sufficient to implement a traditional three-pane
class browser. The information is extracted from the source code rather
than by importing the module, so this module is safe to use with untrusted
code. This restriction makes it impossible to use this module with modules
not implemented in Python, including all standard and optional extension
modules.
readmodule(module[, path=None])~
Read a module and return a dictionary mapping class names to class
descriptor objects. The parameter {module} should be the name of a
module as a string; it may be the name of a module within a package. The
{path} parameter should be a sequence, and is used to augment the value
of ``sys.path``, which is used to locate module source code.
readmodule_ex(module[, path=None])~
Like readmodule, but the returned dictionary, in addition to
mapping class names to class descriptor objects, also maps top-level
function names to function descriptor objects. Moreover, if the module
being read is a package, the key ``'__path__'`` in the returned
dictionary has as its value a list which contains the package search
path.
Class Objects
-------------
The Class objects used as values in the dictionary returned by
readmodule and readmodule_ex provide the following data
members:
Class.module~
The name of the module defining the class described by the class descriptor.
Class.name~
The name of the class.
Class.super~
A list of Class objects which describe the immediate base
classes of the class being described. Classes which are named as
superclasses but which are not discoverable by readmodule are
listed as a string with the class name instead of as Class
objects.
Class.methods~
A dictionary mapping method names to line numbers.
Class.file~
Name of the file containing the ``class`` statement defining the class.
Class.lineno~
The line number of the ``class`` statement within the file named by
Class.file.
Function Objects
----------------
The Function objects used as values in the dictionary returned by
readmodule_ex provide the following data members:
Function.module~
The name of the module defining the function described by the function
descriptor.
Function.name~
The name of the function.
Function.file~
Name of the file containing the ``def`` statement defining the function.
Function.lineno~
The line number of the ``def`` statement within the file named by
Function.file.
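The descriptor attributes above can be explored with a sketch like the following. It writes a small throwaway module to a temporary directory and passes that directory as {path}. The module name pyclbr_demo_module and its contents are invented for the example.

```python
import os
import pyclbr
import tempfile

# Source of a hypothetical module; it is parsed, never imported.
source = (
    "class Base(object):\n"
    "    pass\n"
    "\n"
    "class Greeter(Base):\n"
    "    def hello(self):\n"
    "        return 'hi'\n"
    "\n"
    "def helper():\n"
    "    return 42\n"
)

workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "pyclbr_demo_module.py"), "w") as handle:
    handle.write(source)

descriptors = pyclbr.readmodule_ex("pyclbr_demo_module", path=[workdir])
for name in sorted(descriptors):
    descriptor = descriptors[name]
    kind = type(descriptor).__name__
    print("%s %s at line %d" % (kind, name, descriptor.lineno))

greeter = descriptors["Greeter"]
# Base classes found in the same module are Class objects; others are
# plain strings, so fall back to the value itself when there is no name.
base_names = [getattr(base, "name", base) for base in greeter.super]
print(greeter.methods, base_names)
```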
==============================================================================
*py2stdlib-pydoc*
pydoc~
:synopsis: Documentation generator and online help system.
.. versionadded:: 2.1
.. index::
single: documentation; generation
single: documentation; online
single: help; online
The pydoc (|py2stdlib-pydoc|) module automatically generates documentation from Python
modules. The documentation can be presented as pages of text on the console,
served to a Web browser, or saved to HTML files.
The built-in function help invokes the online help system in the
interactive interpreter, which uses pydoc (|py2stdlib-pydoc|) to generate its documentation
as text on the console. The same text documentation can also be viewed from
outside the Python interpreter by running pydoc (|py2stdlib-pydoc|) as a script at the
operating system's command prompt. For example, running :: >
pydoc sys
<
at a shell prompt will display documentation on the sys (|py2stdlib-sys|) module, in a
style similar to the manual pages shown by the Unix man command. The
argument to pydoc (|py2stdlib-pydoc|) can be the name of a function, module, or package,
or a dotted reference to a class, method, or function within a module or module
in a package. If the argument to pydoc (|py2stdlib-pydoc|) looks like a path (that is,
it contains the path separator for your operating system, such as a slash in
Unix), and refers to an existing Python source file, then documentation is
produced for that file.
.. note::
In order to find objects and their documentation, pydoc (|py2stdlib-pydoc|) imports the
module(s) to be documented. Therefore, any code on module level will be
executed on that occasion. Use an ``if __name__ == '__main__':`` guard to
only execute code when a file is invoked as a script and not just imported.
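The guard mentioned in the note looks roughly like this. The functions build_report and main are hypothetical; the point is only that nothing with side effects runs at import time.

```python
def build_report():
    """Return the text this hypothetical script prints."""
    return "report ready"


def main():
    # Side effects live here, so importing the module (as pydoc does)
    # never triggers them.
    print(build_report())


if __name__ == '__main__':
    main()
```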
Specifying a -w flag before the argument will cause HTML documentation
to be written out to a file in the current directory, instead of displaying text
on the console.
Specifying a -k flag before the argument will search the synopsis
lines of all available modules for the keyword given as the argument, again in a
manner similar to the Unix man command. The synopsis line of a
module is the first line of its documentation string.
You can also use pydoc (|py2stdlib-pydoc|) to start an HTTP server on the local machine
that will serve documentation to visiting Web browsers. pydoc (|py2stdlib-pydoc|)
-p 1234 will start an HTTP server on port 1234, allowing you to browse
the documentation at ``http://localhost:1234/`` in your preferred Web browser.
pydoc (|py2stdlib-pydoc|) -g will start the server and additionally bring up a
small Tkinter (|py2stdlib-tkinter|)-based graphical interface to help you search for
documentation pages.
When pydoc (|py2stdlib-pydoc|) generates documentation, it uses the current environment
and path to locate modules. Thus, invoking pydoc (|py2stdlib-pydoc|) spam
documents precisely the version of the module you would get if you started the
Python interpreter and typed ``import spam``.
Module docs for core modules are assumed to reside in
http://docs.python.org/library/. This can be overridden by setting the
PYTHONDOCS environment variable to a different URL or to a local
directory containing the Library Reference Manual pages.
==============================================================================
*py2stdlib-pixmapwrapper*
PixMapWrapper~
:platform: Mac
:synopsis: Wrapper for PixMap objects.
:deprecated:
PixMapWrapper (|py2stdlib-pixmapwrapper|) wraps a PixMap object with a Python object that allows
access to the fields by name. It also has methods to convert to and from
PIL images.
.. deprecated:: 2.6
videoreader (|py2stdlib-videoreader|) --- Read QuickTime movies
--------------------------------------------
==============================================================================
*py2stdlib-queue*
Queue~
:synopsis: A synchronized queue class.
.. note::
The Queue (|py2stdlib-queue|) module has been renamed to queue in Python 3.0. The
2to3 tool will automatically adapt imports when converting your
sources to 3.0.
The Queue (|py2stdlib-queue|) module implements multi-producer, multi-consumer queues.
It is especially useful in threaded programming when information must be
exchanged safely between multiple threads. The Queue (|py2stdlib-queue|) class in this
module implements all the required locking semantics. It depends on the
availability of thread support in Python; see the threading (|py2stdlib-threading|)
module.
The module implements three types of queue, which differ only in the order in
which entries are retrieved. In a FIFO queue, the first tasks added are
the first retrieved. In a LIFO queue, the most recently added entry is
the first retrieved (operating like a stack). With a priority queue,
the entries are kept sorted (using the heapq (|py2stdlib-heapq|) module) and the
lowest valued entry is retrieved first.
The Queue (|py2stdlib-queue|) module defines the following classes and exceptions:
Queue(maxsize=0)~
Constructor for a FIFO queue. {maxsize} is an integer that sets the upper bound
limit on the number of items that can be placed in the queue. Insertion will
block once this size has been reached, until queue items are consumed. If
{maxsize} is less than or equal to zero, the queue size is infinite.
LifoQueue(maxsize=0)~
Constructor for a LIFO queue. {maxsize} is an integer that sets the upper bound
limit on the number of items that can be placed in the queue. Insertion will
block once this size has been reached, until queue items are consumed. If
{maxsize} is less than or equal to zero, the queue size is infinite.
.. versionadded:: 2.6
PriorityQueue(maxsize=0)~
Constructor for a priority queue. {maxsize} is an integer that sets the upper bound
limit on the number of items that can be placed in the queue. Insertion will
block once this size has been reached, until queue items are consumed. If
{maxsize} is less than or equal to zero, the queue size is infinite.
The lowest valued entries are retrieved first (the lowest valued entry is the
one returned by ``sorted(list(entries))[0]``). A typical pattern for entries
is a tuple in the form: ``(priority_number, data)``.
.. versionadded:: 2.6
Empty~
Exception raised when non-blocking get (or get_nowait) is called
on a Queue (|py2stdlib-queue|) object which is empty.
Full~
Exception raised when non-blocking put (or put_nowait) is called
on a Queue (|py2stdlib-queue|) object which is full.
.. seealso::
collections.deque is an alternative implementation of unbounded
queues with fast atomic append and popleft operations that
do not require locking.
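The three retrieval orders can be compared side by side with a sketch such as the one below. The helpers fill and drain and the sample jobs are hypothetical. The import falls back to the Python 3 module name so the sketch runs on either version.

```python
try:
    import Queue as queue_module  # Python 2 name
except ImportError:
    import queue as queue_module  # Python 3 name


def fill(q, items):
    """Put every item into the queue and return the queue."""
    for item in items:
        q.put(item)
    return q


def drain(q):
    """Return all items in retrieval order.

    Relying on empty() is safe here only because a single thread is used.
    """
    items = []
    while not q.empty():
        items.append(q.get_nowait())
    return items


jobs = [(2, 'write'), (1, 'plan'), (3, 'review')]

fifo_order = drain(fill(queue_module.Queue(), jobs))
lifo_order = drain(fill(queue_module.LifoQueue(), jobs))
priority_order = drain(fill(queue_module.PriorityQueue(), jobs))

print(fifo_order)      # insertion order
print(lifo_order)      # reverse insertion order
print(priority_order)  # lowest priority number first
```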
Queue Objects
-------------
Queue objects (Queue (|py2stdlib-queue|), LifoQueue, or PriorityQueue)
provide the public methods described below.
Queue.qsize()~
Return the approximate size of the queue. Note, qsize() > 0 doesn't
guarantee that a subsequent get() will not block, nor will qsize() < maxsize
guarantee that put() will not block.
Queue.empty()~
Return ``True`` if the queue is empty, ``False`` otherwise. If empty()
returns ``True`` it doesn't guarantee that a subsequent call to put()
will not block. Similarly, if empty() returns ``False`` it doesn't
guarantee that a subsequent call to get() will not block.
Queue.full()~
Return ``True`` if the queue is full, ``False`` otherwise. If full()
returns ``True`` it doesn't guarantee that a subsequent call to get()
will not block. Similarly, if full() returns ``False`` it doesn't
guarantee that a subsequent call to put() will not block.
Queue.put(item[, block[, timeout]])~
Put {item} into the queue. If optional args {block} is true and {timeout} is
None (the default), block if necessary until a free slot is available. If
{timeout} is a positive number, it blocks at most {timeout} seconds and raises
the Full exception if no free slot was available within that time.
Otherwise ({block} is false), put an item on the queue if a free slot is
immediately available, else raise the Full exception ({timeout} is
ignored in that case).
.. versionadded:: 2.3
The {timeout} parameter.
Queue.put_nowait(item)~
Equivalent to ``put(item, False)``.
Queue.get([block[, timeout]])~
Remove and return an item from the queue. If optional args {block} is true and
{timeout} is None (the default), block if necessary until an item is available.
If {timeout} is a positive number, it blocks at most {timeout} seconds and
raises the Empty exception if no item was available within that time.
Otherwise ({block} is false), return an item if one is immediately available,
else raise the Empty exception ({timeout} is ignored in that case).
.. versionadded:: 2.3
The {timeout} parameter.
Queue.get_nowait()~
Equivalent to ``get(False)``.
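The blocking rules for put and get can be seen with a bounded queue, as in this sketch. The variable names and the 0.05 second timeout are arbitrary choices for the example.

```python
try:
    import Queue as queue_module  # Python 2 name
except ImportError:
    import queue as queue_module  # Python 3 name

q = queue_module.Queue(maxsize=1)
q.put('first')  # fills the only slot

# Non-blocking put on a full queue raises Full immediately.
try:
    q.put_nowait('second')
    put_outcome = 'stored'
except queue_module.Full:
    put_outcome = 'full'

item = q.get()  # returns at once, because an item is available

# Blocking get with a timeout raises Empty once the timeout expires.
try:
    q.get(True, 0.05)
    get_outcome = 'got item'
except queue_module.Empty:
    get_outcome = 'empty'

print(put_outcome, item, get_outcome)
```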
Two methods are offered to support tracking whether enqueued tasks have been
fully processed by daemon consumer threads.
Queue.task_done()~
Indicate that a formerly enqueued task is complete. Used by queue consumer
threads. For each get used to fetch a task, a subsequent call to
task_done tells the queue that the processing on the task is complete.
If a join is currently blocking, it will resume when all items have been
processed (meaning that a task_done call was received for every item
that had been put into the queue).
Raises a ValueError if called more times than there were items placed in
the queue.
.. versionadded:: 2.5
Queue.join()~
Blocks until all items in the queue have been gotten and processed.
The count of unfinished tasks goes up whenever an item is added to the queue.
The count goes down whenever a consumer thread calls task_done to
indicate that the item was retrieved and all work on it is complete. When the
count of unfinished tasks drops to zero, join unblocks.
.. versionadded:: 2.5
Example of how to wait for enqueued tasks to be completed:: >
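from Queue import Queue
from threading import Thread
# do_work, source and num_worker_threads are placeholders supplied by the caller.
def worker():
    while True:
        item = q.get()
        do_work(item)
        q.task_done()
q = Queue()
for i in range(num_worker_threads):
    t = Thread(target=worker)
    t.daemon = True
    t.start()
for item in source():
    q.put(item)
q.join()       # block until all tasks are done
<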
==============================================================================
*py2stdlib-quopri*
quopri~
:synopsis: Encode and decode files using the MIME quoted-printable encoding.
.. index::
pair: quoted-printable; encoding
single: MIME; quoted-printable encoding
This module performs quoted-printable transport encoding and decoding, as
defined in RFC 1521: "MIME (Multipurpose Internet Mail Extensions) Part One:
Mechanisms for Specifying and Describing the Format of Internet Message Bodies".
The quoted-printable encoding is designed for data where there are relatively
few nonprintable characters; the base64 encoding scheme available via the
base64 (|py2stdlib-base64|) module is more compact if there are many such characters, as when
sending a graphics file.
decode(input, output[,header])~
Decode the contents of the {input} file and write the resulting decoded binary
data to the {output} file. {input} and {output} must either be file objects or
objects that mimic the file object interface. {input} will be read until
``input.readline()`` returns an empty string. If the optional argument {header}
is present and true, underscore will be decoded as space. This is used to decode
"Q"-encoded headers as described in RFC 1522: "MIME (Multipurpose Internet
Mail Extensions) Part Two: Message Header Extensions for Non-ASCII Text".
encode(input, output, quotetabs)~
Encode the contents of the {input} file and write the resulting quoted-printable
data to the {output} file. {input} and {output} must either be file objects or
objects that mimic the file object interface. {input} will be read until
``input.readline()`` returns an empty string. {quotetabs} is a flag which
controls whether to encode embedded spaces and tabs; when true it encodes such
embedded whitespace, and when false it leaves them unencoded. Note that spaces
and tabs appearing at the end of lines are always encoded, as per RFC 1521.
decodestring(s[,header])~
Like decode, except that it accepts a source string and returns the
corresponding decoded string.
encodestring(s[, quotetabs])~
Like encode, except that it accepts a source string and returns the
corresponding encoded string. {quotetabs} is optional (defaulting to 0), and is
passed straight through to encode.
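A round trip through the string functions might look like the sketch below. The sample text is arbitrary. Byte-string literals are used so that the same code runs on Python 2.6 or later and on Python 3.

```python
import quopri

original = b"caf\xc3\xa9 = coffee"

# Non-ASCII bytes and the '=' sign are escaped; embedded spaces are not.
encoded = quopri.encodestring(original)

# With quotetabs true, embedded spaces and tabs are escaped as well.
encoded_quoting_spaces = quopri.encodestring(original, quotetabs=True)

decoded = quopri.decodestring(encoded)

# With header true, an underscore is decoded as a space.
header_text = quopri.decodestring(b"hello_world", header=True)

print(encoded)
print(encoded_quoting_spaces)
print(decoded == original, header_text)
```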
.. seealso::
Module mimify (|py2stdlib-mimify|)
General utilities for processing of MIME messages.
Module base64 (|py2stdlib-base64|)
Encode and decode MIME base64 data
==============================================================================
*py2stdlib-random*
random~
:synopsis: Generate pseudo-random numbers with various common distributions.
This module implements pseudo-random number generators for various
distributions.
For integers, uniform selection from a range. For sequences, uniform selection
of a random element, a function to generate a random permutation of a list
in-place, and a function for random sampling without replacement.
On the real line, there are functions to compute uniform, normal (Gaussian),
lognormal, negative exponential, gamma, and beta distributions. For generating
distributions of angles, the von Mises distribution is available.
Almost all module functions depend on the basic function random (|py2stdlib-random|), which
generates a random float uniformly in the semi-open range [0.0, 1.0). Python
uses the Mersenne Twister as the core generator. It produces 53-bit precision
floats and has a period of 2**19937-1. The underlying implementation in C is
both fast and threadsafe. The Mersenne Twister is one of the most extensively
tested random number generators in existence. However, being completely
deterministic, it is not suitable for all purposes, and is completely unsuitable
for cryptographic purposes.
The functions supplied by this module are actually bound methods of a hidden
instance of the random.Random class. You can instantiate your own
instances of Random to get generators that don't share state. This is
especially useful for multi-threaded programs, creating a different instance of
Random for each thread, and using the jumpahead method to make
it likely that the generated sequences seen by each thread don't overlap.
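That separate instances do not share state can be checked with a short sketch. The seed value 12345 is arbitrary, and jumpahead is left out on purpose so that the example also runs on Python versions that lack it.

```python
import random

# Two independent generators, seeded identically.
first = random.Random(12345)
second = random.Random(12345)

first_values = [first.random() for _ in range(3)]

# Drawing from the hidden module-level instance does not disturb `second`.
random.random()

second_values = [second.random() for _ in range(3)]

print(first_values == second_values)
```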
Class Random can also be subclassed if you want to use a different
basic generator of your own devising: in that case, override the random (|py2stdlib-random|),
seed, getstate, setstate and jumpahead methods.
Optionally, a new generator can supply a getrandbits method --- this
allows randrange to produce selections over an arbitrarily large range.
.. versionadded:: 2.4
the getrandbits method.
As an example of subclassing, the random (|py2stdlib-random|) module provides the
WichmannHill class that implements an alternative generator in pure
Python. The class provides a backward compatible way to reproduce results from
earlier versions of Python, which used the Wichmann-Hill algorithm as the core
generator. Note that this Wichmann-Hill generator can no longer be recommended:
its period is too short by contemporary standards, and the sequence generated is
known to fail some stringent randomness tests. See the references below for a
recent variant that repairs these flaws.
.. versionchanged:: 2.3
MersenneTwister replaced Wichmann-Hill as the default generator.
The random (|py2stdlib-random|) module also provides the SystemRandom class which
uses the system function os.urandom to generate random numbers
from sources provided by the operating system.
Bookkeeping functions:
seed([x])~
Initialize the basic random number generator. Optional argument {x} can be any
hashable object. If {x} is omitted or ``None``, current system time is used;
current system time is also used to initialize the generator when the module is
first imported. If randomness sources are provided by the operating system,
they are used instead of the system time (see the os.urandom function
for details on availability).
.. versionchanged:: 2.4
formerly, operating system resources were not used.
If {x} is not ``None`` or an int or long, ``hash(x)`` is used instead. If {x} is
an int or long, {x} is used directly.
getstate()~
Return an object capturing the current internal state of the generator. This
object can be passed to setstate to restore the state.
.. versionadded:: 2.1
.. versionchanged:: 2.6
State values produced in Python 2.6 cannot be loaded into earlier versions.
setstate(state)~
{state} should have been obtained from a previous call to getstate, and
setstate restores the internal state of the generator to what it was at
the time getstate was called.
.. versionadded:: 2.1
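Saving and restoring the generator state makes a sequence repeatable, as this sketch suggests. A private Random instance is used here so that the module-level generator is left untouched; the same two methods are available as module functions.

```python
import random

generator = random.Random()

saved_state = generator.getstate()
first_run = [generator.random() for _ in range(5)]

# Rewind to the saved point and draw the same five numbers again.
generator.setstate(saved_state)
second_run = [generator.random() for _ in range(5)]

print(first_run == second_run)
```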
jumpahead(n)~
Change the internal state to one different from and likely far away from the
current state. {n} is a non-negative integer which is used to scramble the
current state vector. This is most useful in multi-threaded programs, in
conjunction with multiple instances of the Random class:
setstate or seed can be used to force all instances into the
same internal state, and then jumpahead can be used to force the
instances' states far apart.
.. versionadded:: 2.1
.. versionchanged:: 2.3
Instead of jumping to a specific state, {n} steps ahead, ``jumpahead(n)``
jumps to another state likely to be separated by many steps.
getrandbits(k)~
Returns a Python long int with {k} random bits. This method is supplied
with the MersenneTwister generator and some other generators may also provide it
as an optional part of the API. When available, getrandbits enables
randrange to handle arbitrarily large ranges.
.. versionadded:: 2.4
Functions for integers:
randrange([start,] stop[, step])~
Return a randomly selected element from ``range(start, stop, step)``. This is
equivalent to ``choice(range(start, stop, step))``, but doesn't actually build a
range object.
.. versionadded:: 1.5.2
randint(a, b)~
Return a random integer {N} such that ``a <= N <= b``.
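The two integer functions differ in how they treat the upper bound. The sketch
below assumes a fixed, hypothetical seed and simply collects many draws to show
which values can occur.

```python
import random

rng = random.Random(0)  # hypothetical seed, for repeatability only

# randrange excludes stop, like range(); here only even numbers below 10.
evens = [rng.randrange(0, 10, 2) for _ in range(200)]

# randint includes both endpoints, so 6 is a possible result.
rolls = [rng.randint(1, 6) for _ in range(200)]
```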
Functions for sequences:
choice(seq)~
Return a random element from the non-empty sequence {seq}. If {seq} is empty,
raises IndexError.
shuffle(x[, random])~
Shuffle the sequence {x} in place. The optional argument {random} is a
0-argument function returning a random float in [0.0, 1.0); by default, this is
the function random (|py2stdlib-random|).
Note that for even rather small ``len(x)``, the total number of permutations of
{x} is larger than the period of most random number generators; this implies
that most permutations of a long sequence can never be generated.
sample(population, k)~
Return a {k} length list of unique elements chosen from the population sequence.
Used for random sampling without replacement.
.. versionadded:: 2.3
Returns a new list containing elements from the population while leaving the
original population unchanged. The resulting list is in selection order so that
all sub-slices will also be valid random samples. This allows raffle winners
(the sample) to be partitioned into grand prize and second place winners (the
subslices).
Members of the population need not be hashable or unique. If the population
contains repeats, then each occurrence is a possible selection in the sample.
To choose a sample from a range of integers, use an xrange object as an
argument. This is especially fast and space efficient for sampling from a large
population: ``sample(xrange(10000000), 60)``.
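The raffle scenario described above might look like the following sketch. The
ticket names, the seed, and the prize split are assumptions made up for the
illustration.

```python
import random

rng = random.Random(2010)                    # hypothetical seed
tickets = ['t%02d' % i for i in range(50)]   # hypothetical ticket names
original = list(tickets)

winners = rng.sample(tickets, 5)   # five distinct tickets, in selection order
grand_prize = winners[:1]          # a sub-slice is itself a valid sample
second_place = winners[1:]
```

The population list is not modified, so `tickets` still equals `original`.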
The following functions generate specific real-valued distributions. Function
parameters are named after the corresponding variables in the distribution's
equation, as used in common mathematical practice; most of these equations can
be found in any statistics text.
random()~
Return the next random floating point number in the range [0.0, 1.0).
uniform(a, b)~
Return a random floating point number {N} such that ``a <= N <= b`` for
``a <= b`` and ``b <= N <= a`` for ``b < a``.
The end-point value ``b`` may or may not be included in the range
depending on floating-point rounding in the equation ``a + (b-a) * random()``.
triangular(low, high, mode)~
Return a random floating point number {N} such that ``low <= N <= high`` and
with the specified {mode} between those bounds. The {low} and {high} bounds
default to zero and one. The {mode} argument defaults to the midpoint
between the bounds, giving a symmetric distribution.
.. versionadded:: 2.6
betavariate(alpha, beta)~
Beta distribution. Conditions on the parameters are ``alpha > 0`` and
``beta > 0``. Returned values range between 0 and 1.
expovariate(lambd)~
Exponential distribution. {lambd} is 1.0 divided by the desired
mean. It should be nonzero. (The parameter would be called
"lambda", but that is a reserved word in Python.) Returned values
range from 0 to positive infinity if {lambd} is positive, and from
negative infinity to 0 if {lambd} is negative.
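Because {lambd} is the reciprocal of the mean, a desired mean is converted
before the call. The following sketch assumes a mean of 5.0 and a hypothetical
seed; the sample mean of many draws should land near the requested value.

```python
import random

rng = random.Random(42)        # hypothetical seed
desired_mean = 5.0             # assumed value for the illustration
lambd = 1.0 / desired_mean     # the parameter is 1.0 divided by the mean

draws = [rng.expovariate(lambd) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
```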
gammavariate(alpha, beta)~
Gamma distribution. ({Not} the gamma function!) Conditions on the
parameters are ``alpha > 0`` and ``beta > 0``.
gauss(mu, sigma)~
Gaussian distribution. {mu} is the mean, and {sigma} is the standard
deviation. This is slightly faster than the normalvariate function
defined below.
lognormvariate(mu, sigma)~
Log normal distribution. If you take the natural logarithm of this
distribution, you'll get a normal distribution with mean {mu} and standard
deviation {sigma}. {mu} can have any value, and {sigma} must be greater than
zero.
normalvariate(mu, sigma)~
Normal distribution. {mu} is the mean, and {sigma} is the standard deviation.
vonmisesvariate(mu, kappa)~
{mu} is the mean angle, expressed in radians between 0 and 2*pi, and {kappa}
is the concentration parameter, which must be greater than or equal to zero. If
{kappa} is equal to zero, this distribution reduces to a uniform random angle
over the range 0 to 2*pi.
paretovariate(alpha)~
Pareto distribution. {alpha} is the shape parameter.
weibullvariate(alpha, beta)~
Weibull distribution. {alpha} is the scale parameter and {beta} is the shape
parameter.
Alternative Generators:
WichmannHill([seed])~
Class that implements the Wichmann-Hill algorithm as the core generator. Has all
of the same methods as Random plus the whseed method described
below. Because this class is implemented in pure Python, it is not threadsafe
and may require locks between calls. The period of the generator is
6,953,607,871,644 which is small enough to require care that two independent
random sequences do not overlap.
whseed([x])~
This is obsolete, supplied for bit-level compatibility with versions of Python
prior to 2.1. See seed for details. whseed does not guarantee
that distinct integer arguments yield distinct internal states, and can yield no
more than about 2**24 distinct internal states in all.
SystemRandom([seed])~
Class that uses the os.urandom function for generating random numbers
from sources provided by the operating system. Not available on all systems.
Does not rely on software state and sequences are not reproducible. Accordingly,
the seed and jumpahead methods have no effect and are ignored.
The getstate and setstate methods raise
NotImplementedError if called.
.. versionadded:: 2.4
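A short sketch of the behaviour described above, assuming a platform where
os.urandom is available. The variable names are illustrative.

```python
import random

sysrand = random.SystemRandom()
token = sysrand.getrandbits(64)   # 64 bits drawn from the operating system

sysrand.seed(1234)                # accepted, but has no effect

state_supported = True
try:
    sysrand.getstate()
except NotImplementedError:
    # There is no software state to capture.
    state_supported = False
```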
Examples of basic usage:: >
>>> random.random() # Random float x, 0.0 <= x < 1.0
0.37444887175646646
>>> random.uniform(1, 10) # Random float x, 1.0 <= x < 10.0
1.1800146073117523
>>> random.randint(1, 10) # Integer from 1 to 10, endpoints included
7
>>> random.randrange(0, 101, 2) # Even integer from 0 to 100
26
>>> random.choice('abcdefghij') # Choose a random element
'c'
>>> items = [1, 2, 3, 4, 5, 6, 7]
>>> random.shuffle(items)
>>> items
[7, 3, 2, 5, 6, 4, 1]
>>> random.sample([1, 2, 3, 4, 5], 3) # Choose 3 elements
[4, 1, 5]
<
.. seealso::
M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally
equidistributed uniform pseudorandom number generator", ACM Transactions on
Modeling and Computer Simulation Vol. 8, No. 1, January 1998, pp. 3-30.
Wichmann, B. A. & Hill, I. D., "Algorithm AS 183: An efficient and portable
pseudo-random number generator", Applied Statistics 31 (1982) 188-190.
Complementary-Multiply-with-Carry recipe
(http://code.activestate.com/recipes/576707/) for a compatible alternative
random number generator with a long period and comparatively simple update
operations.
==============================================================================
*py2stdlib-re*
re~
:synopsis: Regular expression operations.
This module provides regular expression matching operations similar to
those found in Perl. Both patterns and strings to be searched can be
Unicode strings as well as 8-bit strings.
Regular expressions use the backslash character (``'\'``) to indicate
special forms or to allow special characters to be used without invoking
their special meaning. This collides with Python's usage of the same
character for the same purpose in string literals; for example, to match
a literal backslash, one might have to write ``'\\\\'`` as the pattern
string, because the regular expression must be ``\\``, and each
backslash must be expressed as ``\\`` inside a regular Python string
literal.
The solution is to use Python's raw string notation for regular expression
patterns; backslashes are not handled in any special way in a string literal
prefixed with ``'r'``. So ``r"\n"`` is a two-character string containing
``'\'`` and ``'n'``, while ``"\n"`` is a one-character string containing a
newline. Usually patterns will be expressed in Python code using this raw
string notation.
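The difference between the two notations can be checked directly. In this
sketch the subject string is a made-up Windows-style path; both patterns are
meant to find the same single backslash.

```python
import re

newline = "\n"       # one character: a newline
two_chars = r"\n"    # two characters: a backslash and 'n'

# Both patterns match one literal backslash; the raw form is easier to read.
plain = re.compile('\\\\')
raw = re.compile(r'\\')

subject = 'C:\\temp'                     # the 7-character string C:\temp
plain_pos = plain.search(subject).start()
raw_pos = raw.search(subject).start()
```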
It is important to note that most regular expression operations are available as
module-level functions and RegexObject methods. The functions are
shortcuts that don't require you to compile a regex object first, but miss some
fine-tuning parameters.
.. seealso::
Mastering Regular Expressions
Book on regular expressions by Jeffrey Friedl, published by O'Reilly. The
second edition of the book no longer covers Python at all, but the first
edition covered writing good regular expression patterns in great detail.
Regular Expression Syntax
-------------------------
A regular expression (or RE) specifies a set of strings that matches it; the
functions in this module let you check if a particular string matches a given
regular expression (or if a given regular expression matches a particular
string, which comes down to the same thing).
Regular expressions can be concatenated to form new regular expressions; if {A}
and {B} are both regular expressions, then {AB} is also a regular expression.
In general, if a string {p} matches {A} and another string {q} matches {B}, the
string {pq} will match {AB}. This holds unless {A} or {B} contain low precedence
operations; boundary conditions between {A} and {B}; or have numbered group
references. Thus, complex expressions can easily be constructed from simpler
primitive expressions like the ones described here. For details of the theory
and implementation of regular expressions, consult the Friedl book referenced
above, or almost any textbook about compiler construction.
A brief explanation of the format of regular expressions follows. For further
information and a gentler presentation, consult the Regular Expression HOWTO.
Regular expressions can contain both special and ordinary characters. Most
ordinary characters, like ``'A'``, ``'a'``, or ``'0'``, are the simplest regular
expressions; they simply match themselves. You can concatenate ordinary
characters, so ``last`` matches the string ``'last'``. (In the rest of this
section, we'll write RE's in ``this special style``, usually without quotes, and
strings to be matched ``'in single quotes'``.)
Some characters, like ``'|'`` or ``'('``, are special. Special
characters either stand for classes of ordinary characters, or affect
how the regular expressions around them are interpreted. Regular
expression pattern strings may not contain null bytes, but can specify
the null byte using the ``\number`` notation, e.g., ``'\x00'``.
The special characters are:
``'.'``
(Dot.) In the default mode, this matches any character except a newline. If
the DOTALL flag has been specified, this matches any character
including a newline.
``'^'``
(Caret.) Matches the start of the string, and in MULTILINE mode also
matches immediately after each newline.
``'$'``
Matches the end of the string or just before the newline at the end of the
string, and in MULTILINE mode also matches before a newline. ``foo``
matches both 'foo' and 'foobar', while the regular expression ``foo$`` matches
only 'foo'. More interestingly, searching for ``foo.$`` in ``'foo1\nfoo2\n'``
matches 'foo2' normally, but 'foo1' in MULTILINE mode; searching for
a single ``$`` in ``'foo\n'`` will find two (empty) matches: one just before
the newline, and one at the end of the string.
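The ``'$'`` cases mentioned above can be reproduced with a few calls. This is
only a sketch; the variable names are illustrative.

```python
import re

text = 'foo1\nfoo2\n'

# Default mode: '$' matches only at the end, or before a trailing newline.
default_hit = re.search(r'foo.$', text).group()

# MULTILINE mode: '$' also matches before every newline.
multiline_hit = re.search(r'foo.$', text, re.M).group()

# A bare '$' matches just before the final newline and at the very end.
dollar_hits = re.findall(r'$', 'foo\n')
```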
``'*'``
Causes the resulting RE to match 0 or more repetitions of the preceding RE, as
many repetitions as are possible. ``ab*`` will match 'a', 'ab', or 'a' followed
by any number of 'b's.
``'+'``
Causes the resulting RE to match 1 or more repetitions of the preceding RE.
``ab+`` will match 'a' followed by any non-zero number of 'b's; it will not
match just 'a'.
``'?'``
Causes the resulting RE to match 0 or 1 repetitions of the preceding RE.
``ab?`` will match either 'a' or 'ab'.
``*?``, ``+?``, ``??``
The ``'*'``, ``'+'``, and ``'?'`` qualifiers are all greedy; they match
as much text as possible. Sometimes this behaviour isn't desired; if the RE
``<.*>`` is matched against ``'<H1>title</H1>'``, it will match the entire
string, and not just ``'<H1>'``. Adding ``'?'`` after the qualifier makes it
perform the match in non-greedy or minimal fashion; as {few}
characters as possible will be matched. Using ``.*?`` in the previous
expression will match only ``'<H1>'``.
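A minimal sketch of the greedy and non-greedy forms applied to the string used
above:

```python
import re

html = '<H1>title</H1>'

greedy = re.match(r'<.*>', html).group()   # as much text as possible
lazy = re.match(r'<.*?>', html).group()    # as little text as possible
```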
``{m}``
Specifies that exactly {m} copies of the previous RE should be matched; fewer
matches cause the entire RE not to match. For example, ``a{6}`` will match
exactly six ``'a'`` characters, but not five.
``{m,n}``
Causes the resulting RE to match from {m} to {n} repetitions of the preceding
RE, attempting to match as many repetitions as possible. For example,
``a{3,5}`` will match from 3 to 5 ``'a'`` characters. Omitting {m} specifies a
lower bound of zero, and omitting {n} specifies an infinite upper bound. As an
example, ``a{4,}b`` will match ``aaaab`` or a thousand ``'a'`` characters
followed by a ``b``, but not ``aaab``. The comma may not be omitted or the
modifier would be confused with the previously described form.
``{m,n}?``
Causes the resulting RE to match from {m} to {n} repetitions of the preceding
RE, attempting to match as {few} repetitions as possible. This is the
non-greedy version of the previous qualifier. For example, on the
6-character string ``'aaaaaa'``, ``a{3,5}`` will match 5 ``'a'`` characters,
while ``a{3,5}?`` will only match 3 characters.
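The bounded qualifiers can be compared the same way. The sketch below reuses
the strings from the descriptions above; the names are illustrative.

```python
import re

six = 'aaaaaa'
greedy = re.match(r'a{3,5}', six).group()    # takes five of the six
lazy = re.match(r'a{3,5}?', six).group()     # stops after three

# With the upper bound omitted, four or more 'a' characters are required.
open_ended = re.compile(r'a{4,}b')
four = open_ended.match('aaaab')
many = open_ended.match('a' * 1000 + 'b')
three = open_ended.match('aaab')             # too few, so no match
```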
``'\'``
Either escapes special characters (permitting you to match characters like
``'*'``, ``'?'``, and so forth), or signals a special sequence; special
sequences are discussed below.
If you're not using a raw string to express the pattern, remember that Python
also uses the backslash as an escape sequence in string literals; if the escape
sequence isn't recognized by Python's parser, the backslash and subsequent
character are included in the resulting string. However, if Python would
recognize the resulting sequence, the backslash should be repeated twice. This
is complicated and hard to understand, so it's highly recommended that you use
raw strings for all but the simplest expressions.
``[]``
Used to indicate a set of characters. Characters can be listed individually, or
a range of characters can be indicated by giving two characters and separating
them by a ``'-'``. Special characters are not active inside sets. For example,
``[akm$]`` will match any of the characters ``'a'``, ``'k'``,
``'m'``, or ``'$'``; ``[a-z]`` will match any lowercase letter, and
``[a-zA-Z0-9]`` matches any letter or digit. Character classes such
as ``\w`` or ``\S`` (defined below) are also acceptable inside a
range, although the characters they match depend on whether LOCALE
or UNICODE mode is in force. If you want to include a
``']'`` or a ``'-'`` inside a set, precede it with a backslash, or
place it as the first character. The pattern ``[]]`` will match
``']'``, for example.
You can match the characters not within a range by complementing the set.
This is indicated by including a ``'^'`` as the first character of the set;
``'^'`` elsewhere will simply match the ``'^'`` character. For example,
``[^5]`` will match any character except ``'5'``, and ``[^^]`` will match any
character except ``'^'``.
Note that inside ``[]`` the special forms and special characters lose
their meanings and only the syntaxes described here are valid. For
example, ``+``, ``*``, ``(``, ``)``, and so on are treated as
literals inside ``[]``, and backreferences cannot be used inside
``[]``.
``'|'``
``A|B``, where A and B can be arbitrary REs, creates a regular expression that
will match either A or B. An arbitrary number of REs can be separated by the
``'|'`` in this way. This can be used inside groups (see below) as well. As
the target string is scanned, REs separated by ``'|'`` are tried from left to
right. When one pattern completely matches, that branch is accepted. This means
that once ``A`` matches, ``B`` will not be tested further, even if it would
produce a longer overall match. In other words, the ``'|'`` operator is never
greedy. To match a literal ``'|'``, use ``\|``, or enclose it inside a
character class, as in ``[|]``.
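Since alternatives are tried from left to right, their order can change the
result. A small sketch:

```python
import re

# 'a' is tried first and succeeds, so the longer 'ab' is never tested.
short_first = re.match(r'a|ab', 'ab').group()

# Listing the longer alternative first lets it win.
long_first = re.match(r'ab|a', 'ab').group()

# A literal '|' can be escaped or placed in a character class.
escaped = re.search(r'\|', 'x|y').start()
in_class = re.search(r'[|]', 'x|y').start()
```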
``(...)``
Matches whatever regular expression is inside the parentheses, and indicates the
start and end of a group; the contents of a group can be retrieved after a match
has been performed, and can be matched later in the string with the ``\number``
special sequence, described below. To match the literals ``'('`` or ``')'``,
use ``\(`` or ``\)``, or enclose them inside a character class: ``[(] [)]``.
``(?...)``
This is an extension notation (a ``'?'`` following a ``'('`` is not meaningful
otherwise). The first character after the ``'?'`` determines what the meaning
and further syntax of the construct is. Extensions usually do not create a new
group; ``(?P<name>...)`` is the only exception to this rule. Following are the
currently supported extensions.
``(?iLmsux)``
(One or more letters from the set ``'i'``, ``'L'``, ``'m'``, ``'s'``,
``'u'``, ``'x'``.) The group matches the empty string; the letters
set the corresponding flags: re.I (ignore case),
re.L (locale dependent), re.M (multi-line),
re.S (dot matches all), re.U (Unicode dependent),
and re.X (verbose), for the entire regular expression. (The
flags are described under Module Contents, below.) This
is useful if you wish to include the flags as part of the regular
expression, instead of passing a {flag} argument to the
re.compile function.
Note that the ``(?x)`` flag changes how the expression is parsed. It should be
used first in the expression string, or after one or more whitespace characters.
If there are non-whitespace characters before the flag, the results are
undefined.
``(?:...)``
A non-grouping version of regular parentheses. Matches whatever regular
expression is inside the parentheses, but the substring matched by the group
{cannot} be retrieved after performing a match or referenced later in the
pattern.
``(?P<name>...)``
Similar to regular parentheses, but the substring matched by the group is
accessible within the rest of the regular expression via the symbolic group
name {name}. Group names must be valid Python identifiers, and each group
name must be defined only once within a regular expression. A symbolic group
is also a numbered group, just as if the group were not named. So the group
named ``id`` in the example below can also be referenced as the numbered group
``1``.
For example, if the pattern is ``(?P<id>[a-zA-Z_]\w*)``, the group can be
referenced by its name in arguments to methods of match objects, such as
``m.group('id')`` or ``m.end('id')``, and also by name in the regular
expression itself (using ``(?P=id)``) and replacement text given to
``.sub()`` (using ``\g<id>``).
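The three ways of referring to a named group might be exercised as in this
sketch. The subject strings are made up for the illustration.

```python
import re

ident = r'(?P<id>[a-zA-Z_]\w*)'

# By name (or by number) on the match object.
m = re.match(ident, 'spam_1 = 42')
by_name = m.group('id')
by_number = m.group(1)
end_pos = m.end('id')

# By name inside the pattern itself: find a repeated word.
doubled = re.search(ident + r' (?P=id)', 'say hello hello now').group()

# By name in replacement text.
wrapped = re.sub(ident, r'<\g<id>>', 'a b')
```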
``(?P=name)``
Matches whatever text was matched by the earlier group named {name}.
``(?#...)``
A comment; the contents of the parentheses are simply ignored.
``(?=...)``
Matches if ``...`` matches next, but doesn't consume any of the string. This is
called a lookahead assertion. For example, ``Isaac (?=Asimov)`` will match
``'Isaac '`` only if it's followed by ``'Asimov'``.
``(?!...)``
Matches if ``...`` doesn't match next. This is a negative lookahead assertion.
For example, ``Isaac (?!Asimov)`` will match ``'Isaac '`` only if it's {not}
followed by ``'Asimov'``.
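Both lookahead forms can be tried on the example names. In this sketch the
second subject string, 'Isaac Newton', is an assumption added for contrast.

```python
import re

# Positive lookahead: the surname is checked but not consumed.
pos = re.search(r'Isaac (?=Asimov)', 'Isaac Asimov')
pos_text = pos.group()
pos_miss = re.search(r'Isaac (?=Asimov)', 'Isaac Newton')

# Negative lookahead: succeeds only when 'Asimov' does not follow.
neg_text = re.search(r'Isaac (?!Asimov)', 'Isaac Newton').group()
neg_miss = re.search(r'Isaac (?!Asimov)', 'Isaac Asimov')
```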
``(?<=...)``
Matches if the current position in the string is preceded by a match for ``...``
that ends at the current position. This is called a positive lookbehind
assertion. ``(?<=abc)def`` will find a match in ``abcdef``, since the
lookbehind will back up 3 characters and check if the contained pattern matches.
The contained pattern must only match strings of some fixed length, meaning that
``abc`` or ``a|b`` are allowed, but ``a*`` and ``a{3,4}`` are not. Note that
patterns which start with positive lookbehind assertions will never match at the
beginning of the string being searched; you will most likely want to use the
search function rather than the match function:
>>> import re
>>> m = re.search('(?<=abc)def', 'abcdef')
>>> m.group(0)
'def'
This example looks for a word following a hyphen:
>>> m = re.search('(?<=-)\w+', 'spam-egg')
>>> m.group(0)
'egg'
``(?<!...)``
Matches if the current position in the string is not preceded by a match for
``...``. This is called a negative lookbehind assertion. Similar to
positive lookbehind assertions, the contained pattern must only match strings of
some fixed length. Patterns which start with negative lookbehind assertions may
match at the beginning of the string being searched.
``(?(id/name)yes-pattern|no-pattern)``
Will try to match with ``yes-pattern`` if the group with given {id} or {name}
exists, and with ``no-pattern`` if it doesn't. ``no-pattern`` is optional and
can be omitted. For example, ``(<)?(\w+@\w+(?:\.\w+)+)(?(1)>)`` is a poor email
matching pattern, which will match with ``'<user@host.com>'`` as well as
``'user@host.com'``, but not with ``'<user@host.com'``.
.. versionadded:: 2.4
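The conditional pattern above can be checked with the match method, which
anchors the attempt at the start of the string. This is a sketch; note that
search would instead find the address inside ``'<user@host.com'`` by starting
after the bracket.

```python
import re

email = re.compile(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>)')

bracketed = email.match('<user@host.com>').group()
bare = email.match('user@host.com').group()

# An opening bracket without its closing partner is rejected.
unbalanced = email.match('<user@host.com')
```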
The special sequences consist of ``'\'`` and a character from the list below.
If the ordinary character is not on the list, then the resulting RE will match
the second character. For example, ``\$`` matches the character ``'$'``.
``\number``
Matches the contents of the group of the same number. Groups are numbered
starting from 1. For example, ``(.+) \1`` matches ``'the the'`` or ``'55 55'``,
but not ``'the end'`` (note the space after the group). This special sequence
can only be used to match one of the first 99 groups. If the first digit of
{number} is 0, or {number} is 3 octal digits long, it will not be interpreted as
a group match, but as the character with octal value {number}. Inside the
``'['`` and ``']'`` of a character class, all numeric escapes are treated as
characters.
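A brief sketch of a numbered backreference, using the strings from the
description above. The match method is used so that every attempt starts at
the beginning of the string.

```python
import re

repeat = re.compile(r'(.+) \1')

words = repeat.match('the the').group(1)
digits = repeat.match('55 55').group(1)
different = repeat.match('the end')   # second word differs, so no match
```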
``\A``
Matches only at the start of the string.
``\b``
Matches the empty string, but only at the beginning or end of a word. A word is
defined as a sequence of alphanumeric or underscore characters, so the end of a
word is indicated by whitespace or a non-alphanumeric, non-underscore character.
Note that ``\b`` is defined as the boundary between ``\w`` and ``\W``, so the
precise set of characters deemed to be alphanumeric depends on the values of the
``UNICODE`` and ``LOCALE`` flags. Inside a character range, ``\b`` represents
the backspace character, for compatibility with Python's string literals.
``\B``
Matches the empty string, but only when it is {not} at the beginning or end of a
word. This is just the opposite of ``\b``, so is also subject to the settings
of ``LOCALE`` and ``UNICODE``.
``\d``
When the UNICODE flag is not specified, matches any decimal digit; this
is equivalent to the set ``[0-9]``. With UNICODE, it will match
whatever is classified as a decimal digit in the Unicode character properties
database.
``\D``
When the UNICODE flag is not specified, matches any non-digit
character; this is equivalent to the set ``[^0-9]``. With UNICODE, it
will match anything other than characters marked as digits in the Unicode
character properties database.
``\s``
When the LOCALE and UNICODE flags are not specified, matches
any whitespace character; this is equivalent to the set ``[ \t\n\r\f\v]``. With
LOCALE, it will match this set plus whatever characters are defined as
space for the current locale. If UNICODE is set, this will match the
characters ``[ \t\n\r\f\v]`` plus whatever is classified as space in the Unicode
character properties database.
``\S``
When the LOCALE and UNICODE flags are not specified, matches
any non-whitespace character; this is equivalent to the set ``[^ \t\n\r\f\v]``.
With LOCALE, it will match any character not in this set, and not
defined as space in the current locale. If UNICODE is set, this will
match anything other than ``[ \t\n\r\f\v]`` and characters marked as space in
the Unicode character properties database.
``\w``
When the LOCALE and UNICODE flags are not specified, matches
any alphanumeric character and the underscore; this is equivalent to the set
``[a-zA-Z0-9_]``. With LOCALE, it will match the set ``[0-9_]`` plus
whatever characters are defined as alphanumeric for the current locale. If
UNICODE is set, this will match the characters ``[0-9_]`` plus whatever
is classified as alphanumeric in the Unicode character properties database.
``\W``
When the LOCALE and UNICODE flags are not specified, matches
any non-alphanumeric character; this is equivalent to the set ``[^a-zA-Z0-9_]``.
With LOCALE, it will match any character not in the set ``[0-9_]``, and
not defined as alphanumeric for the current locale. If UNICODE is set,
this will match anything other than ``[0-9_]`` and characters marked as
alphanumeric in the Unicode character properties database.
``\Z``
Matches only at the end of the string.
Most of the standard escapes supported by Python string literals are also
accepted by the regular expression parser:: >
\a \b \f \n
\r \t \v \x
\\
<
Octal escapes are included in a limited form: If the first digit is a 0, or if
there are three octal digits, it is considered an octal escape. Otherwise, it is
a group reference. As for string literals, octal escapes are always at most
three digits in length.
Matching vs Searching
---------------------
Python offers two different primitive operations based on regular expressions:
{match} checks for a match only at the beginning of the string, while
{search} checks for a match anywhere in the string (this is what Perl does
by default).
Note that match may differ from search even when using a regular expression
beginning with ``'^'``: ``'^'`` matches only at the start of the string, or in
MULTILINE mode also immediately following a newline. The "match"
operation succeeds only if the pattern matches at the start of the string
regardless of mode, or at the starting position given by the optional {pos}
argument regardless of whether a newline precedes it.
>>> re.match("c", "abcdef") # No match
>>> re.search("c", "abcdef") # Match
<_sre.SRE_Match object at ...>
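The interaction with ``'^'`` and the {pos} argument might be illustrated as in
the following sketch. The subject strings are made up.

```python
import re

text = 'a\nb'

# Even in MULTILINE mode, match only tries the start of the string.
at_start = re.match(r'^b', text, re.MULTILINE)

# search may succeed after the newline.
after_newline = re.search(r'^b', text, re.MULTILINE)

# A compiled pattern's match method accepts a starting position.
from_pos = re.compile(r'b').match('ab', 1)
```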
Module Contents
---------------
The module defines several functions, constants, and an exception. Some of the
functions are simplified versions of the full featured methods for compiled
regular expressions. Most non-trivial applications always use the compiled
form.
compile(pattern[, flags])~
Compile a regular expression pattern into a regular expression object, which
can be used for matching using its match and search methods,
described below.
The expression's behaviour can be modified by specifying a {flags} value.
Values can be any of the following variables, combined using bitwise OR (the
``|`` operator).
The sequence :: >
prog = re.compile(pattern)
result = prog.match(string)
<
is equivalent to ::
result = re.match(pattern, string)
but using re.compile and saving the resulting regular expression
object for reuse is more efficient when the expression will be used several
times in a single program.
.. note:: >
The compiled versions of the most recent patterns passed to
re.match, re.search or re.compile are cached, so
programs that use only a few regular expressions at a time needn't worry
about compiling regular expressions.
<
I~
IGNORECASE
Perform case-insensitive matching; expressions like ``[A-Z]`` will match
lowercase letters, too. This is not affected by the current locale.
L~
LOCALE
Make ``\w``, ``\W``, ``\b``, ``\B``, ``\s`` and ``\S`` dependent on the
current locale.
M~
MULTILINE
When specified, the pattern character ``'^'`` matches at the beginning of the
string and at the beginning of each line (immediately following each newline);
and the pattern character ``'$'`` matches at the end of the string and at the
end of each line (immediately preceding each newline). By default, ``'^'``
matches only at the beginning of the string, and ``'$'`` only at the end of the
string and immediately before the newline (if any) at the end of the string.
S~
DOTALL
Make the ``'.'`` special character match any character at all, including a
newline; without this flag, ``'.'`` will match anything {except} a newline.
U~
UNICODE
Make ``\w``, ``\W``, ``\b``, ``\B``, ``\d``, ``\D``, ``\s`` and ``\S`` dependent
on the Unicode character properties database.
.. versionadded:: 2.0
X~
VERBOSE
This flag allows you to write regular expressions that look nicer. Whitespace
within the pattern is ignored, except when in a character class or preceded by
an unescaped backslash, and, when a line contains a ``'#'`` neither in a
character class nor preceded by an unescaped backslash, all characters from the
leftmost such ``'#'`` through the end of the line are ignored.
That means that the two following regular expression objects that match a
decimal number are functionally equal:: >
a = re.compile(r"""\d + # the integral part
\. # the decimal point
\d * # some fractional digits""", re.X)
b = re.compile(r"\d+\.\d*")
<
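Flags are combined with the ``|`` operator when passed to compile. The sketch
below assumes a made-up two-line report and shows that the pattern only
matches once both flags are in force.

```python
import re

report = 'items: 3\nTOTAL: 42\n'   # hypothetical input

# IGNORECASE lets 'total' match 'TOTAL'; MULTILINE lets '^' and '$'
# anchor at the second line.
flagged = re.compile(r'^total: (\d+)$', re.IGNORECASE | re.MULTILINE)
total = flagged.search(report).group(1)

# Without the flags the same pattern finds nothing.
unflagged = re.compile(r'^total: (\d+)$').search(report)
```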
search(pattern, string[, flags])~
Scan through {string} looking for a location where the regular expression
{pattern} produces a match, and return a corresponding MatchObject
instance. Return ``None`` if no position in the string matches the pattern; note
that this is different from finding a zero-length match at some point in the
string.
match(pattern, string[, flags])~
If zero or more characters at the beginning of {string} match the regular
expression {pattern}, return a corresponding MatchObject instance.
Return ``None`` if the string does not match the pattern; note that this is
different from a zero-length match.
.. note:: >
If you want to locate a match anywhere in {string}, use search
instead.
<
split(pattern, string[, maxsplit=0, flags=0])~
Split {string} by the occurrences of {pattern}. If capturing parentheses are
used in {pattern}, then the text of all groups in the pattern are also returned
as part of the resulting list. If {maxsplit} is nonzero, at most {maxsplit}
splits occur, and the remainder of the string is returned as the final element
of the list. (Incompatibility note: in the original Python 1.5 release,
{maxsplit} was ignored. This has been fixed in later releases.)
>>> re.split('\W+', 'Words, words, words.')
['Words', 'words', 'words', '']
>>> re.split('(\W+)', 'Words, words, words.')
['Words', ', ', 'words', ', ', 'words', '.', '']
>>> re.split('\W+', 'Words, words, words.', 1)
['Words', 'words, words.']
>>> re.split('[a-f]+', '0a3B9', flags=re.IGNORECASE)
['0', '3', '9']
If there are capturing groups in the separator and it matches at the start of
the string, the result will start with an empty string. The same holds for
the end of the string:
>>> re.split('(\W+)', '...words, words...')
['', '...', 'words', ', ', 'words', '...', '']
That way, separator components are always found at the same relative
indices within the result list (e.g., if there's one capturing group
in the separator, the 0th, the 2nd and so forth).
Note that {split} will never split a string on an empty pattern match.
For example:
>>> re.split('x*', 'foo')
['foo']
>>> re.split("(?m)^$", "foo\n\nbar\n")
['foo\n\nbar\n']
.. versionchanged:: 2.7,3.1
Added the optional flags argument.
findall(pattern, string[, flags])~
Return all non-overlapping matches of {pattern} in {string}, as a list of
strings. The {string} is scanned left-to-right, and matches are returned in
the order found. If one or more groups are present in the pattern, return a
list of groups; this will be a list of tuples if the pattern has more than
one group. Empty matches are included in the result unless they touch the
beginning of another match.
.. versionadded:: 1.5.2
.. versionchanged:: 2.4
Added the optional flags argument.
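The shape of the result depends on how many groups the pattern contains. A
sketch with a made-up subject string:

```python
import re

text = 'a1 b22 c333'

no_group = re.findall(r'\d+', text)           # whole matches
one_group = re.findall(r'(\d)\d*', text)      # the single group's text
two_groups = re.findall(r'(\w)(\d+)', text)   # one tuple per match
```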
finditer(pattern, string[, flags])~
Return an iterator yielding MatchObject instances over all
non-overlapping matches for the RE {pattern} in {string}. The {string} is
scanned left-to-right, and matches are returned in the order found. Empty
matches are included in the result unless they touch the beginning of another
match.
.. versionadded:: 2.2
.. versionchanged:: 2.4
Added the optional flags argument.
sub(pattern, repl, string[, count, flags])~
Return the string obtained by replacing the leftmost non-overlapping occurrences
of {pattern} in {string} by the replacement {repl}. If the pattern isn't found,
{string} is returned unchanged. {repl} can be a string or a function; if it is
a string, any backslash escapes in it are processed. That is, ``\n`` is
converted to a single newline character, ``\r`` is converted to a carriage return, and
so forth. Unknown escapes such as ``\j`` are left alone. Backreferences, such
as ``\6``, are replaced with the substring matched by group 6 in the pattern.
For example:
>>> re.sub(r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):',
... r'static PyObject*\npy_\1(void)\n{',
... 'def myfunc():')
'static PyObject*\npy_myfunc(void)\n{'
If {repl} is a function, it is called for every non-overlapping occurrence of
{pattern}. The function takes a single match object argument, and returns the
replacement string. For example:
>>> def dashrepl(matchobj):
...     if matchobj.group(0) == '-': return ' '
...     else: return '-'
>>> re.sub('-{1,2}', dashrepl, 'pro----gram-files')
'pro--gram files'
>>> re.sub(r'\sAND\s', ' & ', 'Baked Beans And Spam', flags=re.IGNORECASE)
'Baked Beans & Spam'
The pattern may be a string or an RE object.
The optional argument {count} is the maximum number of pattern occurrences to be
replaced; {count} must be a non-negative integer. If omitted or zero, all
occurrences will be replaced. Empty matches for the pattern are replaced only
when not adjacent to a previous match, so ``sub('x*', '-', 'abc')`` returns
``'-a-b-c-'``.
In addition to character escapes and backreferences as described above,
``\g<name>`` will use the substring matched by the group named ``name``, as
defined by the ``(?P<name>...)`` syntax. ``\g<number>`` uses the corresponding
group number; ``\g<2>`` is therefore equivalent to ``\2``, but isn't ambiguous
in a replacement such as ``\g<2>0``. ``\20`` would be interpreted as a
reference to group 20, not a reference to group 2 followed by the literal
character ``'0'``. The backreference ``\g<0>`` substitutes in the entire
substring matched by the RE.
.. versionchanged:: 2.7,3.1
Added the optional flags argument.
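The ``\g<...>`` forms can be tried out as follows. This is only a sketch; the
date string and the group names are invented for the illustration:

```python
import re

date = "2010-09-01"
pattern = r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"

# Named backreferences in the replacement string.
swapped = re.sub(pattern, r"\g<day>/\g<month>/\g<year>", date)   # '01/09/2010'

# \g<1>0 means group 1 followed by a literal '0'; r"\10" would be
# read as a reference to group 10.
padded = re.sub(r"(\d)", r"\g<1>0", "a1b2")     # 'a10b20'

# \g<0> stands for the whole match.
bracketed = re.sub(r"\d+", r"<\g<0>>", "a1b22")  # 'a<1>b<22>'
```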
subn(pattern, repl, string[, count, flags])~
Perform the same operation as sub, but return a tuple ``(new_string,
number_of_subs_made)``.
.. versionchanged:: 2.7,3.1
Added the optional flags argument.
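A short sketch of the returned tuple, using an arbitrary sample string:

```python
import re

text = "too   many    spaces"

# Every run of whitespace is collapsed; two runs are replaced.
collapsed, replaced = re.subn(r"\s+", " ", text)

# With count=1 only the first run is touched.
first_only, replaced_once = re.subn(r"\s+", " ", text, count=1)
```

The count is handy when the caller needs to know whether anything changed
without comparing the two strings.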
escape(string)~
Return {string} with all non-alphanumerics backslashed; this is useful if you
want to match an arbitrary literal string that may have regular expression
metacharacters in it.
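The following sketch shows why escaping matters; the strings are made up for
the example:

```python
import re

literal = "price: $5.00 (approx.)"
haystack = "The price: $5.00 (approx.) today"

# Used as-is, '$', '.' and the parentheses are treated as metacharacters,
# so the pattern does not find the literal text.
unescaped = re.search(literal, haystack)

# After escape(), every metacharacter is matched literally.
escaped = re.search(re.escape(literal), haystack)
found = escaped.group(0)
```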
error~
Exception raised when a string passed to one of the functions here is not a
valid regular expression (for example, it might contain unmatched parentheses)
or when some other error occurs during compilation or matching. It is never an
error if a string contains no match for a pattern.
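The distinction between an invalid pattern and a pattern that simply does not
match can be sketched like this. The helper name ``is_valid_pattern`` is
hypothetical and not part of the module:

```python
import re

def is_valid_pattern(source):
    """Return True if source compiles as a regular expression."""
    try:
        re.compile(source)
    except re.error:
        return False
    return True

ok = is_valid_pattern(r"(\d+)")     # well-formed pattern
bad = is_valid_pattern(r"(\d+")     # unmatched parenthesis

# A valid pattern that finds nothing returns None; no exception is raised.
no_match = re.search(r"\d+", "no digits here")
```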
Regular Expression Objects
--------------------------
RegexObject~
The RegexObject class supports the following methods and attributes:
RegexObject.search(string[, pos[, endpos]])~
Scan through {string} looking for a location where this regular expression
produces a match, and return a corresponding MatchObject instance.
Return ``None`` if no position in the string matches the pattern; note that this
is different from finding a zero-length match at some point in the string.
The optional second parameter {pos} gives an index in the string where the
search is to start; it defaults to ``0``. This is not completely equivalent to
slicing the string; the ``'^'`` pattern character matches at the real beginning
of the string and at positions just after a newline, but not necessarily at the
index where the search is to start.
The optional parameter {endpos} limits how far the string will be searched; it
will be as if the string is {endpos} characters long, so only the characters
from {pos} to ``endpos - 1`` will be searched for a match. If {endpos} is less
than {pos}, no match will be found; otherwise, if {rx} is a compiled regular
expression object, ``rx.search(string, 0, 50)`` is equivalent to
``rx.search(string[:50], 0)``.
>>> pattern = re.compile("d")
>>> pattern.search("dog") # Match at index 0
<_sre.SRE_Match object at ...>
>>> pattern.search("dog", 1) # No match; search doesn't include the "d"
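The difference between passing {pos} and slicing the string, and the effect of
{endpos}, can be sketched as follows (sample strings are arbitrary):

```python
import re

line_start = re.compile(r"^b")
text = "ab"

# Slicing creates a new string, so "b" is at its real beginning.
sliced = line_start.search(text[1:])

# With pos=1 the search starts at "b", but '^' still refers to index 0.
offset = line_start.search(text, 1)

# endpos behaves as if the string had been truncated.
digit = re.compile(r"\d")
limited = digit.search("abc1", 0, 3)    # only "abc" is examined
unlimited = digit.search("abc1")
```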
RegexObject.match(string[, pos[, endpos]])~
If zero or more characters at the {beginning} of {string} match this regular
expression, return a corresponding MatchObject instance. Return
``None`` if the string does not match the pattern; note that this is different
from a zero-length match.
The optional {pos} and {endpos} parameters have the same meaning as for the
RegexObject.search method.
.. note:: >
If you want to locate a match anywhere in {string}, use
RegexObject.search instead.
<
>>> pattern = re.compile("o")
>>> pattern.match("dog") # No match as "o" is not at the start of "dog".
>>> pattern.match("dog", 1) # Match as "o" is the 2nd character of "dog".
<_sre.SRE_Match object at ...>
RegexObject.split(string[, maxsplit=0])~
Identical to the split function, using the compiled pattern.
RegexObject.findall(string[, pos[, endpos]])~
Similar to the findall function, using the compiled pattern, but
also accepts optional {pos} and {endpos} parameters that limit the search
region like for match.
RegexObject.finditer(string[, pos[, endpos]])~
Similar to the finditer function, using the compiled pattern, but
also accepts optional {pos} and {endpos} parameters that limit the search
region like for match.
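A brief sketch of the {pos} and {endpos} arguments on a compiled pattern; note
that the reported positions still refer to the whole string:

```python
import re

digits = re.compile(r"\d")
text = "a1b2c3d4"

everything = digits.findall(text)       # all four digits
window = digits.findall(text, 2, 6)     # only indices 2 to 5 are searched

# finditer accepts the same limits; start() is relative to the full string.
starts = [m.start() for m in digits.finditer(text, 2, 6)]
```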
RegexObject.sub(repl, string[, count=0])~
Identical to the sub function, using the compiled pattern.
RegexObject.subn(repl, string[, count=0])~
Identical to the subn function, using the compiled pattern.
RegexObject.flags~
The flags argument used when the RE object was compiled, or ``0`` if no flags
were provided.
RegexObject.groups~
The number of capturing groups in the pattern.
RegexObject.groupindex~
A dictionary mapping any symbolic group names defined by ``(?P<id>)`` to group
numbers. The dictionary is empty if no symbolic groups were used in the
pattern.
RegexObject.pattern~
The pattern string from which the RE object was compiled.
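The attributes above can be inspected on any compiled pattern. In this sketch
the pattern itself is an arbitrary example; the flags value is tested with
``&`` because some Python versions add implicit flags of their own:

```python
import re

rx = re.compile(r"(?P<key>\w+)=(\d+)", re.IGNORECASE)

group_count = rx.groups                 # two capturing groups
names = dict(rx.groupindex)             # only the named group is listed
source = rx.pattern                     # the original pattern string
ignores_case = bool(rx.flags & re.IGNORECASE)
```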
Match Objects
-------------
MatchObject~
Match Objects always have a boolean value of True, so that you can test
whether e.g. match resulted in a match with a simple if statement. They
support the following methods and attributes:
MatchObject.expand(template)~
Return the string obtained by doing backslash substitution on the template
string {template}, as done by the RegexObject.sub method. Escapes
such as ``\n`` are converted to the appropriate characters, and numeric
backreferences (``\1``, ``\2``) and named backreferences (``\g<1>``,
``\g<name>``) are replaced by the contents of the corresponding group.
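A minimal sketch of expand, mixing numeric and named backreferences (the group
names are chosen for the example):

```python
import re

m = re.match(r"(?P<first>\w+) (?P<last>\w+)", "Isaac Newton")

# Named and numeric references may be combined in one template.
reordered = m.expand(r"\g<last>, \1")

# Escapes such as \n in the template are converted to real characters.
two_lines = m.expand(r"\1\n\2")
```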
MatchObject.group([group1, ...])~
Returns one or more subgroups of the match. If there is a single argument, the
result is a single string; if there are multiple arguments, the result is a
tuple with one item per argument. Without arguments, {group1} defaults to zero
(the whole match is returned). If a {groupN} argument is zero, the corresponding
return value is the entire matching string; if it is in the inclusive range
[1..99], it is the string matching the corresponding parenthesized group. If a
group number is negative or larger than the number of groups defined in the
pattern, an IndexError exception is raised. If a group is contained in a
part of the pattern that did not match, the corresponding result is ``None``.
If a group is contained in a part of the pattern that matched multiple times,
the last match is returned.
>>> m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
>>> m.group(0) # The entire match
'Isaac Newton'
>>> m.group(1) # The first parenthesized subgroup.
'Isaac'
>>> m.group(2) # The second parenthesized subgroup.
'Newton'
>>> m.group(1, 2) # Multiple arguments give us a tuple.
('Isaac', 'Newton')
If the regular expression uses the ``(?P<name>...)`` syntax, the {groupN}
arguments may also be strings identifying groups by their group name. If a
string argument is not used as a group name in the pattern, an IndexError
exception is raised.
A moderately complicated example:
>>> m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Malcolm Reynolds")
>>> m.group('first_name')
'Malcolm'
>>> m.group('last_name')
'Reynolds'
Named groups can also be referred to by their index:
>>> m.group(1)
'Malcolm'
>>> m.group(2)
'Reynolds'
If a group matches multiple times, only the last match is accessible:
>>> m = re.match(r"(..)+", "a1b2c3") # Matches 3 times.
>>> m.group(1) # Returns only the last match.
'c3'
MatchObject.groups([default])~
Return a tuple containing all the subgroups of the match, from 1 up to however
many groups are in the pattern. The {default} argument is used for groups that
did not participate in the match; it defaults to ``None``. (Incompatibility
note: in the original Python 1.5 release, if the tuple was one element long, a
string would be returned instead. In later versions (from 1.5.1 on), a
singleton tuple is returned in such cases.)
For example:
>>> m = re.match(r"(\d+)\.(\d+)", "24.1632")
>>> m.groups()
('24', '1632')
If we make the decimal place and everything after it optional, not all groups
might participate in the match. These groups will default to ``None`` unless
the {default} argument is given:
>>> m = re.match(r"(\d+)\.?(\d+)?", "24")
>>> m.groups() # Second group defaults to None.
('24', None)
>>> m.groups('0') # Now, the second group defaults to '0'.
('24', '0')
MatchObject.groupdict([default])~
Return a dictionary containing all the {named} subgroups of the match, keyed by
the subgroup name. The {default} argument is used for groups that did not
participate in the match; it defaults to ``None``. For example:
>>> m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Malcolm Reynolds")
>>> m.groupdict()
{'first_name': 'Malcolm', 'last_name': 'Reynolds'}
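The {default} argument only affects groups that did not take part in the match.
A sketch with an optional named group (the names are illustrative):

```python
import re

m = re.match(r"(?P<whole>\d+)(?:\.(?P<frac>\d+))?", "24")

plain = m.groupdict()          # the missing group maps to None
filled = m.groupdict('0')      # the missing group maps to '0'
```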
MatchObject.start([group])~
MatchObject.end([group])
Return the indices of the start and end of the substring matched by {group};
{group} defaults to zero (meaning the whole matched substring). Return ``-1`` if
{group} exists but did not contribute to the match. For a match object {m}, and
a group {g} that did contribute to the match, the substring matched by group {g}
(equivalent to ``m.group(g)``) is :: >
m.string[m.start(g):m.end(g)]
<
Note that ``m.start(group)`` will equal ``m.end(group)`` if {group} matched a
null string. For example, after ``m = re.search('b(c?)', 'cba')``,
``m.start(0)`` is 1, ``m.end(0)`` is 2, ``m.start(1)`` and ``m.end(1)`` are both
2, and ``m.start(2)`` raises an IndexError exception.
An example that will remove {remove_this} from email addresses:
>>> email = "tony@tiremove_thisger.net"
>>> m = re.search("remove_this", email)
>>> email[:m.start()] + email[m.end():]
'tony@tiger.net'
MatchObject.span([group])~
For MatchObject {m}, return the 2-tuple ``(m.start(group),
m.end(group))``. Note that if {group} did not contribute to the match, this is
``(-1, -1)``. {group} defaults to zero, the entire match.
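The ``-1`` convention for groups that did not contribute can be observed with an
alternation, where only one branch takes part in the match. A small sketch:

```python
import re

m = re.match(r"(a)|(b)", "b")

unused = m.span(1)      # group 1 exists but did not contribute
used = m.span(2)        # group 2 matched the single character
whole = m.span()        # group 0, the entire match

# For a contributing group, slicing the string reproduces group().
same_text = m.string[m.start(2):m.end(2)] == m.group(2)
```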
MatchObject.pos~
The value of {pos} which was passed to the RegexObject.search or
RegexObject.match method of the RegexObject. This is the
index into the string at which the RE engine started looking for a match.
MatchObject.endpos~
The value of {endpos} which was passed to the RegexObject.search or
RegexObject.match method of the RegexObject. This is the
index into the string beyond which the RE engine will not go.
MatchObject.lastindex~
The integer index of the last matched capturing group, or ``None`` if no group
was matched at all. For example, the expressions ``(a)b``, ``((a)(b))``, and
``((ab))`` will have ``lastindex == 1`` if applied to the string ``'ab'``, while
the expression ``(a)(b)`` will have ``lastindex == 2``, if applied to the same
string.
MatchObject.lastgroup~
The name of the last matched capturing group, or ``None`` if the group didn't
have a name, or if no group was matched at all.
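The following sketch checks the lastindex cases listed above and shows how
lastgroup relates to them; the group names ``x`` and ``y`` are invented:

```python
import re

flat = re.match(r"(a)(b)", "ab").lastindex          # 2
nested = re.match(r"((a)(b))", "ab").lastindex      # 1: outer group closes last
no_groups = re.match(r"ab", "ab").lastindex         # None

# lastgroup is the name of that same group, if it has one.
named_last = re.match(r"(a)(?P<y>b)", "ab").lastgroup       # 'y'
unnamed_last = re.match(r"(?P<x>a)(b)", "ab").lastgroup     # None
optional_skipped = re.match(r"(?P<x>a)(b)?", "a").lastgroup  # 'x'
```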
MatchObject.re~
The regular expression object whose RegexObject.match or
RegexObject.search method produced this MatchObject
instance.
MatchObject.string~
The string passed to RegexObject.match or
RegexObject.search.
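A match object keeps a record of how it was produced. A brief sketch, using an
arbitrary string and search window:

```python
import re

rx = re.compile("o")
m = rx.search("dog food", 2, 7)

where = m.start()            # first "o" at or after index 2
window = (m.pos, m.endpos)   # the limits that were passed in
same_pattern = m.re is rx    # the compiled pattern that produced the match
subject = m.string           # the full string, not the searched slice
```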
Examples
--------
Checking For a Pair
^^^^^^^^^^^^^^^^^^^
In this example, we'll use the following helper function to display match
objects a little more gracefully:
def displaymatch(match):
    if match is None:
        return None
    return '<Match: %r, groups=%r>' % (match.group(), match.groups())
Suppose you are writing a poker program where a player's hand is represented as
a 5-character string with each character representing a card, "a" for ace, "k"
for king, "q" for queen, "j" for jack, "0" for 10, and "1" through "9"
representing the card with that value.
To see if a given string is a valid hand, one could do the following:
>>> valid = re.compile(r"[0-9akqj]{5}$")
>>> displaymatch(valid.match("ak05q")) # Valid.
"<Match: 'ak05q', groups=()>"
>>> displaymatch(valid.match("ak05e")) # Invalid.
>>> displaymatch(valid.match("ak0")) # Invalid.
>>> displaymatch(valid.match("727ak")) # Valid.
"<Match: '727ak', groups=()>"
That last hand, ``"727ak"``, contained a pair, or two of the same valued cards.
To match this with a regular expression, one could use backreferences as such:
>>> pair = re.compile(r".*(.).*\1")
>>> displaymatch(pair.match("717ak")) # Pair of 7s.
"<Match: '717', groups=('7',)>"
>>> displaymatch(pair.match("718ak")) # No pairs.
>>> displaymatch(pair.match("354aa")) # Pair of aces.
"<Match: '354aa', groups=('a',)>"
To find out what card the pair consists of, one could use the
MatchObject.group method of MatchObject in the following
manner:
>>> pair.match("717ak").group(1)
'7'
# Error because re.match() returns None, which doesn't have a group() method:
>>> pair.match("718ak").group(1)
Traceback (most recent call last):
  File "<pyshell#23>", line 1, in <module>
    re.match(r".*(.).*\1", "718ak").group(1)
AttributeError: 'NoneType' object has no attribute 'group'
>>> pair.match("354aa").group(1)
'a'
Simulating scanf()
^^^^^^^^^^^^^^^^^^
Python does not currently have an equivalent to scanf. Regular
expressions are generally more powerful, though also more verbose, than
scanf format strings. The table below offers some more-or-less
equivalent mappings between scanf format tokens and regular
expressions.
+--------------------------------+---------------------------------------------+
| scanf Token | Regular Expression |
+================================+=============================================+
| ``%c`` | ``.`` |
+--------------------------------+---------------------------------------------+
| ``%5c`` | ``.{5}`` |
+--------------------------------+---------------------------------------------+
| ``%d`` | ``[-+]?\d+`` |
+--------------------------------+---------------------------------------------+
| ``%e``, ``%E``, ``%f``, ``%g`` | ``[-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?`` |
+--------------------------------+---------------------------------------------+
| ``%i`` | ``[-+]?(0[xX][\dA-Fa-f]+|0[0-7]*|\d+)`` |
+--------------------------------+---------------------------------------------+
| ``%o`` | ``0[0-7]*`` |
+--------------------------------+---------------------------------------------+
| ``%s`` | ``\S+`` |
+--------------------------------+---------------------------------------------+
| ``%u`` | ``\d+`` |
+--------------------------------+---------------------------------------------+
| ``%x``, ``%X`` | ``0[xX][\dA-Fa-f]+`` |
+--------------------------------+---------------------------------------------+
To extract the filename and numbers from a string like :: >
/usr/sbin/sendmail - 0 errors, 4 warnings
<
you would use a scanf format like :: >
%s - %d errors, %d warnings
<
The equivalent regular expression would be :: >
(\S+) - (\d+) errors, (\d+) warnings
<
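Applied from Python, the regular expression above might be used as in the
following sketch. Unlike scanf, the groups come back as strings, so the
numbers are converted explicitly; the variable names are illustrative:

```python
import re

line = "/usr/sbin/sendmail - 0 errors, 4 warnings"
m = re.match(r"(\S+) - (\d+) errors, (\d+) warnings", line)

# Groups are always strings; convert the numeric fields by hand.
filename = m.group(1)
errors = int(m.group(2))
warnings = int(m.group(3))
```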
Avoiding recursion
^^^^^^^^^^^^^^^^^^
If you create regular expressions that require the engine to perform a lot of
recursion, you may encounter a RuntimeError exception with the message
``maximum recursion limit exceeded``. For example, :: >
>>> s = 'Begin ' + 1000*'a very long string ' + 'end'
>>> re.match('Begin (\w| )*? end', s).end()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.5/re.py", line 132, in match
    return _compile(pattern, flags).match(string)
RuntimeError: maximum recursion limit exceeded
<
You can often restructure your regular expression to avoid recursion.
Starting with Python 2.3, simple uses of the ``*?`` pattern are special-cased to
avoid recursion. Thus, the above regular expression can avoid recursion by
being recast as ``Begin [a-zA-Z0-9_ ]*?end``. As a further benefit, such
regular expressions will run faster than their recursive equivalents.
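The recast pattern can be tried on the same input. This is a sketch only;
whether the original pattern actually raises an error depends on the
interpreter version, so it is not exercised here:

```python
import re

s = 'Begin ' + 1000*'a very long string ' + 'end'

# A character class with a non-greedy repeat replaces the (\w| )*? group.
m = re.match(r'Begin [a-zA-Z0-9_ ]*?end', s)
matched_all = m.end() == len(s)
```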
search() vs. match()
^^^^^^^^^^^^^^^^^^^^
In a nutshell, match only attempts to match a pattern at the beginning
of a string, whereas search will match a pattern anywhere in a string.
For example:
>>> re.match("o", "dog") # No match as "o" is not the first letter of "dog".
>>> re.search("o", "dog") # Match as search() looks everywhere in the string.
<_sre.SRE_Match object at ...>
.. note::
The following applies only to regular expression objects like those created
with ``re.compile("pattern")``, not the primitives ``re.match(pattern,
string)`` or ``re.search(pattern, string)``.
match has an optional second parameter that gives an index in the string
where the search is to start:: >
>>> pattern = re.compile("o")
>>> pattern.match("dog") # No match as "o" is not at the start of "dog."
# Equivalent to the above expression as 0 is the default starting index:
>>> pattern.match("dog", 0)
# Match as "o" is the 2nd character of "dog" (index 0 is the first):
>>> pattern.match("dog", 1)
<_sre.SRE_Match object at ...>
>>> pattern.match("dog", 2) # No match as "o" is not the 3rd character of "dog."
<
Making a Phonebook
^^^^^^^^^^^^^^^^^^
split splits a string into a list delimited by the passed pattern. The
method is invaluable for converting textual data into data structures that can be
easily read and modified by Python as demonstrated in the following example that
creates a phonebook.
First, here is the input. Normally it may come from a file; here we are using
triple-quoted string syntax:
>>> input = """Ross McFluff: 834.345.1254 155 Elm Street
...
... Ronald Heathmore: 892.345.3428 436 Finley Avenue
... Frank Burger: 925.541.7625 662 South Dogwood Way
...
...
... Heather Albrecht: 548.326.4584 919 Park Place"""
The entries are separated by one or more newlines. Now we convert the string
into a list with each nonempty line having its own entry:
>>> entries = re.split("\n+", input)
>>> entries
['Ross McFluff: 834.345.1254 155 Elm Street',
'Ronald Heathmore: 892.345.3428 436 Finley Avenue',
'Frank Burger: 925.541.7625 662 South Dogwood Way',
'Heather Albrecht: 548.326.4584 919 Park Place']
Finally, split each entry into a list with first name, last name, telephone
number, and address. We use the ``maxsplit`` parameter of split
because the address has spaces, our splitting pattern, in it:
>>> [re.split(":? ", entry, 3) for entry in entries]
[['Ross', 'McFluff', '834.345.1254', '155 Elm Street'],
['Ronald', 'Heathmore', '892.345.3428', '436 Finley Avenue'],
['Frank', 'Burger', '925.541.7625', '662 South Dogwood Way'],
['Heather', 'Albrecht', '548.326.4584', '919 Park Place']]
The ``:?`` pattern matches the colon after the last name, so that it does not
occur in the result list. With a ``maxsplit`` of ``4``, we could separate the
house number from the street name:
>>> [re.split(":? ", entry, 4) for entry in entries]
[['Ross', 'McFluff', '834.345.1254', '155', 'Elm Street'],
['Ronald', 'Heathmore', '892.345.3428', '436', 'Finley Avenue'],
['Frank', 'Burger', '925.541.7625', '662', 'South Dogwood Way'],
['Heather', 'Albrecht', '548.326.4584', '919', 'Park Place']]
Text Munging
^^^^^^^^^^^^
sub replaces every occurrence of a pattern with a string or the
result of a function. This example demonstrates using sub with
a function to "munge" text, or randomize the order of all the characters
in each word of a sentence except for the first and last characters:: >
>>> import random
>>> def repl(m):
...     inner_word = list(m.group(2))
...     random.shuffle(inner_word)
...     return m.group(1) + "".join(inner_word) + m.group(3)
>>> text = "Professor Abdolmalek, please report your absences promptly."
>>> re.sub("(\w)(\w+)(\w)", repl, text)
'Poefsrosr Aealmlobdk, pslaee reorpt your abnseces plmrptoy.'
>>> re.sub("(\w)(\w+)(\w)", repl, text)
'Pofsroser Aodlambelk, plasee reoprt yuor asnebces potlmrpy.'
<
Finding all Adverbs
^^^^^^^^^^^^^^^^^^^
findall matches {all} occurrences of a pattern, not just the first
one as search does. For example, if one was a writer and wanted to
find all of the adverbs in some text, he or she might use findall in
the following manner:
>>> text = "He was carefully disguised but captured quickly by police."
>>> re.findall(r"\w+ly", text)
['carefully', 'quickly']
Finding all Adverbs and their Positions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If one wants more information about all matches of a pattern than the matched
text, finditer is useful as it provides instances of
MatchObject instead of strings. Continuing with the previous example,
if one was a writer who wanted to find all of the adverbs {and their positions}
in some text, he or she would use finditer in the following manner:
>>> text = "He was carefully disguised but captured quickly by police."
>>> for m in re.finditer(r"\w+ly", text):
...     print '%02d-%02d: %s' % (m.start(), m.end(), m.group(0))
07-16: carefully
40-47: quickly
Raw String Notation
^^^^^^^^^^^^^^^^^^^
Raw string notation (``r"text"``) keeps regular expressions sane. Without it,
every backslash (``'\'``) in a regular expression would have to be prefixed with
another one to escape it. For example, the two following lines of code are
functionally identical:
>>> re.match(r"\W(.)\1\W", " ff ")
<_sre.SRE_Match object at ...>
>>> re.match("\\W(.)\\1\\W", " ff ")
<_sre.SRE_Match object at ...>
When one wants to match a literal backslash, it must be escaped in the regular
expression. With raw string notation, this means ``r"\\"``. Without raw string
notation, one must use ``"\\\\"``, making the following lines of code
functionally identical:
>>> re.match(r"\\", r"\\")
<_sre.SRE_Match object at ...>
>>> re.match("\\\\", r"\\")
<_sre.SRE_Match object at ...>
==============================================================================
*py2stdlib-readline*
readline~
:platform: Unix
:synopsis: GNU readline support for Python.
The readline (|py2stdlib-readline|) module defines a number of functions to facilitate
completion and reading/writing of history files from the Python interpreter.
This module can be used directly or via the rlcompleter (|py2stdlib-rlcompleter|) module. Settings
made using this module affect the behaviour of both the interpreter's
interactive prompt and the prompts offered by the raw_input and
input built-in functions.
.. note::
On Mac OS X the readline (|py2stdlib-readline|) module can be implemented using
the ``libedit`` library instead of GNU readline.
The configuration file for ``libedit`` is different from that
of GNU readline. If you programmatically load configuration strings
you can check for the text "libedit" in readline.__doc__
to differentiate between GNU readline and libedit.
The readline (|py2stdlib-readline|) module defines the following functions:
parse_and_bind(string)~
Parse and execute a single line of a readline init file.
get_line_buffer()~
Return the current contents of the line buffer.
insert_text(string)~
Insert text into the command line.
read_init_file([filename])~
Parse a readline initialization file. The default filename is the last filename
used.
read_history_file([filename])~
Load a readline history file. The default filename is ~/.history.
write_history_file([filename])~
Save a readline history file. The default filename is ~/.history.
clear_history()~
Clear the current history. (Note: this function is not available if the
installed version of GNU readline doesn't support it.)
.. versionadded:: 2.4
get_history_length()~
Return the desired length of the history file. Negative values imply unlimited
history file size.
set_history_length(length)~
Set the number of lines to save in the history file. write_history_file
uses this value to truncate the history file when saving. Negative values imply
unlimited history file size.
get_current_history_length()~
Return the number of lines currently in the history. (This is different from
get_history_length, which returns the maximum number of lines that will
be written to a history file.)
.. versionadded:: 2.3
get_history_item(index)~
Return the current contents of history item at {index}.
.. versionadded:: 2.3
remove_history_item(pos)~
Remove history item specified by its position from the history.
.. versionadded:: 2.4
replace_history_item(pos, line)~
Replace history item specified by its position with the given line.
.. versionadded:: 2.4
redisplay()~
Change what's displayed on the screen to reflect the current contents of the
line buffer.
.. versionadded:: 2.3
set_startup_hook([function])~
Set or remove the startup_hook function. If {function} is specified, it will be
used as the new startup_hook function; if omitted or ``None``, any hook function
already installed is removed. The startup_hook function is called with no
arguments just before readline prints the first prompt.
set_pre_input_hook([function])~
Set or remove the pre_input_hook function. If {function} is specified, it will
be used as the new pre_input_hook function; if omitted or ``None``, any hook
function already installed is removed. The pre_input_hook function is called
with no arguments after the first prompt has been printed and just before
readline starts reading input characters.
set_completer([function])~
Set or remove the completer function. If {function} is specified, it will be
used as the new completer function; if omitted or ``None``, any completer
function already installed is removed. The completer function is called as
``function(text, state)``, for {state} in ``0``, ``1``, ``2``, ..., until it
returns a non-string value. It should return the next possible completion
starting with {text}.
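The calling convention can be sketched with a completer over a fixed word list.
The list ``COMMANDS`` and the function name ``complete`` are hypothetical; the
registration step is guarded because readline is not available on every
platform:

```python
COMMANDS = ["start", "status", "stop"]   # hypothetical vocabulary

def complete(text, state):
    """Return the state-th command starting with text, or None when done."""
    matches = [c for c in COMMANDS if c.startswith(text)]
    if state < len(matches):
        return matches[state]
    return None    # a non-string value ends the sequence of calls

try:
    import readline
except ImportError:
    readline = None

if readline is not None:
    readline.set_completer(complete)
    readline.parse_and_bind("tab: complete")
```

Keeping the completer a plain function of {text} and {state} makes it easy to
test without an interactive session.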
get_completer()~
Get the completer function, or ``None`` if no completer function has been set.
.. versionadded:: 2.3
get_completion_type()~
Get the type of completion being attempted.
.. versionadded:: 2.6
get_begidx()~
Get the beginning index of the readline tab-completion scope.
get_endidx()~
Get the ending index of the readline tab-completion scope.
set_completer_delims(string)~
Set the readline word delimiters for tab-completion.
get_completer_delims()~
Get the readline word delimiters for tab-completion.
set_completion_display_matches_hook([function])~
Set or remove the completion display function. If {function} is
specified, it will be used as the new completion display function;
if omitted or ``None``, any completion display function already
installed is removed. The completion display function is called as
``function(substitution, [matches], longest_match_length)`` once
each time matches need to be displayed.
.. versionadded:: 2.6
add_history(line)~
Append a line to the history buffer, as if it was the last line typed.
.. seealso::
Module rlcompleter (|py2stdlib-rlcompleter|)
Completion of Python identifiers at the interactive prompt.
Example
-------
The following example demonstrates how to use the readline (|py2stdlib-readline|) module's
history reading and writing functions to automatically load and save a history
file named .pyhist from the user's home directory. The code below would
normally be executed automatically during interactive sessions from the user's
PYTHONSTARTUP file. :: >
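import atexit
import os
import readline
histfile = os.path.join(os.environ["HOME"], ".pyhist")
try:
    # Load the saved history; the file does not exist on the first run.
    readline.read_history_file(histfile)
except IOError:
    pass
# Save the history again when the interpreter exits.
atexit.register(readline.write_history_file, histfile)
del os, histfile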
<
The following example extends the code.InteractiveConsole class to
support history save/restore. :: >
import code
import readline
import atexit
import os
class HistoryConsole(code.InteractiveConsole):
    def __init__(self, locals=None, filename="<console>",
                 histfile=os.path.expanduser("~/.console-history")):
        code.InteractiveConsole.__init__(self, locals, filename)
        self.init_history(histfile)
    def init_history(self, histfile):
        readline.parse_and_bind("tab: complete")
        if hasattr(readline, "read_history_file"):
            try:
                readline.read_history_file(histfile)
            except IOError:
                pass
            atexit.register(self.save_history, histfile)
    def save_history(self, histfile):
        readline.write_history_file(histfile)
==============================================================================
*py2stdlib-repr*
repr~
:synopsis: Alternate repr() implementation with size limits.
.. note::
The repr (|py2stdlib-repr|) module has been renamed to reprlib in Python 3.0. The
2to3 tool will automatically adapt imports when converting your
sources to 3.0.
The repr (|py2stdlib-repr|) module provides a means for producing object representations
with limits on the size of the resulting strings. This is used in the Python
debugger and may be useful in other contexts as well.
This module provides a class, an instance, and a function:
Repr()~
Class which provides formatting services useful in implementing functions
similar to the built-in repr (|py2stdlib-repr|); size limits for different object types
are added to avoid the generation of representations which are excessively long.
aRepr~
This is an instance of Repr which is used to provide the repr
function described below. Changing the attributes of this object will affect
the size limits used by that function and the Python debugger.
repr(obj)~
This is the Repr.repr method of ``aRepr``. It returns a string
similar to that returned by the built-in function of the same name, but with
limits on most sizes.
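A minimal sketch of the module-level function. The import is written so that it
also works under the Python 3 name of the module; that fallback is an addition
for the example, not something this module requires:

```python
try:
    import reprlib              # Python 3 name
except ImportError:
    import repr as reprlib      # Python 2 name

numbers = list(range(100))

full = repr(numbers)            # built-in: every element is shown
short = reprlib.repr(numbers)   # size-limited: six elements, then '...'
```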
Repr Objects
------------
Repr instances provide several members which can be used to provide
size limits for the representations of different object types, and methods
which format specific object types.
Repr.maxlevel~
Depth limit on the creation of recursive representations. The default is ``6``.
Repr.maxdict~
Repr.maxlist
Repr.maxtuple
Repr.maxset
Repr.maxfrozenset
Repr.maxdeque
Repr.maxarray
Limits on the number of entries represented for the named object type. The
default is ``4`` for maxdict, ``5`` for maxarray, and ``6`` for
the others.
.. versionadded:: 2.4
maxset, maxfrozenset, and set.
Repr.maxlong~
Maximum number of characters in the representation for a long integer. Digits
are dropped from the middle. The default is ``40``.
Repr.maxstring~
Limit on the number of characters in the representation of the string. Note
that the "normal" representation of the string is used as the character source:
if escape sequences are needed in the representation, these may be mangled when
the representation is shortened. The default is ``30``.
Repr.maxother~
This limit is used to control the size of object types for which no specific
formatting method is available on the Repr object. It is applied in a
similar manner as maxstring. The default is ``20``.
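The limits can be adjusted on a private instance without touching the shared
``aRepr`` object. The following is a sketch; the chosen limits are arbitrary and
the import fallback is only there so the example runs under either module name:

```python
try:
    import reprlib              # Python 3 name
except ImportError:
    import repr as reprlib      # Python 2 name

limited = reprlib.Repr()
limited.maxlist = 3
limited.maxstring = 12

short_list = limited.repr([1, 2, 3, 4, 5])
short_text = limited.repr("abcdefghijklmnopqrstuvwxyz")
```

The string is shortened from the middle, so both its beginning and its end
remain visible.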
Repr.repr(obj)~
The equivalent to the built-in repr (|py2stdlib-repr|) that uses the formatting imposed by
the instance.
Repr.repr1(obj, level)~
Recursive implementation used by Repr.repr. This uses the type of {obj} to
determine which formatting method to call, passing it {obj} and {level}. The
type-specific methods should call repr1 to perform recursive formatting,
with ``level - 1`` for the value of {level} in the recursive call.
Repr.repr_TYPE(obj, level)~
Formatting methods for specific types are implemented as methods with a name
based on the type name. In the method name, TYPE is replaced by
``'_'.join(type(obj).__name__.split())``. Dispatch to these
methods is handled by repr1. Type-specific methods which need to
recursively format a value should call ``self.repr1(subobj, level - 1)``.
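As a rough illustration of the size limits described above, the following
sketch creates a separate Repr instance and tightens two of its limits. The
names ``limited``, ``short_list`` and ``short_text`` are made up for this
example, and the import fallback is only there so that the snippet also runs
on interpreters where the module is named ``reprlib``.

```python
try:
    import reprlib            # name of this module on Python 3
except ImportError:
    import repr as reprlib    # name of this module on Python 2

# A private instance, so the module-level aRepr is left untouched.
limited = reprlib.Repr()
limited.maxlist = 3       # represent at most three list entries
limited.maxstring = 10    # cap string representations at ten characters

short_list = limited.repr(list(range(10)))
short_text = limited.repr('abcdefghijklmnopqrstuvwxyz')
```

Entries beyond the limit are replaced by ``...`` in the list representation,
and the string representation is shortened by dropping characters from the
middle.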
Subclassing Repr Objects
------------------------
The use of dynamic dispatching by Repr.repr1 allows subclasses of
Repr to add support for additional built-in object types or to modify
the handling of types already supported. This example shows how special support
for file objects could be added:: >
    import repr as reprlib
    import sys

    class MyRepr(reprlib.Repr):
        def repr_file(self, obj, level):
            if obj.name in ['<stdin>', '<stdout>', '<stderr>']:
                return obj.name
            else:
                return repr(obj)

    aRepr = MyRepr()
    print aRepr.repr(sys.stdin)          # prints '<stdin>'
==============================================================================
*py2stdlib-resource*
resource~
:platform: Unix
:synopsis: An interface to provide resource usage information on the current process.
This module provides basic mechanisms for measuring and controlling system
resources utilized by a program.
Symbolic constants are used to specify particular system resources and to
request usage information about either the current process or its children.
A single exception is defined for errors:
error~
The functions described below may raise this error if the underlying system call
fails unexpectedly.
Resource Limits
---------------
Resource usage can be limited using the setrlimit function described
below. Each resource is controlled by a pair of limits: a soft limit and a hard
limit. The soft limit is the current limit, and may be lowered or raised by a
process over time. The soft limit can never exceed the hard limit. The hard
limit can be lowered to any value greater than the soft limit, but not raised.
(Only processes with the effective UID of the super-user can raise a hard
limit.)
The specific resources that can be limited are system dependent. They are
described in the getrlimit(2) man page. The resources listed below
are supported when the underlying operating system supports them; resources
which cannot be checked or controlled by the operating system are not defined in
this module for those platforms.
getrlimit(resource)~
Returns a tuple ``(soft, hard)`` with the current soft and hard limits of
{resource}. Raises ValueError if an invalid resource is specified, or
error if the underlying system call fails unexpectedly.
setrlimit(resource, limits)~
Sets new limits of consumption of {resource}. The {limits} argument must be a
tuple ``(soft, hard)`` of two integers describing the new limits. A value of
``-1`` can be used to specify the maximum possible upper limit.
Raises ValueError if an invalid resource is specified, if the new soft
limit exceeds the hard limit, or if a process tries to raise its hard limit
(unless the process has an effective UID of super-user). Can also raise
error if the underlying system call fails.
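A minimal sketch of the soft/hard limit pair in use, assuming a Unix platform
that defines RLIMIT_CORE. It lowers only the soft limit on core file size to
zero and passes the hard limit back unchanged, which an unprivileged process
is allowed to do. The variable names are illustrative.

```python
import resource

# Read the current pair of limits for core file size.
soft, hard = resource.getrlimit(resource.RLIMIT_CORE)

# Lower the soft limit to zero; the hard limit is left as it was.
resource.setrlimit(resource.RLIMIT_CORE, (0, hard))

new_soft, new_hard = resource.getrlimit(resource.RLIMIT_CORE)
```

Because the hard limit is untouched, the process can later raise the soft
limit again, up to ``hard``.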
These symbols define resources whose consumption can be controlled using the
setrlimit and getrlimit functions described above. The values of
these symbols are exactly the constants used by C programs.
The Unix man page for getrlimit(2) lists the available resources.
Note that not all systems use the same symbol or same value to denote the same
resource. This module does not attempt to mask platform differences --- symbols
not defined for a platform will not be available from this module on that
platform.
RLIMIT_CORE~
The maximum size (in bytes) of a core file that the current process can create.
This may result in the creation of a partial core file if a larger core would be
required to contain the entire process image.
RLIMIT_CPU~
The maximum amount of processor time (in seconds) that a process can use. If
this limit is exceeded, a SIGXCPU signal is sent to the process. (See
the signal (|py2stdlib-signal|) module documentation for information about how to catch this
signal and do something useful, e.g. flush open files to disk.)
RLIMIT_FSIZE~
The maximum size of a file which the process may create.
RLIMIT_DATA~
The maximum size (in bytes) of the process's heap.
RLIMIT_STACK~
The maximum size (in bytes) of the call stack for the current process. This only
affects the stack of the main thread in a multi-threaded process.
RLIMIT_RSS~
The maximum resident set size that should be made available to the process.
RLIMIT_NPROC~
The maximum number of processes the current process may create.
RLIMIT_NOFILE~
The maximum number of open file descriptors for the current process.
RLIMIT_OFILE~
The BSD name for RLIMIT_NOFILE.
RLIMIT_MEMLOCK~
The maximum address space which may be locked in memory.
RLIMIT_VMEM~
The largest area of mapped memory which the process may occupy.
RLIMIT_AS~
The maximum area (in bytes) of address space which may be taken by the process.
Resource Usage
--------------
These functions are used to retrieve resource usage information:
getrusage(who)~
This function returns an object that describes the resources consumed by either
the current process or its children, as specified by the {who} parameter. The
{who} parameter should be specified using one of the RUSAGE_*
constants described below.
The fields of the return value each describe how a particular system resource
has been used, e.g. amount of time spent running in user mode or number of times
the process was swapped out of main memory. Some values are dependent on the
clock tick interval, e.g. the amount of memory the process is using.
For backward compatibility, the return value is also accessible as a tuple of 16
elements.
The fields ru_utime and ru_stime of the return value are
floating point values representing the amount of time spent executing in user
mode and the amount of time spent executing in system mode, respectively. The
remaining values are integers. Consult the getrusage(2) man page for
detailed information about these values. A brief summary is presented here:
+--------+---------------------+-------------------------------+
| Index | Field | Resource |
+========+=====================+===============================+
| ``0`` | ru_utime | time in user mode (float) |
+--------+---------------------+-------------------------------+
| ``1`` | ru_stime | time in system mode (float) |
+--------+---------------------+-------------------------------+
| ``2`` | ru_maxrss | maximum resident set size |
+--------+---------------------+-------------------------------+
| ``3`` | ru_ixrss | shared memory size |
+--------+---------------------+-------------------------------+
| ``4`` | ru_idrss | unshared memory size |
+--------+---------------------+-------------------------------+
| ``5`` | ru_isrss | unshared stack size |
+--------+---------------------+-------------------------------+
| ``6`` | ru_minflt | page faults not requiring I/O |
+--------+---------------------+-------------------------------+
| ``7`` | ru_majflt | page faults requiring I/O |
+--------+---------------------+-------------------------------+
| ``8`` | ru_nswap | number of swap outs |
+--------+---------------------+-------------------------------+
| ``9`` | ru_inblock | block input operations |
+--------+---------------------+-------------------------------+
| ``10`` | ru_oublock | block output operations |
+--------+---------------------+-------------------------------+
| ``11`` | ru_msgsnd | messages sent |
+--------+---------------------+-------------------------------+
| ``12`` | ru_msgrcv | messages received |
+--------+---------------------+-------------------------------+
| ``13`` | ru_nsignals | signals received |
+--------+---------------------+-------------------------------+
| ``14`` | ru_nvcsw | voluntary context switches |
+--------+---------------------+-------------------------------+
| ``15`` | ru_nivcsw | involuntary context switches |
+--------+---------------------+-------------------------------+
This function will raise a ValueError if an invalid {who} parameter is
specified. It may also raise an error exception in unusual circumstances.
.. versionchanged:: 2.3
Added access to values as attributes of the returned object.
getpagesize()~
Returns the number of bytes in a system page. (This need not be the same as the
hardware page size.) This function is useful for determining the number of bytes
of memory a process is using. The third element of the tuple returned by
getrusage describes memory usage in pages; multiplying by page size
produces number of bytes.
The following RUSAGE_* symbols are passed to the getrusage
function to specify which processes information should be provided for.
RUSAGE_SELF~
RUSAGE_SELF should be used to request information pertaining only to
the process itself.
RUSAGE_CHILDREN~
Pass to getrusage to request resource information for child processes of
the calling process.
RUSAGE_BOTH~
Pass to getrusage to request resources consumed by both the current
process and child processes. May not be available on all systems.
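The following sketch shows one way to read the usage information for the
current process, using both attribute access and the backward-compatible tuple
indices. It assumes a Unix platform; the variable names are illustrative.

```python
import resource

usage = resource.getrusage(resource.RUSAGE_SELF)

# Attribute access: total processor time, user mode plus system mode.
cpu_seconds = usage.ru_utime + usage.ru_stime

# Tuple-style access: indices 0 and 1 hold the same two values.
first_two = (usage[0], usage[1])

# Number of bytes in a system page.
page_size = resource.getpagesize()
```

The units of the memory-related fields vary between platforms, so consult the
getrusage(2) man page before converting them to bytes.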
==============================================================================
*py2stdlib-rexec*
rexec~
:synopsis: Basic restricted execution framework.
:deprecated:
2.6~
The rexec (|py2stdlib-rexec|) module has been removed in Python 3.0.
.. versionchanged:: 2.3
Disabled module.
.. warning::
The documentation has been left in place to help in reading old code that uses
the module.
This module contains the RExec class, which supports r_eval,
r_execfile, r_exec, and r_import methods, which are
restricted versions of the standard Python functions eval,
execfile and the exec and import statements. Code
executed in this restricted environment will only have access to modules and
functions that are deemed safe; you can subclass RExec to add or remove
capabilities as desired.
.. warning::
While the rexec (|py2stdlib-rexec|) module is designed to perform as described below, it does
have a few known vulnerabilities which could be exploited by carefully written
code. Thus it should not be relied upon in situations requiring "production
ready" security. In such situations, execution via sub-processes or very
careful "cleansing" of both code and data to be processed may be necessary.
Alternatively, help in patching known rexec (|py2stdlib-rexec|) vulnerabilities would be
welcomed.
.. note::
The RExec class can prevent code from performing unsafe operations like
reading or writing disk files, or using TCP/IP sockets. However, it does not
protect against code using extremely large amounts of memory or processor time.
RExec([hooks[, verbose]])~
Returns an instance of the RExec class.
{hooks} is an instance of the RHooks class or a subclass of it. If it
is omitted or ``None``, the default RHooks class is instantiated.
Whenever the rexec (|py2stdlib-rexec|) module searches for a module (even a built-in one) or
reads a module's code, it doesn't actually go out to the file system itself.
Rather, it calls methods of an RHooks instance that was passed to or
created by its constructor. (Actually, the RExec object doesn't make
these calls --- they are made by a module loader object that's part of the
RExec object. This allows another level of flexibility, which can be
useful when changing the mechanics of import within the restricted
environment.)
By providing an alternate RHooks object, we can control the file system
accesses made to import a module, without changing the actual algorithm that
controls the order in which those accesses are made. For instance, we could
substitute an RHooks object that passes all filesystem requests to a
file server elsewhere, via some RPC mechanism such as ILU. Grail's applet
loader uses this to support importing applets from a URL for a directory.
If {verbose} is true, additional debugging output may be sent to standard
output.
It is important to be aware that code running in a restricted environment can
still call the sys.exit function. To disallow restricted code from
exiting the interpreter, always protect calls that cause restricted code to run
with a try/except statement that catches the
SystemExit exception. Removing the sys.exit function from the
restricted environment is not sufficient --- the restricted code could still use
``raise SystemExit``. Removing SystemExit is not a reasonable option;
some library code makes use of this and would break were it not available.
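The try/except protection described above might be sketched as follows. The
helper ``run_guarded`` and the function ``untrusted`` are hypothetical names
used only for illustration. With a working RExec instance the guarded callable
would be one of its ``r_*`` or ``s_*`` methods, but a plain function stands in
for it here so that the example is self-contained.

```python
import sys

def run_guarded(func, *args):
    """Call func(*args) and report SystemExit instead of propagating it.

    Returns (True, result) on normal completion, or (False, exit_code)
    if the called code tried to exit the interpreter.
    """
    try:
        return True, func(*args)
    except SystemExit as exc:
        # exc.code is the argument given to sys.exit() or SystemExit
        return False, exc.code

def untrusted():
    # Stand-in for restricted code that tries to exit the interpreter.
    sys.exit(3)

finished, value = run_guarded(untrusted)
```

Catching the exception at the call site covers both ``sys.exit()`` and a bare
``raise SystemExit``.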
.. seealso::
`Grail Home Page <http://grail.sourceforge.net/>`_
Grail is a Web browser written entirely in Python. It uses the rexec (|py2stdlib-rexec|)
module as a foundation for supporting Python applets, and can be used as an
example usage of this module.
RExec Objects
-------------
RExec instances support the following methods:
RExec.r_eval(code)~
{code} must either be a string containing a Python expression, or a compiled
code object, which will be evaluated in the restricted environment's
__main__ (|py2stdlib-__main__|) module. The value of the expression or code object will be
returned.
RExec.r_exec(code)~
{code} must either be a string containing one or more lines of Python code, or a
compiled code object, which will be executed in the restricted environment's
__main__ (|py2stdlib-__main__|) module.
RExec.r_execfile(filename)~
Execute the Python code contained in the file {filename} in the restricted
environment's __main__ (|py2stdlib-__main__|) module.
Methods whose names begin with ``s_`` are similar to the functions beginning
with ``r_``, but the code will be granted access to restricted versions of the
standard I/O streams ``sys.stdin``, ``sys.stderr``, and ``sys.stdout``.
RExec.s_eval(code)~
{code} must be a string containing a Python expression, which will be evaluated
in the restricted environment.
RExec.s_exec(code)~
{code} must be a string containing one or more lines of Python code, which will
be executed in the restricted environment.
RExec.s_execfile(filename)~
Execute the Python code contained in the file {filename} in the restricted
environment.
RExec objects must also support various methods which will be
implicitly called by code executing in the restricted environment. Overriding
these methods in a subclass is used to change the policies enforced by a
restricted environment.
RExec.r_import(modulename[, globals[, locals[, fromlist]]])~
Import the module {modulename}, raising an ImportError exception if the
module is considered unsafe.
RExec.r_open(filename[, mode[, bufsize]])~
Method called when open is called in the restricted environment. The
arguments are identical to those of open, and a file object (or a class
instance compatible with file objects) should be returned. RExec's
default behaviour is to allow opening any file for reading, but to forbid any
attempt to write a file. See the example below for an implementation of a less
restrictive r_open.
RExec.r_reload(module)~
Reload the module object {module}, re-parsing and re-initializing it.
RExec.r_unload(module)~
Unload the module object {module} (remove it from the restricted environment's
``sys.modules`` dictionary).
And their equivalents with access to restricted standard I/O streams:
RExec.s_import(modulename[, globals[, locals[, fromlist]]])~
Import the module {modulename}, raising an ImportError exception if the
module is considered unsafe.
RExec.s_reload(module)~
Reload the module object {module}, re-parsing and re-initializing it.
RExec.s_unload(module)~
Unload the module object {module}.
Defining restricted environments
--------------------------------
The RExec class has the following class attributes, which are used by
the __init__ method. Changing them on an existing instance won't have
any effect; instead, create a subclass of RExec and assign them new
values in the class definition. Instances of the new class will then use those
new values. All these attributes are tuples of strings.
RExec.nok_builtin_names~
Contains the names of built-in functions which will {not} be available to
programs running in the restricted environment. The value for RExec is
``('open', 'reload', '__import__')``. (This gives the exceptions, because by far
the majority of built-in functions are harmless. A subclass that wants to
override this variable should probably start with the value from the base class
and concatenate additional forbidden functions --- when new dangerous built-in
functions are added to Python, they will also be added to this module.)
RExec.ok_builtin_modules~
Contains the names of built-in modules which can be safely imported. The value
for RExec is ``('audioop', 'array', 'binascii', 'cmath', 'errno',
'imageop', 'marshal', 'math', 'md5', 'operator', 'parser', 'regex', 'select',
'sha', '_sre', 'strop', 'struct', 'time')``. A similar remark about overriding
this variable applies --- use the value from the base class as a starting point.
RExec.ok_path~
Contains the directories which will be searched when an import is
performed in the restricted environment. The value for RExec is the
same as ``sys.path`` (at the time the module is loaded) for unrestricted code.
RExec.ok_posix_names~
Contains the names of the functions in the os (|py2stdlib-os|) module which will be
available to programs running in the restricted environment. The value for
RExec is ``('error', 'fstat', 'listdir', 'lstat', 'readlink', 'stat',
'times', 'uname', 'getpid', 'getppid', 'getcwd', 'getuid', 'getgid', 'geteuid',
'getegid')``.
RExec.ok_sys_names~
Contains the names of the functions and variables in the sys (|py2stdlib-sys|) module which
will be available to programs running in the restricted environment. The value
for RExec is ``('ps1', 'ps2', 'copyright', 'version', 'platform',
'exit', 'maxint')``.
RExec.ok_file_types~
Contains the file types from which modules are allowed to be loaded. Each file
type is an integer constant defined in the imp (|py2stdlib-imp|) module. The meaningful
values are PY_SOURCE, PY_COMPILED, and C_EXTENSION.
The value for RExec is ``(C_EXTENSION, PY_SOURCE)``. Adding
PY_COMPILED in subclasses is not recommended; an attacker could exit
the restricted execution mode by putting a forged byte-compiled file
(.pyc) anywhere in your file system, for example by writing it to
/tmp or uploading it to the /incoming directory of your public
FTP server.
An example
----------
Let us say that we want a slightly more relaxed policy than the standard
RExec class. For example, if we're willing to allow files in
/tmp to be written, we can subclass the RExec class:: >
    import rexec
    import string

    class TmpWriterRExec(rexec.RExec):
        def r_open(self, file, mode='r', buf=-1):
            if mode in ('r', 'rb'):
                pass
            elif mode in ('w', 'wb', 'a', 'ab'):
                # check filename : must begin with /tmp/
                if file[:5] != '/tmp/':
                    raise IOError("can't write outside /tmp")
                elif (string.find(file, '/../') >= 0 or
                      file[:3] == '../' or file[-3:] == '/..'):
                    raise IOError("'..' in filename forbidden")
            else:
                raise IOError("Illegal open() mode")
            return open(file, mode, buf)
<
Notice that the above code will occasionally forbid a perfectly valid filename;
for example, code in the restricted environment won't be able to open a file
called /tmp/foo/../bar. To fix this, the r_open method would
have to simplify the filename to /tmp/bar, which would require splitting
apart the filename and performing various operations on it. In cases where
security is at stake, it may be preferable to write simple code which is
sometimes overly restrictive, instead of more general code that is also more
complex and may harbor a subtle security hole.
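For comparison, the filename simplification mentioned above could be sketched
with ``posixpath.normpath``. The helper name ``normalized_tmp_path`` is
hypothetical, and this is exactly the kind of more general code the paragraph
warns about: normalization is purely textual and does not resolve symbolic
links, so it should not be taken as a complete security check.

```python
import posixpath

def normalized_tmp_path(filename):
    """Return the normalized filename if it lies under /tmp, else raise IOError."""
    # Collapse '.', '..' and repeated slashes without touching the file system.
    path = posixpath.normpath(filename)
    if not path.startswith('/tmp/'):
        raise IOError("can't write outside /tmp")
    return path

simplified = normalized_tmp_path('/tmp/foo/../bar')
```

A name such as ``/tmp/../etc/passwd`` normalizes to a path outside /tmp and is
rejected.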
==============================================================================
*py2stdlib-rfc822*
rfc822~
:synopsis: Parse RFC 2822 style mail messages.
:deprecated:
2.3~
The email (|py2stdlib-email|) package should be used in preference to the rfc822 (|py2stdlib-rfc822|)
module. This module is present only to maintain backward compatibility, and
has been removed in 3.0.
This module defines a class, Message, which represents an "email
message" as defined by the Internet standard RFC 2822. [#]_ Such messages
consist of a collection of message headers, and a message body. This module
also defines a helper class AddressList for parsing RFC 2822
addresses. Please refer to the RFC for information on the specific syntax of
RFC 2822 messages.
.. index:: module: mailbox
The mailbox (|py2stdlib-mailbox|) module provides classes to read mailboxes produced by
various end-user mail programs.
Message(file[, seekable])~
A Message instance is instantiated with an input object as parameter.
Message relies only on the input object having a readline method; in
particular, ordinary file objects qualify. Instantiation reads headers from the
input object up to a delimiter line (normally a blank line) and stores them in
the instance. The message body, following the headers, is not consumed.
This class can work with any input object that supports a readline
method. If the input object has seek and tell capability, the
rewindbody method will work; also, illegal lines will be pushed back
onto the input stream. If the input object lacks seek but has an unread
method that can push back a line of input, Message will use that to
push back illegal lines. Thus this class can be used to parse messages coming
from a buffered stream.
The optional {seekable} argument is provided as a workaround for certain stdio
libraries in which tell discards buffered data before discovering that
the lseek system call doesn't work. For maximum portability, you
should set the seekable argument to zero to prevent that initial tell
when passing in an unseekable object such as a file object created from a socket
object.
Input lines as read from the file may either be terminated by CR-LF or by a
single linefeed; a terminating CR-LF is replaced by a single linefeed before the
line is stored.
All header matching is done independent of upper or lower case; e.g.
``m['From']``, ``m['from']`` and ``m['FROM']`` all yield the same result.
AddressList(field)~
You may instantiate the AddressList helper class using a single string
parameter, a comma-separated list of RFC 2822 addresses to be parsed. (The
parameter ``None`` yields an empty list.)
quote(str)~
Return a new string with backslashes in {str} replaced by two backslashes and
double quotes replaced by backslash-double quote.
unquote(str)~
Return a new string which is an {unquoted} version of {str}. If {str} ends and
begins with double quotes, they are stripped off. Likewise if {str} ends and
begins with angle brackets, they are stripped off.
parseaddr(address)~
Parse {address}, which should be the value of some address-containing field such
as To or Cc, into its constituent "realname" and
"email address" parts. Returns a tuple of that information, unless the parse
fails, in which case a 2-tuple ``(None, None)`` is returned.
dump_address_pair(pair)~
The inverse of parseaddr, this takes a 2-tuple of the form ``(realname,
email_address)`` and returns the string value suitable for a To or
Cc header. If the first element of {pair} is false, then the
second element is returned unmodified.
parsedate(date)~
Attempts to parse a date according to the rules in RFC 2822. However, some
mailers don't follow that format as specified, so parsedate tries to
guess correctly in such cases. {date} is a string containing an RFC 2822
date, such as ``'Mon, 20 Nov 1995 19:12:08 -0500'``. If it succeeds in parsing
the date, parsedate returns a 9-tuple that can be passed directly to
time.mktime; otherwise ``None`` will be returned. Note that indexes 6,
7, and 8 of the result tuple are not usable.
parsedate_tz(date)~
Performs the same function as parsedate, but returns either ``None`` or
a 10-tuple; the first 9 elements make up a tuple that can be passed directly to
time.mktime, and the tenth is the offset of the date's timezone from UTC
(which is the official term for Greenwich Mean Time). (Note that the sign of
the timezone offset is the opposite of the sign of the ``time.timezone``
variable for the same timezone; the latter variable follows the POSIX standard
while this module follows RFC 2822.) If the input string has no timezone,
the last element of the tuple returned is ``None``. Note that indexes 6, 7, and
8 of the result tuple are not usable.
mktime_tz(tuple)~
Turn a 10-tuple as returned by parsedate_tz into a UTC timestamp. If
the timezone item in the tuple is ``None``, assume local time. Minor
deficiency: this first interprets the first 8 elements as a local time and then
compensates for the timezone difference; this may yield a slight error around
daylight savings time switch dates. Not enough to worry about for common use.
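The address and date helpers above can be tried out roughly as follows. As an
assumption made only so the snippet also runs where the rfc822 module is no
longer present, it falls back to the same-named functions in ``email.utils``,
which are expected to behave alike for these inputs. The variable names are
illustrative.

```python
try:
    from rfc822 import parseaddr, parsedate_tz, mktime_tz
except ImportError:
    # Assumption: the email package provides equivalent helpers.
    from email.utils import parseaddr, parsedate_tz, mktime_tz

# Split an address field into its "realname" and "email address" parts.
realname, address = parseaddr('Jack Jansen <jack@cwi.nl>')

# Parse a date into a 10-tuple; the tenth element is the offset from UTC
# in seconds (negative for zones west of Greenwich).
parsed = parsedate_tz('Mon, 20 Nov 1995 19:12:08 -0500')
date_fields = parsed[:6]
utc_offset = parsed[9]

# Convert the 10-tuple into a UTC timestamp.
timestamp = mktime_tz(parsed)
```

Only the first six elements and the offset are inspected here, since indexes
6, 7, and 8 of the result tuple are not usable.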
.. seealso::
Module email (|py2stdlib-email|)
Comprehensive email handling package; supersedes the rfc822 (|py2stdlib-rfc822|) module.
Module mailbox (|py2stdlib-mailbox|)
Classes to read various mailbox formats produced by end-user mail programs.
Module mimetools (|py2stdlib-mimetools|)
Subclass of rfc822.Message that handles MIME encoded messages.
Message Objects
---------------
A Message instance has the following methods:
Message.rewindbody()~
Seek to the start of the message body. This only works if the file object is
seekable.
Message.isheader(line)~
Returns a line's canonicalized fieldname (the dictionary key that will be used
to index it) if the line is a legal RFC 2822 header; otherwise returns
``None`` (implying that parsing should stop here and the line be pushed back on
the input stream). It is sometimes useful to override this method in a
subclass.
Message.islast(line)~
Return true if the given line is a delimiter on which Message should stop. The
delimiter line is consumed, and the file object's read location positioned
immediately after it. By default this method just checks that the line is
blank, but you can override it in a subclass.
Message.iscomment(line)~
Return ``True`` if the given line should be ignored entirely, just skipped. By
default this is a stub that always returns ``False``, but you can override it in
a subclass.
Message.getallmatchingheaders(name)~
Return a list of lines consisting of all headers matching {name}, if any. Each
physical line, whether it is a continuation line or not, is a separate list
item. Return the empty list if no header matches {name}.
Message.getfirstmatchingheader(name)~
Return a list of lines comprising the first header matching {name}, and its
continuation line(s), if any. Return ``None`` if there is no header matching
{name}.
Message.getrawheader(name)~
Return a single string consisting of the text after the colon in the first
header matching {name}. This includes leading whitespace, the trailing
linefeed, and internal linefeeds and whitespace if any continuation
line(s) were present. Return ``None`` if there is no header matching {name}.
Message.getheader(name[, default])~
Return a single string consisting of the last header matching {name},
but strip leading and trailing whitespace.
Internal whitespace is not stripped. The optional {default} argument can be
used to specify a different default to be returned when there is no header
matching {name}; it defaults to ``None``.
This is the preferred way to get parsed headers.
Message.get(name[, default])~
An alias for getheader, to make the interface more compatible with
regular dictionaries.
Message.getaddr(name)~
Return a pair ``(full name, email address)`` parsed from the string returned by
``getheader(name)``. If no header matching {name} exists, return ``(None,
None)``; otherwise both the full name and the address are (possibly empty)
strings.
Example: If {m}'s first From header contains the string
``'jack@cwi.nl (Jack Jansen)'``, then ``m.getaddr('From')`` will yield the pair
``('Jack Jansen', 'jack@cwi.nl')``. If the header contained ``'Jack Jansen
<jack@cwi.nl>'`` instead, it would yield the exact same result.
Message.getaddrlist(name)~
This is similar to ``getaddr(name)``, but parses a header containing a list of
email addresses (e.g. a To header) and returns a list of ``(full
name, email address)`` pairs (even if there was only one address in the header).
If there is no header matching {name}, return an empty list.
If multiple headers exist that match the named header (e.g. if there are several
Cc headers), all are parsed for addresses. Any continuation lines
the named headers contain are also parsed.
Message.getdate(name)~
Retrieve a header using getheader and parse it into a 9-tuple compatible
with time.mktime; note that fields 6, 7, and 8 are not usable. If
there is no header matching {name}, or it is unparsable, return ``None``.
Date parsing appears to be a black art, and not all mailers adhere to the
standard. While it has been tested and found correct on a large collection of
email from many sources, it is still possible that this function may
occasionally yield an incorrect result.
Message.getdate_tz(name)~
Retrieve a header using getheader and parse it into a 10-tuple; the
first 9 elements will make a tuple compatible with time.mktime, and the
10th is a number giving the offset of the date's timezone from UTC. Note that
fields 6, 7, and 8 are not usable. Similarly to getdate, if there is
no header matching {name}, or it is unparsable, return ``None``.
Message instances also support a limited mapping interface. In
particular: ``m[name]`` is like ``m.getheader(name)`` but raises KeyError
if there is no matching header; and ``len(m)``, ``m.get(name[, default])``,
``name in m``, ``m.keys()``, ``m.values()``, ``m.items()``, and
``m.setdefault(name[, default])`` act as expected, with the one difference
that setdefault uses an empty string as the default value.
Message instances also support the mapping writable interface ``m[name]
= value`` and ``del m[name]``. Message objects do not support the
clear, copy, popitem, or update methods of the
mapping interface. (Support for get and setdefault was only
added in Python 2.2.)
Finally, Message instances have some public instance variables:
Message.headers~
A list containing the entire set of header lines, in the order in which they
were read (except that setitem calls may disturb this order). Each line contains
a trailing newline. The blank line terminating the headers is not contained in
the list.
Message.fp~
The file or file-like object passed at instantiation time. This can be used to
read the message content.
Message.unixfrom~
The Unix ``From`` line, if the message had one, or an empty string. This is
needed to regenerate the message in some contexts, such as an ``mbox``-style
mailbox file.
AddressList Objects
-------------------
An AddressList instance has the following methods:
AddressList.__len__()~
Return the number of addresses in the address list.
AddressList.__str__()~
Return a canonicalized string representation of the address list. Addresses are
rendered in "name" <host@domain> form, comma-separated.
AddressList.__add__(alist)~
Return a new AddressList instance that contains all addresses in both
AddressList operands, with duplicates removed (set union).
AddressList.__iadd__(alist)~
In-place version of __add__; turns this AddressList instance
into the union of itself and the right-hand instance, {alist}.
AddressList.__sub__(alist)~
Return a new AddressList instance that contains every address in the
left-hand AddressList operand that is not present in the right-hand
address operand (set difference).
AddressList.__isub__(alist)~
In-place version of __sub__, removing addresses in this list which are
also in {alist}.
Finally, AddressList instances have one public instance variable:
AddressList.addresslist~
A list of tuple string pairs, one per address. In each member, the first is the
canonicalized name part, the second is the actual route-address (``'@'``-separated
username-host.domain pair).
.. rubric:: Footnotes
.. [#] This module originally conformed to RFC 822, hence the name. Since then,
RFC 2822 has been released as an update to RFC 822. This module should be
considered RFC 2822-conformant, especially in cases where the syntax or
semantics have changed since RFC 822.
==============================================================================
*py2stdlib-rlcompleter*
rlcompleter~
:synopsis: Python identifier completion, suitable for the GNU readline library.
The rlcompleter (|py2stdlib-rlcompleter|) module defines a completion function suitable for the
readline (|py2stdlib-readline|) module by completing valid Python identifiers and keywords.
When this module is imported on a Unix platform with the readline (|py2stdlib-readline|) module
available, an instance of the Completer class is automatically created
and its complete method is set as the readline (|py2stdlib-readline|) completer.
Example:: >
>>> import rlcompleter
>>> import readline
>>> readline.parse_and_bind("tab: complete")
>>> readline. <TAB PRESSED>
readline.__doc__ readline.get_line_buffer( readline.read_init_file(
readline.__file__ readline.insert_text( readline.set_completer(
readline.__name__ readline.parse_and_bind(
>>> readline.
<
The rlcompleter (|py2stdlib-rlcompleter|) module is designed for use with Python's interactive
mode. A user can add the following lines to his or her initialization file
(identified by the PYTHONSTARTUP environment variable) to get
automatic Tab completion:: >
    try:
        import readline
    except ImportError:
        print "Module readline not available."
    else:
        import rlcompleter
        readline.parse_and_bind("tab: complete")
<
On platforms without readline (|py2stdlib-readline|), the Completer class defined by
this module can still be used for custom purposes.
Completer Objects
-----------------
Completer objects have the following method:
Completer.complete(text, state)~
Return the {state}th completion for {text}.
If called for {text} that doesn't include a period character (``'.'``), it will
complete from names currently defined in __main__ (|py2stdlib-__main__|), __builtin__ (|py2stdlib-__builtin__|) and
keywords (as defined by the keyword (|py2stdlib-keyword|) module).
If called for a dotted name, it will try to evaluate anything without obvious
side-effects (functions will not be evaluated, but it can generate calls to
__getattr__) up to the last part, and find matches for the rest via the
dir function. Any exception raised during the evaluation of the
expression is caught, silenced and None is returned.
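As a rough illustration of the method above, the following sketch drives a
Completer by hand, without readline. The namespace contents (``spam_count``,
``spam_label``) and the helper ``all_completions`` are invented for this
example. It assumes that complete returns None once {state} runs past the last
match, which is how the readline completer protocol signals the end.

```python
import rlcompleter


def all_completions(completer, text):
    """Collect every completion by calling complete() with state 0, 1, 2, ..."""
    matches = []
    state = 0
    while True:
        match = completer.complete(text, state)
        if match is None:  # no more candidates for this text
            break
        matches.append(match)
        state += 1
    return matches


# Hypothetical namespace; passing one makes the Completer look names up
# here instead of in __main__.
namespace = {"spam_count": 3, "spam_label": "eggs"}
completer = rlcompleter.Completer(namespace)

# Sorted, because the order in which candidates are produced is not
# something this sketch relies on.
found = sorted(all_completions(completer, "spam_"))
```

Only non-callable values are used in the namespace, since the exact text
returned for callables differs between Python versions.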
==============================================================================
*py2stdlib-robotparser*
robotparser~
:synopsis: Loads a robots.txt file and answers questions about
fetchability of other URLs.
.. index::
single: WWW
single: World Wide Web
single: URL
single: robots.txt
.. note::
The robotparser (|py2stdlib-robotparser|) module has been renamed urllib.robotparser in
Python 3.0.
The 2to3 tool will automatically adapt imports when converting
your sources to 3.0.
This module provides a single class, RobotFileParser, which answers
questions about whether or not a particular user agent can fetch a URL on the
Web site that published the robots.txt file. For more details on the
structure of robots.txt files, see http://www.robotstxt.org/orig.html.
RobotFileParser()~
This class provides a set of methods to read, parse and answer questions
about a single robots.txt file.
set_url(url)~
Sets the URL referring to a robots.txt file.
read()~
Reads the robots.txt URL and feeds it to the parser.
parse(lines)~
Parses the lines argument.
can_fetch(useragent, url)~
Returns ``True`` if the {useragent} is allowed to fetch the {url}
according to the rules contained in the parsed robots.txt
file.
mtime()~
Returns the time the ``robots.txt`` file was last fetched. This is
useful for long-running web spiders that need to check for new
``robots.txt`` files periodically.
modified()~
Sets the time the ``robots.txt`` file was last fetched to the current
time.
The following example demonstrates basic use of the RobotFileParser class. :: >
>>> import robotparser
>>> rp = robotparser.RobotFileParser()
>>> rp.set_url("http://www.musi-cal.com/robots.txt")
>>> rp.read()
>>> rp.can_fetch("*", "http://www.musi-cal.com/cgi-bin/search?city=San+Francisco")
False
>>> rp.can_fetch("*", "http://www.musi-cal.com/")
True
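The parse method also makes it possible to check rules without any network
access. The sketch below feeds a hand-written rule set to the parser. The rules
and the example.com URLs are invented for illustration, and the fallback import
covers the Python 3 rename mentioned in the note above.

```python
try:
    import robotparser                  # Python 2 module name
except ImportError:
    from urllib import robotparser      # name after the Python 3 rename

# Hypothetical robots.txt content, supplied as a list of lines.
rules = [
    "User-agent: *",
    "Disallow: /private/",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# One URL under the disallowed prefix, one outside it.
blocked = rp.can_fetch("*", "http://example.com/private/report.html")
allowed = rp.can_fetch("*", "http://example.com/index.html")
```

Here ``blocked`` is false and ``allowed`` is true, because only paths starting
with ``/private/`` are disallowed for every user agent.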
==============================================================================
*py2stdlib-runpy*
runpy~
:synopsis: Locate and run Python modules without importing them first.
.. versionadded:: 2.5
The runpy (|py2stdlib-runpy|) module is used to locate and run Python modules without
importing them first. Its main use is to implement the -m command
line switch that allows scripts to be located using the Python module
namespace rather than the filesystem.
The runpy (|py2stdlib-runpy|) module provides two functions:
run_module(mod_name, init_globals=None, run_name=None, alter_sys=False)~
Execute the code of the specified module and return the resulting module
globals dictionary. The module's code is first located using the standard
import mechanism (refer to PEP 302 for details) and then executed in a
fresh module namespace.
If the supplied module name refers to a package rather than a normal
module, then that package is imported and the ``__main__`` submodule within
that package is then executed and the resulting module globals dictionary
returned.
The optional dictionary argument {init_globals} may be used to pre-populate
the module's globals dictionary before the code is executed. The supplied
dictionary will not be modified. If any of the special global variables
below are defined in the supplied dictionary, those definitions are
overridden by run_module.
The special global variables ``__name__``, ``__file__``, ``__loader__``
and ``__package__`` are set in the globals dictionary before the module
code is executed (Note that this is a minimal set of variables - other
variables may be set implicitly as an interpreter implementation detail).
``__name__`` is set to {run_name} if this optional argument is not
None, to ``mod_name + '.__main__'`` if the named module is a
package and to the {mod_name} argument otherwise.
``__file__`` is set to the name provided by the module loader. If the
loader does not make filename information available, this variable is set
to None.
``__loader__`` is set to the PEP 302 module loader used to retrieve the
code for the module (This loader may be a wrapper around the standard
import mechanism).
``__package__`` is set to {mod_name} if the named module is a package and
to ``mod_name.rpartition('.')[0]`` otherwise.
If the argument {alter_sys} is supplied and evaluates to True,
then ``sys.argv[0]`` is updated with the value of ``__file__`` and
``sys.modules[__name__]`` is updated with a temporary module object for the
module being executed. Both ``sys.argv[0]`` and ``sys.modules[__name__]``
are restored to their original values before the function returns.
Note that this manipulation of sys (|py2stdlib-sys|) is not thread-safe. Other threads
may see the partially initialised module, as well as the altered list of
arguments. It is recommended that the sys (|py2stdlib-sys|) module be left alone when
invoking this function from threaded code.
.. versionchanged:: 2.7
Added ability to execute packages by looking for a ``__main__``
submodule
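A minimal sketch of run_module, assuming a throwaway module written to a
temporary directory. The names ``demo_mod``, ``greeting`` and ``demo_run`` are
hypothetical and exist only for this example. It shows {init_globals} being
visible to the module code, {run_name} overriding ``__name__``, and the
supplied dictionary being left unmodified.

```python
import os
import runpy
import shutil
import sys
import tempfile

tmpdir = tempfile.mkdtemp()
try:
    # Hypothetical module: records the name it ran under and a pre-populated global.
    with open(os.path.join(tmpdir, "demo_mod.py"), "w") as f:
        f.write("result = (__name__, greeting)\n")

    sys.path.insert(0, tmpdir)          # make the module importable by name
    try:
        init = {"greeting": "hello"}
        module_globals = runpy.run_module("demo_mod",
                                          init_globals=init,
                                          run_name="demo_run")
    finally:
        sys.path.remove(tmpdir)
finally:
    shutil.rmtree(tmpdir)

# module_globals["result"] is ("demo_run", "hello"); init is unchanged.
```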
run_path(file_path, init_globals=None, run_name=None)~
Execute the code at the named filesystem location and return the resulting
module globals dictionary. As with a script name supplied to the CPython
command line, the supplied path may refer to a Python source file, a
compiled bytecode file or a valid sys.path entry containing a ``__main__``
module (e.g. a zipfile containing a top-level ``__main__.py`` file).
For a simple script, the specified code is simply executed in a fresh
module namespace. For a valid sys.path entry (typically a zipfile or
directory), the entry is first added to the beginning of ``sys.path``. The
function then looks for and executes a __main__ (|py2stdlib-__main__|) module using the
updated path. Note that there is no special protection against invoking
an existing __main__ (|py2stdlib-__main__|) entry located elsewhere on ``sys.path`` if
there is no such module at the specified location.
The optional dictionary argument {init_globals} may be used to pre-populate
the module's globals dictionary before the code is executed. The supplied
dictionary will not be modified. If any of the special global variables
below are defined in the supplied dictionary, those definitions are
overridden by run_path.
The special global variables ``__name__``, ``__file__``, ``__loader__``
and ``__package__`` are set in the globals dictionary before the module
code is executed (Note that this is a minimal set of variables - other
variables may be set implicitly as an interpreter implementation detail).
``__name__`` is set to {run_name} if this optional argument is not
None and to ``'<run_path>'`` otherwise.
``__file__`` is set to the name provided by the module loader. If the
loader does not make filename information available, this variable is set
to None. For a simple script, this will be set to ``file_path``.
``__loader__`` is set to the PEP 302 module loader used to retrieve the
code for the module (This loader may be a wrapper around the standard
import mechanism). For a simple script, this will be set to None.
``__package__`` is set to ``__name__.rpartition('.')[0]``.
A number of alterations are also made to the sys (|py2stdlib-sys|) module. Firstly,
``sys.path`` may be altered as described above. ``sys.argv[0]`` is updated
with the value of ``file_path`` and ``sys.modules[__name__]`` is updated
with a temporary module object for the module being executed. All
modifications to items in sys (|py2stdlib-sys|) are reverted before the function
returns.
Note that, unlike run_module, the alterations made to sys (|py2stdlib-sys|)
are not optional in this function as these adjustments are essential to
allowing the execution of sys.path entries. As the thread safety
limitations still apply, use of this function in threaded code should be
either serialised with the import lock or delegated to a separate process.
.. versionadded:: 2.7
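The simple-script case of run_path can be sketched as follows. The script
contents and the variable names ``seen_name`` and ``seen_argv0`` are made up
for this example. It checks two of the behaviours described above: the default
``__name__`` and the temporary update of ``sys.argv[0]``.

```python
import os
import runpy
import tempfile

# Write a throwaway script; its contents are invented for this example.
fd, path = tempfile.mkstemp(suffix=".py")
try:
    with os.fdopen(fd, "w") as script:
        script.write("import sys\n"
                     "seen_name = __name__\n"
                     "seen_argv0 = sys.argv[0]\n")
    # No run_name given, so the script sees the default module name.
    result = runpy.run_path(path)
finally:
    os.remove(path)

# result["seen_name"] is "<run_path>" and result["seen_argv0"] is the script path.
```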
.. seealso::
PEP 338 - Executing modules as scripts
PEP written and implemented by Nick Coghlan.
PEP 366 - Main module explicit relative imports
PEP written and implemented by Nick Coghlan.
using-on-general - CPython command line details
==============================================================================
*py2stdlib-sched*
sched~
:synopsis: General purpose event scheduler.
.. index:: single: event scheduling
The sched (|py2stdlib-sched|) module defines a class which implements a general purpose event
scheduler:
scheduler(timefunc, delayfunc)~
The scheduler class defines a generic interface to scheduling events.
It needs two functions to actually deal with the "outside world" --- {timefunc}
should be callable without arguments, and return a number (the "time", in any
units whatsoever). The {delayfunc} function should be callable with one
argument, compatible with the output of {timefunc}, and should delay that many
time units. {delayfunc} will also be called with the argument ``0`` after each
event is run to allow other threads an opportunity to run in multi-threaded
applications.
Example:: >
>>> import sched, time
>>> s = sched.scheduler(time.time, time.sleep)
>>> def print_time(): print "From print_time", time.time()
...
>>> def print_some_times():
...     print time.time()
...     s.enter(5, 1, print_time, ())
...     s.enter(10, 1, print_time, ())
...     s.run()
...     print time.time()
...
>>> print_some_times()
930343690.257
From print_time 930343695.274
From print_time 930343700.273
930343700.276
<
In multi-threaded environments, the scheduler class has limitations
with respect to thread-safety, inability to insert a new task before
the one currently pending in a running scheduler, and holding up the main
thread until the event queue is empty. The preferred approach
is to use the threading.Timer class instead.
Example:: >
>>> import time
>>> from threading import Timer
>>> def print_time():
...     print "From print_time", time.time()
...
>>> def print_some_times():
...     print time.time()
...     Timer(5, print_time, ()).start()
...     Timer(10, print_time, ()).start()
...     time.sleep(11)  # sleep while time-delay events execute
...     print time.time()
...
>>> print_some_times()
930343690.257
From print_time 930343695.274
From print_time 930343700.273
930343701.301
<
Scheduler Objects
-----------------
scheduler instances have the following methods and attributes:
scheduler.enterabs(time, priority, action, argument)~
Schedule a new event. The {time} argument should be a numeric type compatible
with the return value of the {timefunc} function passed to the constructor.
Events scheduled for the same {time} will be executed in the order of their
{priority}.
Executing the event means executing ``action(*argument)``. {argument} must be a
sequence holding the parameters for {action}.
Return value is an event which may be used for later cancellation of the event
(see cancel).
scheduler.enter(delay, priority, action, argument)~
Schedule an event for {delay} more time units. Other than the relative time, the
other arguments, the effect and the return value are the same as those for
enterabs.
scheduler.cancel(event)~
Remove the event from the queue. If {event} is not an event currently in the
queue, this method will raise a ValueError.
scheduler.empty()~
Return true if the event queue is empty.
scheduler.run()~
Run all scheduled events. This function will wait (using the delayfunc
function passed to the constructor) for the next event, then execute it and so
on until there are no more scheduled events.
Either {action} or {delayfunc} can raise an exception. In either case, the
scheduler will maintain a consistent state and propagate the exception. If an
exception is raised by {action}, the event will not be attempted in future calls
to run.
If a sequence of events takes longer to run than the time available before the
next event, the scheduler will simply fall behind. No events will be dropped;
the calling code is responsible for canceling events which are no longer
pertinent.
scheduler.queue~
Read-only attribute returning a list of upcoming events in the order they
will be run. Each event is shown as a named tuple with the
following fields: time, priority, action, argument.
.. versionadded:: 2.6
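Because {timefunc} and {delayfunc} can be any compatible pair of callables, the
methods above can be exercised deterministically with a simulated clock. In
this sketch the ``FakeClock`` class, the ``record`` action and the event labels
are all hypothetical. It illustrates priority ordering for equal times,
cancellation, and the queue attribute.

```python
import sched


class FakeClock(object):
    """Simulated time source: sleeping just advances the counter."""

    def __init__(self):
        self.now = 0

    def time(self):
        return self.now

    def sleep(self, amount):
        self.now += amount


clock = FakeClock()
s = sched.scheduler(clock.time, clock.sleep)
log = []


def record(label):
    # Note the (simulated) time at which the event actually ran.
    log.append((clock.time(), label))


s.enter(10, 1, record, ("late",))
s.enter(5, 2, record, ("low priority",))
s.enter(5, 1, record, ("high priority",))
doomed = s.enter(7, 1, record, ("cancelled",))
s.cancel(doomed)                          # removed before it can run

pending = [event.time for event in s.queue]   # upcoming events, in run order
s.run()
```

After run returns, ``log`` holds the two events for time 5 in priority order,
followed by the event for time 10, and the queue is empty.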
==============================================================================
*py2stdlib-scrolledtext*
ScrolledText~
:platform: Tk
:synopsis: Text widget with a vertical scroll bar.
The ScrolledText (|py2stdlib-scrolledtext|) module provides a class of the same name which
implements a basic text widget which has a vertical scroll bar configured to do
the "right thing." Using the ScrolledText (|py2stdlib-scrolledtext|) class is a lot easier than
setting up a text widget and scroll bar directly. The constructor is the same
as that of the Tkinter.Text class.
.. note::
ScrolledText (|py2stdlib-scrolledtext|) has been renamed to tkinter.scrolledtext in Python
3.0. The 2to3 tool will automatically adapt imports when converting
your sources to 3.0.
The text widget and scrollbar are packed together in a Frame, and the
methods of the Grid and Pack geometry managers are acquired
from the Frame object. This allows the ScrolledText (|py2stdlib-scrolledtext|) widget to
be used directly to achieve most normal geometry management behavior.
Should more specific control be necessary, the following attributes are
available:
ScrolledText.frame~
The frame which surrounds the text and scroll bar widgets.
ScrolledText.vbar~
The scroll bar widget.
==============================================================================
*py2stdlib-select*
select~
:synopsis: Wait for I/O completion on multiple streams.
This module provides access to the select (|py2stdlib-select|) and poll functions
available in most operating systems, epoll available on Linux 2.5+ and
kqueue available on most BSD.
Note that on Windows, it only works for sockets; on other operating systems,
it also works for other file types (in particular, on Unix, it works on pipes).
It cannot be used on regular files to determine whether a file has grown since
it was last read.
The module defines the following:
error~
The exception raised when an error occurs. The accompanying value is a pair
containing the numeric error code from errno (|py2stdlib-errno|) and the corresponding
string, as would be printed by the C function perror.
epoll([sizehint=-1])~
(Only supported on Linux 2.5.44 and newer.) Returns an edge polling object,
which can be used as Edge or Level Triggered interface for I/O events; see
section epoll-objects below for the methods supported by epolling
objects.
.. versionadded:: 2.6
poll()~
(Not supported by all operating systems.) Returns a polling object, which
supports registering and unregistering file descriptors, and then polling them
for I/O events; see section poll-objects below for the methods supported
by polling objects.
kqueue()~
(Only supported on BSD.) Returns a kernel queue object; see section
kqueue-objects below for the methods supported by kqueue objects.
.. versionadded:: 2.6
kevent(ident, filter=KQ_FILTER_READ, flags=KQ_EV_ADD, fflags=0, data=0, udata=0)~
(Only supported on BSD.) Returns a kernel event object; see section
kevent-objects below for the methods supported by kevent objects.
.. versionadded:: 2.6
select(rlist, wlist, xlist[, timeout])~
This is a straightforward interface to the Unix select (|py2stdlib-select|) system call.
The first three arguments are sequences of 'waitable objects': either
integers representing file descriptors or objects with a parameterless method
named fileno returning such an integer:
- {rlist}: wait until ready for reading
- {wlist}: wait until ready for writing
- {xlist}: wait for an "exceptional condition" (see the manual page for what
  your system considers such a condition)
Empty sequences are allowed, but acceptance of three empty sequences is
platform-dependent. (It is known to work on Unix but not on Windows.) The
optional {timeout} argument specifies a time-out as a floating point number
in seconds. When the {timeout} argument is omitted the function blocks until
at least one file descriptor is ready. A time-out value of zero specifies a
poll and never blocks.
The return value is a triple of lists of objects that are ready: subsets of the
first three arguments. When the time-out is reached without a file descriptor
becoming ready, three empty lists are returned.
.. index::
single: socket() (in module socket)
single: popen() (in module os)
Among the acceptable object types in the sequences are Python file objects (e.g.
``sys.stdin``, or objects returned by open or os.popen) and socket
objects returned by socket.socket. You may also define a wrapper
class yourself, as long as it has an appropriate fileno method (that
really returns a file descriptor, not just a random integer).
.. note:: >
.. index:: single: WinSock
File objects on Windows are not acceptable, but sockets are. On Windows,
the underlying select (|py2stdlib-select|) function is provided by the WinSock
library, and does not handle file descriptors that don't originate from
WinSock.
<
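A small sketch of the select call using a connected socket pair, which keeps
the example within the "sockets only" restriction that applies on Windows. The
payload and the timeout values are arbitrary choices for illustration.

```python
import select
import socket

left, right = socket.socketpair()
try:
    # Nothing has been sent yet; a zero timeout polls without blocking.
    readable, writable, exceptional = select.select([right], [], [], 0)
    ready_before = readable

    left.sendall(b"ping")

    # Now wait (up to one second) until the other end becomes readable.
    readable, writable, exceptional = select.select([right], [], [], 1.0)
    data = readable[0].recv(4) if readable else None
finally:
    left.close()
    right.close()
```

The returned lists contain the same objects that were passed in, so
``readable[0]`` is the socket itself rather than a bare descriptor number.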
select.PIPE_BUF~
Files reported as ready for writing by select (|py2stdlib-select|), poll or
similar interfaces in this module are guaranteed to not block on a write
of up to PIPE_BUF bytes.
This value is guaranteed by POSIX to be at least 512. Availability: Unix.
.. versionadded:: 2.7
Edge and Level Trigger Polling (epoll) Objects
----------------------------------------------
http://linux.die.net/man/4/epoll
{eventmask}
+-----------------------+-----------------------------------------------+
| Constant | Meaning |
+=======================+===============================================+
| EPOLLIN | Available for read |
+-----------------------+-----------------------------------------------+
| EPOLLOUT | Available for write |
+-----------------------+-----------------------------------------------+
| EPOLLPRI | Urgent data for read |
+-----------------------+-----------------------------------------------+
| EPOLLERR | Error condition happened on the assoc. fd |
+-----------------------+-----------------------------------------------+
| EPOLLHUP | Hang up happened on the assoc. fd |
+-----------------------+-----------------------------------------------+
| EPOLLET | Set Edge Trigger behavior, the default is |
| | Level Trigger behavior |
+-----------------------+-----------------------------------------------+
| EPOLLONESHOT | Set one-shot behavior. After one event is |
| | pulled out, the fd is internally disabled |
+-----------------------+-----------------------------------------------+
| EPOLLRDNORM | ??? |
+-----------------------+-----------------------------------------------+
| EPOLLRDBAND | ??? |
+-----------------------+-----------------------------------------------+
| EPOLLWRNORM | ??? |
+-----------------------+-----------------------------------------------+
| EPOLLWRBAND | ??? |
+-----------------------+-----------------------------------------------+
| EPOLLMSG | ??? |
+-----------------------+-----------------------------------------------+
epoll.close()~
Close the control file descriptor of the epoll object.
epoll.fileno()~
Return the file descriptor number of the control fd.
epoll.fromfd(fd)~
Create an epoll object from a given file descriptor.
epoll.register(fd[, eventmask])~
Register a file descriptor with the epoll object.
.. note:: >
Registering a file descriptor that's already registered raises an
IOError, unlike the register method of polling objects.
<
epoll.modify(fd, eventmask)~
Modify a registered file descriptor.
epoll.unregister(fd)~
Remove a registered file descriptor from the epoll object.
epoll.poll([timeout=-1[, maxevents=-1]])~
Wait for events. {timeout} is given in seconds (float).
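The following sketch shows the register / poll / unregister cycle on a pipe.
It is Linux-only, so the hypothetical helper ``epoll_demo`` returns None where
epoll is unavailable. The assumption that poll returns a list of
``(fd, eventmask)`` pairs reflects CPython's behaviour and is not spelled out
in the text above.

```python
import os
import select


def epoll_demo():
    """Return (before, after, read_fd), or None where epoll is missing."""
    if not hasattr(select, "epoll"):
        return None
    read_fd, write_fd = os.pipe()
    ep = select.epoll()
    try:
        ep.register(read_fd, select.EPOLLIN)   # level-triggered by default
        before = ep.poll(0)                    # timeout in seconds; 0 = do not block
        os.write(write_fd, b"x")
        after = ep.poll(1.0)                   # assumed: list of (fd, eventmask) pairs
        ep.unregister(read_fd)
        return before, after, read_fd
    finally:
        ep.close()
        os.close(read_fd)
        os.close(write_fd)


outcome = epoll_demo()
```

Note the unit difference from polling objects: the timeout here is in seconds,
while poll.poll takes milliseconds.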
Polling Objects
---------------
The poll system call, supported on most Unix systems, provides better
scalability for network servers that service many, many clients at the same
time. poll scales better because the system call only requires listing
the file descriptors of interest, while select (|py2stdlib-select|) builds a bitmap, turns
on bits for the fds of interest, and then afterward the whole bitmap has to be
linearly scanned again. select (|py2stdlib-select|) is O(highest file descriptor), while
poll is O(number of file descriptors).
poll.register(fd[, eventmask])~
Register a file descriptor with the polling object. Future calls to the
poll method will then check whether the file descriptor has any pending
I/O events. {fd} can be either an integer, or an object with a fileno
method that returns an integer. File objects implement fileno, so they
can also be used as the argument.
{eventmask} is an optional bitmask describing the type of events you want to
check for, and can be a combination of the constants POLLIN,
POLLPRI, and POLLOUT, described in the table below. If not
specified, the default value used will check for all 3 types of events.
+-------------------+------------------------------------------+
| Constant | Meaning |
+===================+==========================================+
| POLLIN | There is data to read |
+-------------------+------------------------------------------+
| POLLPRI | There is urgent data to read |
+-------------------+------------------------------------------+
| POLLOUT | Ready for output: writing will not block |
+-------------------+------------------------------------------+
| POLLERR | Error condition of some sort |
+-------------------+------------------------------------------+
| POLLHUP | Hung up |
+-------------------+------------------------------------------+
| POLLNVAL | Invalid request: descriptor not open |
+-------------------+------------------------------------------+
Registering a file descriptor that's already registered is not an error, and has
the same effect as registering the descriptor exactly once.
poll.modify(fd, eventmask)~
Modifies an already registered fd. This has the same effect as
register(fd, eventmask). Attempting to modify a file descriptor
that was never registered causes an IOError exception with errno
ENOENT to be raised.
.. versionadded:: 2.6
poll.unregister(fd)~
Remove a file descriptor being tracked by a polling object. Just like the
register method, {fd} can be an integer or an object with a
fileno method that returns an integer.
Attempting to remove a file descriptor that was never registered causes a
KeyError exception to be raised.
poll.poll([timeout])~
Polls the set of registered file descriptors, and returns a possibly-empty list
containing ``(fd, event)`` 2-tuples for the descriptors that have events or
errors to report. {fd} is the file descriptor, and {event} is a bitmask with
bits set for the reported events for that descriptor --- POLLIN for
waiting input, POLLOUT to indicate that the descriptor can be written
to, and so forth. An empty list indicates that the call timed out and no file
descriptors had any events to report. If {timeout} is given, it specifies the
length of time in milliseconds which the system will wait for events before
returning. If {timeout} is omitted, negative, or None, the call will
block until there is an event for this poll object.
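The polling object methods above can be tried on a pipe, as in this sketch for
Unix systems that provide poll. The single byte written and the timeout values
are arbitrary. It also shows the KeyError raised when unregistering a
descriptor that is no longer tracked.

```python
import os
import select

read_fd, write_fd = os.pipe()
poller = select.poll()
try:
    poller.register(read_fd, select.POLLIN)

    idle = poller.poll(0)              # timeout in milliseconds; nothing to report yet
    os.write(write_fd, b"x")
    ready = poller.poll(1000)          # wait up to one second

    fd, event = ready[0]
    has_input = bool(event & select.POLLIN)

    poller.unregister(read_fd)
    try:
        poller.unregister(read_fd)     # descriptor is no longer registered
        second_unregister_failed = False
    except KeyError:
        second_unregister_failed = True
finally:
    os.close(read_fd)
    os.close(write_fd)
```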
Kqueue Objects
--------------
kqueue.close()~
Close the control file descriptor of the kqueue object.
kqueue.fileno()~
Return the file descriptor number of the control fd.
kqueue.fromfd(fd)~
Create a kqueue object from a given file descriptor.
kqueue.control(changelist, max_events[, timeout=None]) -> eventlist~
Low level interface to kevent
- changelist must be an iterable of kevent objects or None
- max_events must be 0 or a positive integer
- timeout in seconds (floats possible)
Kevent Objects
--------------
http://www.freebsd.org/cgi/man.cgi?query=kqueue&sektion=2
kevent.ident~
Value used to identify the event. The interpretation depends on the filter
but it's usually the file descriptor. In the constructor ident can either
be an int or an object with a fileno() function. kevent stores the integer
internally.
kevent.filter~
Name of the kernel filter.
+---------------------------+---------------------------------------------+
| Constant | Meaning |
+===========================+=============================================+
| KQ_FILTER_READ | Takes a descriptor and returns whenever |
| | there is data available to read |
+---------------------------+---------------------------------------------+
| KQ_FILTER_WRITE | Takes a descriptor and returns whenever |
| | there is data available to write |
+---------------------------+---------------------------------------------+
| KQ_FILTER_AIO | AIO requests |
+---------------------------+---------------------------------------------+
| KQ_FILTER_VNODE | Returns when one or more of the requested |
| | events watched in {fflag} occurs |
+---------------------------+---------------------------------------------+
| KQ_FILTER_PROC | Watch for events on a process id |
+---------------------------+---------------------------------------------+
| KQ_FILTER_NETDEV | Watch for events on a network device |
| | [not available on Mac OS X] |
+---------------------------+---------------------------------------------+
| KQ_FILTER_SIGNAL | Returns whenever the watched signal is |
| | delivered to the process |
+---------------------------+---------------------------------------------+
| KQ_FILTER_TIMER | Establishes an arbitrary timer |
+---------------------------+---------------------------------------------+
kevent.flags~
Filter action.
+---------------------------+---------------------------------------------+
| Constant | Meaning |
+===========================+=============================================+
| KQ_EV_ADD | Adds or modifies an event |
+---------------------------+---------------------------------------------+
| KQ_EV_DELETE | Removes an event from the queue |
+---------------------------+---------------------------------------------+
| KQ_EV_ENABLE | Permits control() to return the event |
+---------------------------+---------------------------------------------+
| KQ_EV_DISABLE | Disables event |
+---------------------------+---------------------------------------------+
| KQ_EV_ONESHOT | Removes event after first occurrence |
+---------------------------+---------------------------------------------+
| KQ_EV_CLEAR | Reset the state after an event is retrieved |
+---------------------------+---------------------------------------------+
| KQ_EV_SYSFLAGS | internal event |
+---------------------------+---------------------------------------------+
| KQ_EV_FLAG1 | internal event |
+---------------------------+---------------------------------------------+
| KQ_EV_EOF | Filter specific EOF condition |
+---------------------------+---------------------------------------------+
| KQ_EV_ERROR | See return values |
+---------------------------+---------------------------------------------+
kevent.fflags~
Filter specific flags.
KQ_FILTER_READ and KQ_FILTER_WRITE filter flags:
+----------------------------+--------------------------------------------+
| Constant | Meaning |
+============================+============================================+
| KQ_NOTE_LOWAT | low water mark of a socket buffer |
+----------------------------+--------------------------------------------+
KQ_FILTER_VNODE filter flags:
+----------------------------+--------------------------------------------+
| Constant | Meaning |
+============================+============================================+
| KQ_NOTE_DELETE | {unlink()} was called |
+----------------------------+--------------------------------------------+
| KQ_NOTE_WRITE | a write occurred |
+----------------------------+--------------------------------------------+
| KQ_NOTE_EXTEND | the file was extended |
+----------------------------+--------------------------------------------+
| KQ_NOTE_ATTRIB | an attribute was changed |
+----------------------------+--------------------------------------------+
| KQ_NOTE_LINK | the link count has changed |
+----------------------------+--------------------------------------------+
| KQ_NOTE_RENAME | the file was renamed |
+----------------------------+--------------------------------------------+
| KQ_NOTE_REVOKE | access to the file was revoked |
+----------------------------+--------------------------------------------+
KQ_FILTER_PROC filter flags:
+----------------------------+--------------------------------------------+
| Constant | Meaning |
+============================+============================================+
| KQ_NOTE_EXIT | the process has exited |
+----------------------------+--------------------------------------------+
| KQ_NOTE_FORK | the process has called {fork()} |
+----------------------------+--------------------------------------------+
| KQ_NOTE_EXEC | the process has executed a new process |
+----------------------------+--------------------------------------------+
| KQ_NOTE_PCTRLMASK | internal filter flag |
+----------------------------+--------------------------------------------+
| KQ_NOTE_PDATAMASK | internal filter flag |
+----------------------------+--------------------------------------------+
| KQ_NOTE_TRACK | follow a process across {fork()} |
+----------------------------+--------------------------------------------+
| KQ_NOTE_CHILD | returned on the child process for |
| | {NOTE_TRACK} |
+----------------------------+--------------------------------------------+
| KQ_NOTE_TRACKERR | unable to attach to a child |
+----------------------------+--------------------------------------------+
KQ_FILTER_NETDEV filter flags (not available on Mac OS X):
+----------------------------+--------------------------------------------+
| Constant | Meaning |
+============================+============================================+
| KQ_NOTE_LINKUP | link is up |
+----------------------------+--------------------------------------------+
| KQ_NOTE_LINKDOWN | link is down |
+----------------------------+--------------------------------------------+
| KQ_NOTE_LINKINV | link state is invalid |
+----------------------------+--------------------------------------------+
kevent.data~
Filter specific data.
kevent.udata~
User defined value.
==============================================================================
*py2stdlib-sets*
sets~
:synopsis: Implementation of sets of unique elements.
:deprecated:
.. versionadded:: 2.3
2.6~
The built-in ``set``/``frozenset`` types replace this module.
The sets (|py2stdlib-sets|) module provides classes for constructing and manipulating
unordered collections of unique elements. Common uses include membership
testing, removing duplicates from a sequence, and computing standard math
operations on sets such as intersection, union, difference, and symmetric
difference.
Like other collections, sets support ``x in set``, ``len(set)``, and ``for x in
set``. Being an unordered collection, sets do not record element position or
order of insertion. Accordingly, sets do not support indexing, slicing, or
other sequence-like behavior.
Most set applications use the Set class which provides every set method
except for __hash__. For advanced applications requiring a hash method,
the ImmutableSet class adds a __hash__ method but omits methods
which alter the contents of the set. Both Set and ImmutableSet
derive from BaseSet, an abstract class useful for determining whether
something is a set: ``isinstance(obj, BaseSet)``.
The set classes are implemented using dictionaries. Accordingly, the
requirements for set elements are the same as those for dictionary keys; namely,
that the element defines both __eq__ and __hash__. As a result,
sets cannot contain mutable elements such as lists or dictionaries. However,
they can contain immutable collections such as tuples or instances of
ImmutableSet. For convenience in implementing sets of sets, inner sets
are automatically converted to immutable form, for example,
``Set([Set(['dog'])])`` is transformed to ``Set([ImmutableSet(['dog'])])``.
Set([iterable])~
Constructs a new empty Set object. If the optional {iterable}
parameter is supplied, updates the set with elements obtained from iteration.
All of the elements in {iterable} should be immutable or be transformable to an
immutable using the protocol described in section immutable-transforms.
ImmutableSet([iterable])~
Constructs a new empty ImmutableSet object. If the optional {iterable}
parameter is supplied, updates the set with elements obtained from iteration.
All of the elements in {iterable} should be immutable or be transformable to an
immutable using the protocol described in section immutable-transforms.
Because ImmutableSet objects provide a __hash__ method, they
can be used as set elements or as dictionary keys. ImmutableSet
objects do not have methods for adding or removing elements, so all of the
elements must be known when the constructor is called.
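As a minimal sketch of the point above, the snippet below uses an
ImmutableSet as a dictionary key and as an element of another set. The fallback
to the built-in set and frozenset types is an assumption made so that the
sketch also runs on interpreters where the sets module is no longer available;
the names vowels, labels and groups are illustrative only.

```python
try:
    from sets import Set, ImmutableSet      # available on Python 2 only
except ImportError:
    Set, ImmutableSet = set, frozenset      # assumption: built-in stand-ins

vowels = ImmutableSet('aeiou')
labels = {vowels: 'vowels'}                 # hashable, so usable as a dict key
groups = Set([vowels, ImmutableSet('xyz')]) # and as an element of another set

try:
    hash(Set('abc'))                        # a mutable set refuses to be hashed
    mutable_is_hashable = True
except TypeError:
    mutable_is_hashable = False
```

Element order does not matter for equality or hashing, so an ImmutableSet built
from the same letters in a different order finds the same dictionary entry.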
Set Objects
-----------
Instances of Set and ImmutableSet both provide the following
operations:
+-------------------------------+------------+---------------------------------+
| Operation | Equivalent | Result |
+===============================+============+=================================+
| ``len(s)`` | | cardinality of set {s} |
+-------------------------------+------------+---------------------------------+
| ``x in s`` | | test {x} for membership in {s} |
+-------------------------------+------------+---------------------------------+
| ``x not in s`` | | test {x} for non-membership in |
| | | {s} |
+-------------------------------+------------+---------------------------------+
| ``s.issubset(t)`` | ``s <= t`` | test whether every element in |
| | | {s} is in {t} |
+-------------------------------+------------+---------------------------------+
| ``s.issuperset(t)`` | ``s >= t`` | test whether every element in |
| | | {t} is in {s} |
+-------------------------------+------------+---------------------------------+
| ``s.union(t)`` | ``s | t`` | new set with elements from both |
| | | {s} and {t} |
+-------------------------------+------------+---------------------------------+
| ``s.intersection(t)`` | ``s & t`` | new set with elements common to |
| | | {s} and {t} |
+-------------------------------+------------+---------------------------------+
| ``s.difference(t)`` | ``s - t`` | new set with elements in {s} |
| | | but not in {t} |
+-------------------------------+------------+---------------------------------+
| ``s.symmetric_difference(t)`` | ``s ^ t`` | new set with elements in either |
| | | {s} or {t} but not both |
+-------------------------------+------------+---------------------------------+
| ``s.copy()`` | | new set with a shallow copy of |
| | | {s} |
+-------------------------------+------------+---------------------------------+
Note, the non-operator versions of union, intersection,
difference, and symmetric_difference will accept any iterable as
an argument. In contrast, their operator based counterparts require their
arguments to be sets. This precludes error-prone constructions like
``Set('abc') & 'cbs'`` in favor of the more readable
``Set('abc').intersection('cbs')``.
.. versionchanged:: 2.3.1
Formerly all arguments were required to be sets.
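The difference between the method and operator forms can be sketched as
follows. The fallback to the built-in set type is an assumption for
interpreters without the sets module; the built-in type follows the same rule.

```python
try:
    from sets import Set                 # available on Python 2 only
except ImportError:
    Set = set                            # assumption: built-in stand-in

letters = Set('abc')
common = letters.intersection('cbs')     # method form: any iterable is accepted

try:
    letters & 'cbs'                      # operator form: operands must be sets
    operator_accepted_string = True
except TypeError:
    operator_accepted_string = False

both_sets = letters & Set('cbs')         # fine once both operands are sets
```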
In addition, both Set and ImmutableSet support set to set
comparisons. Two sets are equal if and only if every element of each set is
contained in the other (each is a subset of the other). A set is less than
another set if and only if the first set is a proper subset of the second set
(is a subset, but is not equal). A set is greater than another set if and only
if the first set is a proper superset of the second set (is a superset, but is
not equal).
The subset and equality comparisons do not generalize to a complete ordering
function. For example, any two disjoint sets are not equal and are not subsets
of each other, so {all} of the following return ``False``: ``a<b``, ``a==b``,
and ``a>b``. Accordingly, sets do not implement the __cmp__ method.
Since sets only define partial ordering (subset relationships), the output of
the list.sort method is undefined for lists of sets.
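A short sketch of the partial ordering described above, using two disjoint
sets. The fallback to the built-in set type is an assumption for interpreters
without the sets module.

```python
try:
    from sets import Set                 # available on Python 2 only
except ImportError:
    Set = set                            # assumption: built-in stand-in

a = Set([1, 2])
b = Set([3, 4])                          # disjoint from a

# None of the three comparisons holds for disjoint sets.
disjoint_results = [a < b, a == b, a > b]

# A proper subset does compare as "less than".
proper_subset = Set([1]) < a
not_proper = a < a                       # equal sets are not proper subsets
```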
The following table lists operations available in ImmutableSet but not
found in Set:
+-------------+------------------------------+
| Operation | Result |
+=============+==============================+
| ``hash(s)`` | returns a hash value for {s} |
+-------------+------------------------------+
The following table lists operations available in Set but not found in
ImmutableSet:
+--------------------------------------+-------------+---------------------------------+
| Operation | Equivalent | Result |
+======================================+=============+=================================+
| ``s.update(t)`` | {s} \|= {t} | return set {s} with elements |
| | | added from {t} |
+--------------------------------------+-------------+---------------------------------+
| ``s.intersection_update(t)`` | {s} &= {t} | return set {s} keeping only |
| | | elements also found in {t} |
+--------------------------------------+-------------+---------------------------------+
| ``s.difference_update(t)`` | {s} -= {t} | return set {s} after removing |
| | | elements found in {t} |
+--------------------------------------+-------------+---------------------------------+
| ``s.symmetric_difference_update(t)`` | {s} ^= {t} | return set {s} with elements |
| | | from {s} or {t} but not both |
+--------------------------------------+-------------+---------------------------------+
| ``s.add(x)`` | | add element {x} to set {s} |
+--------------------------------------+-------------+---------------------------------+
| ``s.remove(x)`` | | remove {x} from set {s}; raises |
| | | KeyError if not present |
+--------------------------------------+-------------+---------------------------------+
| ``s.discard(x)`` | | removes {x} from set {s} if |
| | | present |
+--------------------------------------+-------------+---------------------------------+
| ``s.pop()`` | | remove and return an arbitrary |
| | | element from {s}; raises |
| | | KeyError if empty |
+--------------------------------------+-------------+---------------------------------+
| ``s.clear()`` | | remove all elements from set |
| | | {s} |
+--------------------------------------+-------------+---------------------------------+
Note, the non-operator versions of update, intersection_update,
difference_update, and symmetric_difference_update will accept
any iterable as an argument.
.. versionchanged:: 2.3.1
Formerly all arguments were required to be sets.
Note that the module also includes a union_update method, which is an
alias for update. The method is included for backwards compatibility.
Programmers should prefer the update method because it is supported by
the built-in set() and frozenset() types.
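The mutating operations can be sketched as below, including the difference
between remove and discard for a missing element. The fallback to the built-in
set type is an assumption for interpreters without the sets module.

```python
try:
    from sets import Set                 # available on Python 2 only
except ImportError:
    Set = set                            # assumption: built-in stand-in

s = Set([1, 2, 3])
s.update([3, 4])                         # method form accepts any iterable
s |= Set([5])                            # operator form needs a set

s.discard(99)                            # missing element: silently ignored
try:
    s.remove(99)                         # missing element: KeyError
    remove_raised = False
except KeyError:
    remove_raised = True

popped = s.pop()                         # some arbitrary element of s
```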
Example
-------
>>> from sets import Set
>>> engineers = Set(['John', 'Jane', 'Jack', 'Janice'])
>>> programmers = Set(['Jack', 'Sam', 'Susan', 'Janice'])
>>> managers = Set(['Jane', 'Jack', 'Susan', 'Zack'])
>>> employees = engineers | programmers | managers # union
>>> engineering_management = engineers & managers # intersection
>>> fulltime_management = managers - engineers - programmers # difference
>>> engineers.add('Marvin') # add element
>>> print engineers # doctest: +SKIP
Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack'])
>>> employees.issuperset(engineers) # superset test
False
>>> employees.update(engineers) # update from another set
>>> employees.issuperset(engineers)
True
>>> for group in [engineers, programmers, managers, employees]: # doctest: +SKIP
... group.discard('Susan') # unconditionally remove element
... print group
...
Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack'])
Set(['Janice', 'Jack', 'Sam'])
Set(['Jane', 'Zack', 'Jack'])
Set(['Jack', 'Sam', 'Jane', 'Marvin', 'Janice', 'John', 'Zack'])
Protocol for automatic conversion to immutable
----------------------------------------------
Sets can only contain immutable elements. For convenience, mutable Set
objects are automatically copied to an ImmutableSet before being added
as a set element.
A hashable element is always added as it is. If an element is not hashable,
it is checked for an __as_immutable__ method, which returns an immutable
equivalent that is added in its place.
Since Set objects have a __as_immutable__ method returning an
instance of ImmutableSet, it is possible to construct sets of sets.
A similar mechanism is needed by the __contains__ and remove
methods which need to hash an element to check for membership in a set. Those
methods check an element for hashability and, if not, check for a
__as_temporarily_immutable__ method which returns the element wrapped by
a class that provides temporary methods for __hash__, __eq__,
and __ne__.
The alternate mechanism spares the need to build a separate copy of the original
mutable object.
Set objects implement the __as_temporarily_immutable__ method
which returns the Set object wrapped by a new class
_TemporarilyImmutableSet.
The two mechanisms for adding hashability are normally invisible to the user;
however, a conflict can arise in a multi-threaded environment where one thread
is updating a set while another has temporarily wrapped it in
_TemporarilyImmutableSet. In other words, sets of mutable sets are not
thread-safe.
Comparison to the built-in set types
------------------------------------
The built-in set and frozenset types were designed based on
lessons learned from the sets (|py2stdlib-sets|) module. The key differences are:
* Set and ImmutableSet were renamed to set and
frozenset.
* There is no equivalent to BaseSet. Instead, use ``isinstance(x,
(set, frozenset))``.
* The hash algorithm for the built-ins performs significantly better (fewer
collisions) for most datasets.
* The built-in versions have more space efficient pickles.
* The built-in versions do not have a union_update method. Instead, use
the update method which is equivalent.
* The built-in versions do not have a ``_repr(sorted=True)`` method.
Instead, use the built-in repr (|py2stdlib-repr|) and sorted functions:
``repr(sorted(s))``.
* The built-in version does not have a protocol for automatic conversion to
immutable. Many found this feature to be confusing and no one in the community
reported having found real uses for it.
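The replacements named in the list above can be sketched with the built-in
types alone. The variable names are illustrative.

```python
s = set(['b', 'a', 'c'])
fs = frozenset(s)

# Stand-in for isinstance(obj, BaseSet) when using the built-in types.
checks = [isinstance(x, (set, frozenset)) for x in (s, fs, ['a'])]

# Stand-in for the sorted representation.
text = repr(sorted(s))

# update takes the place of union_update.
s.update(['d'])
```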
==============================================================================
*py2stdlib-sgmllib*
sgmllib~
:synopsis: Only as much of an SGML parser as needed to parse HTML.
:deprecated:
2.6~
The sgmllib (|py2stdlib-sgmllib|) module has been removed in Python 3.0.
.. index:: single: SGML
This module defines a class SGMLParser which serves as the basis for
parsing text files formatted in SGML (Standard Generalized Mark-up Language).
In fact, it does not provide a full SGML parser --- it only parses SGML insofar
as it is used by HTML, and the module only exists as a base for the
htmllib (|py2stdlib-htmllib|) module. Another HTML parser which supports XHTML and offers a
somewhat different interface is available in the HTMLParser (|py2stdlib-htmlparser|) module.
SGMLParser()~
The SGMLParser class is instantiated without arguments. The parser is
hardcoded to recognize the following constructs:
* Opening and closing tags of the form ``<tag attr="value" ...>`` and
``</tag>``, respectively.
* Numeric character references of the form ``&#name;``.
* Entity references of the form ``&name;``.
* SGML comments of the form ``<!--text-->``. Note that spaces, tabs, and
newlines are allowed between the trailing ``>`` and the immediately preceding
``--``.
A single exception is defined as well:
SGMLParseError~
Exception raised by the SGMLParser class when it encounters an error
while parsing.
.. versionadded:: 2.1
SGMLParser instances have the following methods:
SGMLParser.reset()~
Reset the instance. Loses all unprocessed data. This is called implicitly at
instantiation time.
SGMLParser.setnomoretags()~
Stop processing tags. Treat all following input as literal input (CDATA).
(This is only provided so the HTML tag ``<PLAINTEXT>`` can be implemented.)
SGMLParser.setliteral()~
Enter literal mode (CDATA mode).
SGMLParser.feed(data)~
Feed some text to the parser. It is processed insofar as it consists of
complete elements; incomplete data is buffered until more data is fed or
close is called.
SGMLParser.close()~
Force processing of all buffered data as if it were followed by an end-of-file
mark. This method may be redefined by a derived class to define additional
processing at the end of the input, but the redefined version should always call
the base class close.
SGMLParser.get_starttag_text()~
Return the text of the most recently opened start tag. This should not normally
be needed for structured processing, but may be useful in dealing with HTML "as
deployed" or for re-generating input with minimal changes (whitespace between
attributes can be preserved, etc.).
SGMLParser.handle_starttag(tag, method, attributes)~
This method is called to handle start tags for which either a start_tag
or do_tag method has been defined. The {tag} argument is the name of
the tag converted to lower case, and the {method} argument is the bound method
which should be used to support semantic interpretation of the start tag. The
{attributes} argument is a list of ``(name, value)`` pairs containing the
attributes found inside the tag's ``<>`` brackets.
The {name} has been translated to lower case. Double quotes and backslashes in
the {value} have been interpreted, as well as known character references and
known entity references terminated by a semicolon (normally, entity references
can be terminated by any non-alphanumerical character, but this would break the
very common case of ``<A HREF="url?spam=1&eggs=2">`` when ``eggs`` is a valid
entity name).
For instance, for the tag ``<A HREF="http://www.cwi.nl/">``, this method would
be called as ``handle_starttag('a', method, [('href', 'http://www.cwi.nl/')])``. The
base implementation simply calls {method} with {attributes} as the only
argument.
.. versionadded:: 2.5
Handling of entity and character references within attribute values.
SGMLParser.handle_endtag(tag, method)~
This method is called to handle endtags for which an end_tag method has
been defined. The {tag} argument is the name of the tag converted to lower
case, and the {method} argument is the bound method which should be used to
support semantic interpretation of the end tag. If no end_tag method is
defined for the closing element, this handler is not called. The base
implementation simply calls {method}.
SGMLParser.handle_data(data)~
This method is called to process arbitrary data. It is intended to be
overridden by a derived class; the base class implementation does nothing.
SGMLParser.handle_charref(ref)~
This method is called to process a character reference of the form ``&#ref;``.
The base implementation uses convert_charref to convert the reference to
a string. If that method returns a string, it is passed to handle_data,
otherwise ``unknown_charref(ref)`` is called to handle the error.
.. versionchanged:: 2.5
Use convert_charref instead of hard-coding the conversion.
SGMLParser.convert_charref(ref)~
Convert a character reference to a string, or ``None``. {ref} is the reference
passed in as a string. In the base implementation, {ref} must be a decimal
number in the range 0-255. It converts the code point found using the
convert_codepoint method. If {ref} is invalid or out of range, this
method returns ``None``. This method is called by the default
handle_charref implementation and by the attribute value parser.
.. versionadded:: 2.5
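The conversion rule described for convert_charref can be sketched as a
standalone function. This is not the module's own code: the function name is
hypothetical, the accepted range is taken from the text above, and the built-in
chr stands in for convert_codepoint.

```python
def convert_charref_sketch(ref):
    """Hypothetical helper mirroring the rule described in the text."""
    try:
        n = int(ref)                 # the reference must be a decimal number
    except ValueError:
        return None                  # not a number at all
    if not 0 <= n <= 255:
        return None                  # outside the range stated in the text
    return chr(n)                    # stand-in for convert_codepoint(n)


letter = convert_charref_sketch('65')
too_big = convert_charref_sketch('256')
not_decimal = convert_charref_sketch('x41')
```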
SGMLParser.convert_codepoint(codepoint)~
Convert a codepoint to a str value. Encodings can be handled here if
appropriate, though the rest of sgmllib (|py2stdlib-sgmllib|) is oblivious on this matter.
.. versionadded:: 2.5
SGMLParser.handle_entityref(ref)~
This method is called to process a general entity reference of the form
``&ref;`` where {ref} is a general entity reference. It converts {ref} by
passing it to convert_entityref. If a translation is returned, it calls
the method handle_data with the translation; otherwise, it calls the
method ``unknown_entityref(ref)``. The default entitydefs defines
translations for ``&amp;``, ``&apos;``, ``&gt;``, ``&lt;``, and ``&quot;``.
.. versionchanged:: 2.5
Use convert_entityref instead of hard-coding the conversion.
SGMLParser.convert_entityref(ref)~
Convert a named entity reference to a str value, or ``None``. The
resulting value will not be parsed. {ref} will be only the name of the entity.
The default implementation looks for {ref} in the instance (or class) variable
entitydefs which should be a mapping from entity names to corresponding
translations. If no translation is available for {ref}, this method returns
``None``. This method is called by the default handle_entityref
implementation and by the attribute value parser.
.. versionadded:: 2.5
SGMLParser.handle_comment(comment)~
This method is called when a comment is encountered. The {comment} argument is
a string containing the text between the ``<!--`` and ``-->`` delimiters, but
not the delimiters themselves. For example, the comment ``<!--text-->`` will
cause this method to be called with the argument ``'text'``. The default method
does nothing.
SGMLParser.handle_decl(data)~
Method called when an SGML declaration is read by the parser. In practice, the
``DOCTYPE`` declaration is the only thing observed in HTML, but the parser does
not discriminate among different (or broken) declarations. Internal subsets in
a ``DOCTYPE`` declaration are not supported. The {data} parameter will be the
entire contents of the declaration inside the ``<!``...\ ``>`` markup. The
default implementation does nothing.
SGMLParser.report_unbalanced(tag)~
This method is called when an end tag is found which does not correspond to any
open element.
SGMLParser.unknown_starttag(tag, attributes)~
This method is called to process an unknown start tag. It is intended to be
overridden by a derived class; the base class implementation does nothing.
SGMLParser.unknown_endtag(tag)~
This method is called to process an unknown end tag. It is intended to be
overridden by a derived class; the base class implementation does nothing.
SGMLParser.unknown_charref(ref)~
This method is called to process unresolvable numeric character references.
Refer to handle_charref to determine what is handled by default. It is
intended to be overridden by a derived class; the base class implementation does
nothing.
SGMLParser.unknown_entityref(ref)~
This method is called to process an unknown entity reference. It is intended to
be overridden by a derived class; the base class implementation does nothing.
Apart from overriding or extending the methods listed above, derived classes may
also define methods of the following form to define processing of specific tags.
Tag names in the input stream are case independent; the {tag} occurring in
method names must be in lower case:
SGMLParser.start_tag(attributes)~
This method is called to process an opening tag {tag}. It has preference over
do_tag. The {attributes} argument has the same meaning as described for
handle_starttag above.
SGMLParser.do_tag(attributes)~
This method is called to process an opening tag {tag} for which no
start_tag method is defined. The {attributes} argument has the same
meaning as described for handle_starttag above.
SGMLParser.end_tag()~
This method is called to process a closing tag {tag}.
Note that the parser maintains a stack of open elements for which no end tag has
been found yet. Only tags processed by start_tag are pushed on this
stack. Definition of an end_tag method is optional for these tags. For
tags processed by do_tag or by unknown_starttag, an end_tag method need
not be defined; if one is defined, it will not be used. If both
start_tag and do_tag methods exist for a tag, the
start_tag method takes precedence.
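The lookup order for tag handlers described above can be illustrated with a
small standalone class. This is only a sketch of the dispatch rule, not the
parser itself: TinyDispatcher, Demo, open_tag and the log attribute are
hypothetical names, and no real parsing takes place.

```python
class TinyDispatcher(object):
    """Hypothetical stand-in that mimics only the handler lookup order."""

    def __init__(self):
        self.stack = []              # open elements awaiting an end tag
        self.log = []                # record of which handler ran

    def open_tag(self, tag, attributes):
        tag = tag.lower()            # tag names are case independent
        start = getattr(self, 'start_' + tag, None)
        if start is not None:        # start_<tag> wins over do_<tag>
            self.stack.append(tag)   # only these tags are pushed
            start(attributes)
            return
        do = getattr(self, 'do_' + tag, None)
        if do is not None:           # do_<tag>: nothing is pushed
            do(attributes)
            return
        self.unknown_starttag(tag, attributes)

    def unknown_starttag(self, tag, attributes):
        self.log.append(('unknown', tag))


class Demo(TinyDispatcher):
    def start_a(self, attributes):
        self.log.append(('start_a', attributes))

    def do_a(self, attributes):      # never reached because start_a exists
        self.log.append(('do_a', attributes))

    def do_br(self, attributes):
        self.log.append(('do_br', attributes))


demo = Demo()
demo.open_tag('A', [('href', 'http://www.cwi.nl/')])
demo.open_tag('BR', [])
demo.open_tag('blink', [])
```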
==============================================================================
*py2stdlib-sha*
sha~
:synopsis: NIST's secure hash algorithm, SHA.
:deprecated:
2.5~
Use the hashlib (|py2stdlib-hashlib|) module instead.
.. index::
single: NIST
single: Secure Hash Algorithm
single: checksum; SHA
This module implements the interface to NIST's secure hash algorithm, known as
SHA-1. SHA-1 is an improved version of the original SHA hash algorithm. It is
used in the same way as the md5 (|py2stdlib-md5|) module: use new to create an sha
object, then feed this object with arbitrary strings using the update
method, and at any point you can ask it for the digest of the
concatenation of the strings fed to it so far. SHA-1 digests are 160 bits
instead of MD5's 128 bits.
new([string])~
Return a new sha object. If {string} is present, the method call
``update(string)`` is made.
The following values are provided as constants in the module and as attributes
of the sha objects returned by new:
blocksize~
Size of the blocks fed into the hash function; this is always ``1``. This size
is used to allow an arbitrary string to be hashed.
digest_size~
The size of the resulting digest in bytes. This is always ``20``.
An sha object has the same methods as md5 objects:
sha.update(arg)~
Update the sha object with the string {arg}. Repeated calls are equivalent to a
single call with the concatenation of all the arguments: ``m.update(a);
m.update(b)`` is equivalent to ``m.update(a+b)``.
sha.digest()~
Return the digest of the strings passed to the update method so far.
This is a 20-byte string which may contain non-ASCII characters, including null
bytes.
sha.hexdigest()~
Like digest except the digest is returned as a string of length 40,
containing only hexadecimal digits. This may be used to exchange the value
safely in email or other non-binary environments.
sha.copy()~
Return a copy ("clone") of the sha object. This can be used to efficiently
compute the digests of strings that share a common initial substring.
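The methods above can be sketched with hashlib.sha1, which the deprecation
note names as the replacement for this module. Using hashlib rather than sha
itself is a choice made so that the sketch also runs where the sha module is
no longer available; the byte strings are illustrative.

```python
import hashlib

# Repeated update calls are equivalent to one call on the concatenation.
m = hashlib.sha1()
m.update(b"Nobody inspects")
m.update(b" the spammish repetition")
one_shot = hashlib.sha1(b"Nobody inspects the spammish repetition")

# copy() lets several digests share the work done on a common prefix.
prefix = hashlib.sha1(b"common prefix: ")
first = prefix.copy()
first.update(b"alpha")
second = prefix.copy()
second.update(b"beta")
```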
.. seealso::
`Secure Hash Standard <http://csrc.nist.gov/publications/fips/fips180-2/fips180-2withchangenotice.pdf>`_
The Secure Hash Algorithm is defined by NIST document FIPS PUB 180-2: `Secure
Hash Standard
<http://csrc.nist.gov/publications/fips/fips180-2/fips180-2withchangenotice.pdf>`_,
published in August 2002.
`Cryptographic Toolkit (Secure Hashing) <http://csrc.nist.gov/CryptoToolkit/tkhash.html>`_
Links from NIST to various information on secure hashing.
==============================================================================
*py2stdlib-shelve*
shelve~
:synopsis: Python object persistence.
.. index:: module: pickle
A "shelf" is a persistent, dictionary-like object. The difference with "dbm"
databases is that the values (not the keys!) in a shelf can be essentially
arbitrary Python objects --- anything that the pickle (|py2stdlib-pickle|) module can handle.
This includes most class instances, recursive data types, and objects containing
lots of shared sub-objects. The keys are ordinary strings.
open(filename[, flag='c'[, protocol=None[, writeback=False]]])~
Open a persistent dictionary. The filename specified is the base filename for
the underlying database. As a side-effect, an extension may be added to the
filename and more than one file may be created. By default, the underlying
database file is opened for reading and writing. The optional {flag} parameter
has the same interpretation as the {flag} parameter of anydbm.open.
By default, version 0 pickles are used to serialize values. The version of the
pickle protocol can be specified with the {protocol} parameter.
.. versionchanged:: 2.3
The {protocol} parameter was added.
Because of Python semantics, a shelf cannot know when a mutable
persistent-dictionary entry is modified. By default modified objects are
written {only} when assigned to the shelf (see shelve-example). If the
optional {writeback} parameter is set to {True}, all entries accessed are also
cached in memory, and written back on Shelf.sync and
Shelf.close; this can make it handier to mutate mutable entries in
the persistent dictionary, but, if many entries are accessed, it can consume
vast amounts of memory for the cache, and it can make the close operation
very slow since all accessed entries are written back (there is no way to
determine which accessed entries are mutable, nor which ones were actually
mutated).
.. note:: >
Do not rely on the shelf being closed automatically; always call
close explicitly when you don't need it any more, or use a
with statement with contextlib.closing.
<
Shelf objects support all methods supported by dictionaries. This eases the
transition from dictionary based scripts to those requiring persistent storage.
Two additional methods are supported:
Shelf.sync()~
Write back all entries in the cache if the shelf was opened with {writeback}
set to True. Also empty the cache and synchronize the persistent
dictionary on disk, if feasible. This is called automatically when the shelf
is closed with close.
Shelf.close()~
Synchronize and close the persistent {dict} object. Operations on a closed
shelf will fail with a ValueError.
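One way to make sure close is always called is the contextlib.closing
pattern mentioned in the note above. The sketch below writes to a temporary
directory; the directory, the file name demo_shelf and the stored value are
illustrative assumptions.

```python
import contextlib
import os
import shelve
import shutil
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'demo_shelf')   # an extension may be added on disk

# closing() calls shelf.close() on exit, even if the body raises.
with contextlib.closing(shelve.open(path)) as shelf:
    shelf['answer'] = {'value': 42}

with contextlib.closing(shelve.open(path)) as shelf:
    stored = shelf['answer']

# The shelf is closed at this point, so further access is rejected.
try:
    shelf['answer']
    closed_access_failed = False
except ValueError:
    closed_access_failed = True

shutil.rmtree(workdir)                       # clean up the temporary files
```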
.. seealso::
`Persistent dictionary recipe <http://code.activestate.com/recipes/576642/>`_
with widely supported storage formats and having the speed of native
dictionaries.
Restrictions
------------
.. index::
module: dbm
module: gdbm
module: bsddb
* The choice of which database package will be used (such as dbm (|py2stdlib-dbm|),
gdbm (|py2stdlib-gdbm|) or bsddb (|py2stdlib-bsddb|)) depends on which interface is available. Therefore
it is not safe to open the database directly using dbm (|py2stdlib-dbm|). The database is
also (unfortunately) subject to the limitations of dbm (|py2stdlib-dbm|), if it is used ---
this means that (the pickled representation of) the objects stored in the
database should be fairly small, and in rare cases key collisions may cause the
database to refuse updates.
* The shelve (|py2stdlib-shelve|) module does not support {concurrent} read/write access to
shelved objects. (Multiple simultaneous read accesses are safe.) When a
program has a shelf open for writing, no other program should have it open for
reading or writing. Unix file locking can be used to solve this, but this
differs across Unix versions and requires knowledge about the database
implementation used.
Shelf(dict[, protocol=None[, writeback=False]])~
A subclass of UserDict.DictMixin which stores pickled values in the
{dict} object.
By default, version 0 pickles are used to serialize values. The version of the
pickle protocol can be specified with the {protocol} parameter. See the
pickle (|py2stdlib-pickle|) documentation for a discussion of the pickle protocols.
.. versionchanged:: 2.3
The {protocol} parameter was added.
If the {writeback} parameter is ``True``, the object will hold a cache of all
entries accessed and write them back to the {dict} at sync and close times.
This allows natural operations on mutable entries, but can consume much more
memory and make sync and close take a long time.
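The effect of {writeback} can be sketched without touching the disk by
wrapping an ordinary dictionary in a Shelf. Using a plain dict as the backing
store is a convenience for this illustration only.

```python
import shelve

plain = shelve.Shelf({})                 # writeback defaults to False
plain['xx'] = [0, 1, 2, 3]
plain['xx'].append(5)                    # mutates a freshly unpickled copy only
without_writeback = plain['xx']

cached = shelve.Shelf({}, writeback=True)
cached['xx'] = [0, 1, 2, 3]
cached['xx'].append(5)                   # mutates the object held in the cache
cached.sync()                            # writes cached entries back, empties cache
with_writeback = cached['xx']

plain.close()
cached.close()
```

The price of the second form is that every accessed entry stays in memory and
is written back on sync and close, whether or not it was changed.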
BsdDbShelf(dict[, protocol=None[, writeback=False]])~
A subclass of Shelf which exposes first, next,
previous, last and set_location which are available in
the bsddb (|py2stdlib-bsddb|) module but not in other database modules. The {dict} object
passed to the constructor must support those methods. This is generally
accomplished by calling one of bsddb.hashopen, bsddb.btopen or
bsddb.rnopen. The optional {protocol} and {writeback} parameters have
the same interpretation as for the Shelf class.
DbfilenameShelf(filename[, flag='c'[, protocol=None[, writeback=False]]])~
A subclass of Shelf which accepts a {filename} instead of a dict-like
object. The underlying file will be opened using anydbm.open. By
default, the file will be created and opened for both read and write. The
optional {flag} parameter has the same interpretation as for the open
function. The optional {protocol} and {writeback} parameters have the same
interpretation as for the Shelf class.
Example
-------
To summarize the interface (``key`` is a string, ``data`` is an arbitrary
object):: >
import shelve
d = shelve.open(filename) # open -- file may get suffix added by low-level
# library
d[key] = data # store data at key (overwrites old data if
# using an existing key)
data = d[key] # retrieve a COPY of data at key (raise KeyError if no
# such key)
del d[key] # delete data stored at key (raises KeyError
# if no such key)
flag = d.has_key(key) # true if the key exists
klist = d.keys() # a list of all existing keys (slow!)
# as d was opened WITHOUT writeback=True, beware:
d['xx'] = range(4) # this works as expected, but...
d['xx'].append(5) # this doesn't! -- d['xx'] is STILL range(4)!
# having opened d without writeback=True, you need to code carefully:
temp = d['xx'] # extracts the copy
temp.append(5) # mutates the copy
d['xx'] = temp # stores the copy right back, to persist it
# or, d=shelve.open(filename,writeback=True) would let you just code
# d['xx'].append(5) and have it work as expected, BUT it would also
# consume more memory and make the d.close() operation slower.
d.close() # close it
<
.. seealso::
Module anydbm (|py2stdlib-anydbm|)
Generic interface to ``dbm``\ -style databases.
Module bsddb (|py2stdlib-bsddb|)
BSD ``db`` database interface.
Module dbhash (|py2stdlib-dbhash|)
Thin layer around the bsddb (|py2stdlib-bsddb|) module which provides a dbhash.open
function like the other database modules.
Module dbm (|py2stdlib-dbm|)
Standard Unix database interface.
Module dumbdbm (|py2stdlib-dumbdbm|)
Portable implementation of the ``dbm`` interface.
Module gdbm (|py2stdlib-gdbm|)
GNU database interface, based on the ``dbm`` interface.
Module pickle (|py2stdlib-pickle|)
Object serialization used by shelve (|py2stdlib-shelve|).
Module cPickle (|py2stdlib-cpickle|)
High-performance version of pickle (|py2stdlib-pickle|).
==============================================================================
*py2stdlib-shlex*
shlex~
:synopsis: Simple lexical analysis for Unix shell-like languages.
.. versionadded:: 1.5.2
The shlex (|py2stdlib-shlex|) class makes it easy to write lexical analyzers for simple
syntaxes resembling that of the Unix shell. This will often be useful for
writing minilanguages (for example, in run control files for Python
applications) or for parsing quoted strings.
.. note::
The shlex (|py2stdlib-shlex|) module currently does not support Unicode input.
The shlex (|py2stdlib-shlex|) module defines the following functions:
split(s[, comments[, posix]])~
Split the string {s} using shell-like syntax. If {comments} is False
(the default), the parsing of comments in the given string will be disabled
(setting the commenters member of the shlex (|py2stdlib-shlex|) instance to the
empty string). This function operates in POSIX mode by default, but uses
non-POSIX mode if the {posix} argument is false.
.. versionadded:: 2.3
.. versionchanged:: 2.6
Added the {posix} parameter.
.. note::
Since the split function instantiates a shlex (|py2stdlib-shlex|) instance, passing
``None`` for {s} will read the string to split from standard input.
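As a minimal sketch of the three arguments of split, the calls below tokenize
made-up command lines (the strings are illustrative only). The sketch is
written to run on Python 2.7 and on later versions.

```python
import shlex

# POSIX mode (the default): quotes group words and are stripped.
args = shlex.split('cp -v "my file.txt" /tmp/backup')

# comments=True enables '#' comment handling, so the trailing note is dropped.
args_no_comment = shlex.split('ls -l /tmp  # list files', comments=True)

# posix=False keeps the quote characters as part of the token.
raw_args = shlex.split('echo "hello world"', posix=False)
```

Here args is ``['cp', '-v', 'my file.txt', '/tmp/backup']``, while raw_args
keeps the quotes: ``['echo', '"hello world"']``.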
The shlex (|py2stdlib-shlex|) module defines the following class:
shlex([instream[, infile[, posix]]])~
A shlex (|py2stdlib-shlex|) instance or subclass instance is a lexical analyzer object.
The initialization argument, if present, specifies where to read characters
from. It must be a file-/stream-like object with read and
readline methods, or a string (strings are accepted since Python 2.3).
If no argument is given, input will be taken from ``sys.stdin``. The second
optional argument is a filename string, which sets the initial value of the
infile member. If the {instream} argument is omitted or equal to
``sys.stdin``, this second argument defaults to "stdin". The {posix} argument
was introduced in Python 2.3, and defines the operational mode. When {posix} is
not true (default), the shlex (|py2stdlib-shlex|) instance will operate in compatibility
mode. When operating in POSIX mode, shlex (|py2stdlib-shlex|) will try to be as close as
possible to the POSIX shell parsing rules.
.. seealso::
Module ConfigParser (|py2stdlib-configparser|)
Parser for configuration files similar to the Windows .ini files.
shlex Objects
-------------
A shlex (|py2stdlib-shlex|) instance has the following methods:
shlex.get_token()~
Return a token. If tokens have been stacked using push_token, pop a
token off the stack. Otherwise, read one from the input stream. If reading
encounters an immediate end-of-file, self.eof is returned (the empty
string (``''``) in non-POSIX mode, and ``None`` in POSIX mode).
shlex.push_token(str)~
Push the argument onto the token stack.
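The interplay of get_token and push_token can be sketched as follows. The
input string is made up, and the lexer is left in its default non-POSIX mode,
so the end-of-file token is the empty string.

```python
import shlex

lexer = shlex.shlex('name = "John Smith"')   # non-POSIX mode by default

first = lexer.get_token()        # read from the input stream
lexer.push_token(first)          # put it back on the token stack
again = lexer.get_token()        # popped from the stack, not re-read

rest = []
token = lexer.get_token()
while token != lexer.eof:        # eof is '' in non-POSIX mode
    rest.append(token)
    token = lexer.get_token()
```

Because the lexer runs in non-POSIX mode, the quoted value is returned with
its quotes as the single token ``'"John Smith"'``.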
shlex.read_token()~
Read a raw token. Ignore the pushback stack, and do not interpret source
requests. (This is not ordinarily a useful entry point, and is documented here
only for the sake of completeness.)
shlex.sourcehook(filename)~
When shlex (|py2stdlib-shlex|) detects a source request (see source below) this
method is given the following token as argument, and expected to return a tuple
consisting of a filename and an open file-like object.
Normally, this method first strips any quotes off the argument. If the result
is an absolute pathname, or there was no previous source request in effect, or
the previous source was a stream (such as ``sys.stdin``), the result is left
alone. Otherwise, if the result is a relative pathname, the directory part of
the name of the file immediately before it on the source inclusion stack is
prepended (this behavior is like the way the C preprocessor handles ``#include
"file.h"``).
The result of the manipulations is treated as a filename, and returned as the
first component of the tuple, with open called on it to yield the second
component. (Note: this is the reverse of the order of arguments in instance
initialization!)
This hook is exposed so that you can use it to implement directory search paths,
addition of file extensions, and other namespace hacks. There is no
corresponding 'close' hook, but a shlex instance will call the close
method of the sourced input stream when it returns EOF.
For more explicit control of source stacking, use the push_source and
pop_source methods.
shlex.push_source(stream[, filename])~
Push an input source stream onto the input stack. If the filename argument is
specified it will later be available for use in error messages. This is the
same method used internally by the sourcehook method.
.. versionadded:: 2.1
shlex.pop_source()~
Pop the last-pushed input source from the input stack. This is the same method
used internally when the lexer reaches EOF on a stacked input stream.
.. versionadded:: 2.1
shlex.error_leader([file[, line]])~
This method generates an error message leader in the format of a Unix C compiler
error label; the format is ``'"%s", line %d: '``, where the ``%s`` is replaced
with the name of the current source file and the ``%d`` with the current input
line number (the optional arguments can be used to override these).
This convenience is provided to encourage shlex (|py2stdlib-shlex|) users to generate error
messages in the standard, parseable format understood by Emacs and other Unix
tools.
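A small sketch of explicit source stacking with push_source, together with
error_leader. It assumes that push_source accepts a plain string as the
stream (CPython wraps strings in a StringIO object); the name
``included.cfg`` is hypothetical and only appears in error messages.

```python
import shlex

lexer = shlex.shlex('outer1 outer2')

# Stack a second input on top of the original one.
# 'included.cfg' is a made-up name, used only for error messages.
lexer.push_source('inner1 inner2', 'included.cfg')

# The leader reflects the source that is currently being read.
leader = lexer.error_leader()

# Tokens of the stacked source come first; when it reaches EOF the lexer
# pops it and continues with the original input.
tokens = list(lexer)
```

The leader has the ``'"%s", line %d: '`` form described above, so a message
such as ``leader + 'unexpected token'`` can be parsed by the usual tools.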
Instances of shlex (|py2stdlib-shlex|) subclasses have some public instance variables which
either control lexical analysis or can be used for debugging:
shlex.commenters~
The string of characters that are recognized as comment beginners. All
characters from the comment beginner to end of line are ignored. Includes just
``'#'`` by default.
shlex.wordchars~
The string of characters that will accumulate into multi-character tokens. By
default, includes all ASCII alphanumerics and underscore.
shlex.whitespace~
Characters that will be considered whitespace and skipped. Whitespace bounds
tokens. By default, includes space, tab, linefeed and carriage-return.
shlex.escape~
Characters that will be considered as escape. This will be only used in POSIX
mode, and includes just ``'\'`` by default.
.. versionadded:: 2.3
shlex.quotes~
Characters that will be considered string quotes. The token accumulates until
the same quote is encountered again (thus, different quote types protect each
other as in the shell). By default, includes ASCII single and double quotes.
shlex.escapedquotes~
Characters in quotes that will interpret escape characters defined in
escape. This is only used in POSIX mode, and includes just ``'"'`` by
default.
.. versionadded:: 2.3
shlex.whitespace_split~
If ``True``, tokens will only be split on whitespace. This is useful, for
example, for parsing command lines with shlex (|py2stdlib-shlex|), getting tokens in a
similar way to shell arguments.
.. versionadded:: 2.3
shlex.infile~
The name of the current input file, as initially set at class instantiation time
or stacked by later source requests. It may be useful to examine this when
constructing error messages.
shlex.instream~
The input stream from which this shlex (|py2stdlib-shlex|) instance is reading characters.
shlex.source~
This member is ``None`` by default. If you assign a string to it, that string
will be recognized as a lexical-level inclusion request similar to the
``source`` keyword in various shells. That is, the immediately following token
will be opened as a filename and input taken from that stream until EOF, at which
point the close method of that stream will be called and the input
source will again become the original input stream. Source requests may be
stacked any number of levels deep.
shlex.debug~
If this member is numeric and ``1`` or more, a shlex (|py2stdlib-shlex|) instance will
print verbose progress output on its behavior. If you need to use this, you can
read the module source code to learn the details.
shlex.lineno~
Source line number (count of newlines seen so far plus one).
shlex.token~
The token buffer. It may be useful to examine this when catching exceptions.
shlex.eof~
Token used to determine end of file. This will be set to the empty string
(``''``) in non-POSIX mode, and to ``None`` in POSIX mode.
.. versionadded:: 2.3
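To illustrate how these instance variables change the result, the sketch
below lexes the same made-up line twice: once with the defaults and once with
whitespace_split set. In both cases the default commenters value (``'#'``)
discards the trailing comment.

```python
import shlex

line = '--opt=value path/to/file  # trailing comment'

# Default settings: characters outside wordchars come back one at a time.
default_tokens = list(shlex.shlex(line))

# With whitespace_split, only whitespace separates tokens.
lexer = shlex.shlex(line)
lexer.whitespace_split = True
shell_like_tokens = list(lexer)
```

With the defaults the first token is a lone ``'-'``; with whitespace_split the
result is ``['--opt=value', 'path/to/file']``.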
Parsing Rules
-------------
When operating in non-POSIX mode, shlex (|py2stdlib-shlex|) will try to obey the
following rules.
* Quote characters are not recognized within words (``Do"Not"Separate`` is
parsed as the single word ``Do"Not"Separate``);
* Escape characters are not recognized;
* Enclosing characters in quotes preserve the literal value of all characters
within the quotes;
* Closing quotes separate words (``"Do"Separate`` is parsed as ``"Do"`` and
``Separate``);
* If whitespace_split is ``False``, any character not declared to be a
word character, whitespace, or a quote will be returned as a single-character
token. If it is ``True``, shlex (|py2stdlib-shlex|) will only split words on whitespace;
* EOF is signaled with an empty string (``''``);
* It's not possible to parse empty strings, even if quoted.
When operating in POSIX mode, shlex (|py2stdlib-shlex|) will try to obey the following
parsing rules.
* Quotes are stripped out, and do not separate words (``"Do"Not"Separate"`` is
parsed as the single word ``DoNotSeparate``);
* Non-quoted escape characters (e.g. ``'\'``) preserve the literal value of the
next character that follows;
* Enclosing characters in quotes which are not part of escapedquotes
(e.g. ``"'"``) preserve the literal value of all characters within the quotes;
* Enclosing characters in quotes which are part of escapedquotes (e.g.
``'"'``) preserve the literal value of all characters within the quotes, with
the exception of the characters mentioned in escape. The escape
characters retain their special meaning only when followed by the quote in use, or
the escape character itself. Otherwise the escape character will be considered a
normal character.
* EOF is signaled with a ``None`` value;
* Quoted empty strings (``''``) are allowed.
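The two rule sets above can be compared directly. The following sketch uses a
small helper, tokens (a name introduced here for illustration), that lexes a
string with the default settings in the requested mode.

```python
import shlex

def tokens(text, posix):
    """Return all tokens of text, lexed in POSIX or non-POSIX mode."""
    return list(shlex.shlex(text, posix=posix))

# Non-POSIX mode: quotes are kept, and a closing quote ends the word.
compat_inner = tokens('Do"Not"Separate', posix=False)
compat_closing = tokens('"Do"Separate', posix=False)

# POSIX mode: quotes are stripped and do not separate words.
posix_joined = tokens('"Do"Not"Separate"', posix=True)

# A non-quoted backslash preserves the next character (here, a space).
posix_escaped = tokens('a\\ b', posix=True)

# Quoted empty strings are allowed in POSIX mode.
posix_empty = tokens("''", posix=True)
```

In POSIX mode the quoted empty string yields one empty token, which is
distinct from the ``None`` value that signals EOF.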
==============================================================================
*py2stdlib-shutil*
shutil~
:synopsis: High-level file operations, including copying.
The shutil (|py2stdlib-shutil|) module offers a number of high-level operations on files and
collections of files. In particular, functions are provided which support file
copying and removal. For operations on individual files, see also the
os (|py2stdlib-os|) module.
.. warning::
Even the higher-level file copying functions (copy, copy2)
can't copy all file metadata.
On POSIX platforms, this means that file owner and group are lost as well
as ACLs. On Mac OS, the resource fork and other metadata are not used.
This means that resources will be lost and file type and creator codes will
not be correct. On Windows, file owners, ACLs and alternate data streams
are not copied.
Directory and files operations
------------------------------
copyfileobj(fsrc, fdst[, length])~
Copy the contents of the file-like object {fsrc} to the file-like object {fdst}.
The integer {length}, if given, is the buffer size. In particular, a negative
{length} value means to copy the data without looping over the source data in
chunks; by default the data is read in chunks to avoid uncontrolled memory
consumption. Note that if the current file position of the {fsrc} object is not
0, only the contents from the current file position to the end of the file will
be copied.
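A short sketch of the position-dependent behavior, using in-memory
``io.BytesIO`` objects as stand-ins for real files (the data is made up):

```python
import io
import shutil

source = io.BytesIO(b'header:payload')
target = io.BytesIO()

source.seek(len(b'header:'))            # move past the first 7 bytes
shutil.copyfileobj(source, target, 4)   # copy the rest in 4-byte chunks

copied = target.getvalue()
```

Only the bytes after the current position of the source are copied, so copied
holds ``payload`` without the header.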
copyfile(src, dst)~
Copy the contents (no metadata) of the file named {src} to a file named {dst}.
{dst} must be the complete target file name; look at copy for a copy that
accepts a target directory path. If {src} and {dst} are the same file,
Error is raised.
The destination location must be writable; otherwise, an IOError exception
will be raised. If {dst} already exists, it will be replaced. Special files
such as character or block devices and pipes cannot be copied with this
function. {src} and {dst} are path names given as strings.
copymode(src, dst)~
Copy the permission bits from {src} to {dst}. The file contents, owner, and
group are unaffected. {src} and {dst} are path names given as strings.
copystat(src, dst)~
Copy the permission bits, last access time, last modification time, and flags
from {src} to {dst}. The file contents, owner, and group are unaffected. {src}
and {dst} are path names given as strings.
copy(src, dst)~
Copy the file {src} to the file or directory {dst}. If {dst} is a directory, a
file with the same basename as {src} is created (or overwritten) in the
directory specified. Permission bits are copied. {src} and {dst} are path
names given as strings.
copy2(src, dst)~
Similar to copy, but metadata is copied as well -- in fact, this is just
copy followed by copystat. This is similar to the
Unix command cp -p.
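The differences between copyfile, copy and copy2 can be sketched with a
scratch directory. All file names below are made up, and the old timestamp is
set only so that the effect of copy2 is visible.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'report.txt')
with open(src, 'w') as f:
    f.write('quarterly numbers\n')
os.utime(src, (1000000000, 1000000000))   # give src an old access/modify time

backup_dir = os.path.join(workdir, 'backup')
os.mkdir(backup_dir)
exact = os.path.join(workdir, 'exact.txt')

shutil.copyfile(src, os.path.join(workdir, 'plain.txt'))  # contents only
shutil.copy(src, backup_dir)     # dst is a directory: creates backup/report.txt
shutil.copy2(src, exact)         # contents, permission bits and times

in_backup = os.path.isfile(os.path.join(backup_dir, 'report.txt'))
exact_mtime = int(os.stat(exact).st_mtime)

shutil.rmtree(workdir)           # remove the scratch directory again
```

After copy2 the modification time of the copy matches the source, which is
not guaranteed for the files created by copyfile and copy.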
ignore_patterns(*patterns)~
This factory function creates a function that can be used as a callable for
copytree's {ignore} argument, ignoring files and directories that
match one of the glob-style {patterns} provided. See the example below.
.. versionadded:: 2.6
copytree(src, dst[, symlinks=False[, ignore=None]])~
Recursively copy an entire directory tree rooted at {src}. The destination
directory, named by {dst}, must not already exist; it will be created as well
as missing parent directories. Permissions and times of directories are
copied with copystat, individual files are copied using
copy2.
If {symlinks} is true, symbolic links in the source tree are represented as
symbolic links in the new tree; if false or omitted, the contents of the
linked files are copied to the new tree.
If {ignore} is given, it must be a callable that will receive as its
arguments the directory being visited by copytree, and a list of its
contents, as returned by os.listdir. Since copytree is
called recursively, the {ignore} callable will be called once for each
directory that is copied. The callable must return a sequence of directory
and file names relative to the current directory (i.e. a subset of the items
in its second argument); these names will then be ignored in the copy
process. ignore_patterns can be used to create such a callable that
ignores names based on glob-style patterns.
If exception(s) occur, an Error is raised with a list of reasons.
The source code for this should be considered an example rather than the
ultimate tool.
.. versionchanged:: 2.3
Error is raised if any exceptions occur during copying, rather than
printing a message.
.. versionchanged:: 2.5
Create intermediate directories needed to create {dst}, rather than raising an
error. Copy permissions and times of directories using copystat.
.. versionchanged:: 2.6
Added the {ignore} argument to be able to influence what is being copied.
rmtree(path[, ignore_errors[, onerror]])~
Delete an entire directory tree; {path} must point to a directory (but not a
symbolic link to a directory). If {ignore_errors} is true, errors resulting
from failed removals will be ignored; if false or omitted, such errors are
handled by calling a handler specified by {onerror} or, if that is omitted,
they raise an exception.
If {onerror} is provided, it must be a callable that accepts three
parameters: {function}, {path}, and {excinfo}. The first parameter,
{function}, is the function which raised the exception; it will be
os.path.islink, os.listdir, os.remove or
os.rmdir. The second parameter, {path}, will be the path name passed
to {function}. The third parameter, {excinfo}, will be the exception
information returned by sys.exc_info. Exceptions raised by {onerror}
will not be caught.
.. versionchanged:: 2.6
Explicitly check for {path} being a symbolic link and raise OSError
in that case.
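A minimal sketch of an {onerror} handler follows. The handler name
record_failure is hypothetical; it records each failure instead of raising.
The second rmtree call is made on a path that no longer exists in order to
provoke an error. Which function is reported for that failure depends on the
Python version, so the sketch does not rely on it.

```python
import os
import shutil
import tempfile

failures = []

def record_failure(function, path, excinfo):
    """onerror handler: remember what failed instead of raising."""
    failures.append((function.__name__, path, excinfo[0].__name__))

root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'a', 'b'))

shutil.rmtree(root, onerror=record_failure)   # normal removal, no errors
removed = not os.path.exists(root)

# The tree is gone now, so this call fails; the handler is invoked
# and no exception propagates to the caller.
shutil.rmtree(root, onerror=record_failure)
```

Each recorded entry names the failing function, the path it was given, and
the exception type taken from the {excinfo} tuple.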
move(src, dst)~
Recursively move a file or directory to another location.
If the destination is on the current filesystem, then simply use rename.
Otherwise, copy src (with copy2) to the dst and then remove src.
.. versionadded:: 2.3
Error~
This exception collects exceptions that are raised during a multi-file operation. For
copytree, the exception argument is a list of 3-tuples ({srcname},
{dstname}, {exception}).
.. versionadded:: 2.3
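The following sketch shows one way to inspect the collected failures. It
assumes a platform with symbolic links (os.symlink): a dangling link in the
source tree cannot be copied when {symlinks} is false, so copytree reports it
through Error while still copying the other file. All paths are made up.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'src')
dst = os.path.join(workdir, 'dst')
os.mkdir(src)
with open(os.path.join(src, 'ok.txt'), 'w') as f:
    f.write('fine\n')

# A link whose target does not exist; copying its contents must fail.
os.symlink(os.path.join(workdir, 'missing-target'),
           os.path.join(src, 'broken'))

failures = []
try:
    shutil.copytree(src, dst)
except shutil.Error as err:
    # err.args[0] is the list of (srcname, dstname, exception) entries.
    failures = list(err.args[0])

ok_copied = os.path.isfile(os.path.join(dst, 'ok.txt'))
shutil.rmtree(workdir)
```

The copy is not aborted at the first problem: the regular file is copied and
the failure is reported once the whole tree has been processed.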
copytree example
::::::::::::::::
This example is the implementation of the copytree function, described
above, with the docstring omitted. It demonstrates many of the other functions
provided by this module. >
    # names that the shutil module itself imports or defines
    import os
    from shutil import copy2, copystat, Error
    def copytree(src, dst, symlinks=False, ignore=None):
        names = os.listdir(src)
        if ignore is not None:
            ignored_names = ignore(src, names)
        else:
            ignored_names = set()
        os.makedirs(dst)
        errors = []
        for name in names:
            if name in ignored_names:
                continue
            srcname = os.path.join(src, name)
            dstname = os.path.join(dst, name)
            try:
                if symlinks and os.path.islink(srcname):
                    linkto = os.readlink(srcname)
                    os.symlink(linkto, dstname)
                elif os.path.isdir(srcname):
                    copytree(srcname, dstname, symlinks, ignore)
                else:
                    copy2(srcname, dstname)
                # XXX What about devices, sockets etc.?
            except (IOError, os.error), why:
                errors.append((srcname, dstname, str(why)))
            # catch the Error from the recursive copytree so that we can
            # continue with other files
            except Error, err:
                errors.extend(err.args[0])
        try:
            copystat(src, dst)
        except WindowsError:
            # can't copy file access times on Windows
            pass
        except OSError, why:
            # append one 3-tuple; extend() would add its items separately
            errors.append((src, dst, str(why)))
        if errors:
            raise Error(errors)
<
Another example that uses the ignore_patterns helper: >
    from shutil import copytree, ignore_patterns
    copytree(source, destination, ignore=ignore_patterns('*.pyc', 'tmp*'))
<
This will copy everything except ``.pyc`` files and files or directories whose
name starts with ``tmp``.
Another example that uses the {ignore} argument to add a logging call: >
    from shutil import copytree
    import logging
    def _logpath(path, names):
        logging.info('Working in %s' % path)
        return [] # nothing will be ignored
    copytree(source, destination, ignore=_logpath)
<
Archiving operations
--------------------
make_archive(base_name, format[, root_dir[, base_dir[, verbose[, dry_run[, owner[, group[, logger]]]]]]])~
Create an archive file (e.g. zip or tar) and return its name.
{base_name} is the name of the file to create, including the path, minus
any format-specific extension. {format} is the archive format: one of
"zip", "tar", "bztar" or "gztar".
{root_dir} is a directory that will be the root directory of the
archive; i.e. we typically chdir into {root_dir} before creating the
archive.
{base_dir} is the directory where we start archiving from;
i.e. {base_dir} will be the common prefix of all files and
directories in the archive.
{root_dir} and {base_dir} both default to the current directory.
{owner} and {group} are used when creating a tar archive. By default,
the current owner and group are used.
.. versionadded:: 2.7
get_archive_formats()~
Returns a list of supported formats for archiving.
Each element of the returned sequence is a tuple ``(name, description)``.
By default shutil (|py2stdlib-shutil|) provides these formats:
- {gztar}: gzip'ed tar-file
- {bztar}: bzip2'ed tar-file
- {tar}: uncompressed tar file
- {zip}: ZIP file
You can register new formats or provide your own archiver for any existing
formats, by using register_archive_format.
.. versionadded:: 2.7
register_archive_format(name, function[, extra_args[, description]])~
Registers an archiver for the format {name}. {function} is a callable that
will be used to invoke the archiver.
If given, {extra_args} is a sequence of ``(name, value)`` pairs that will be
used as extra keyword arguments when the archiver callable is used.
{description} is used by get_archive_formats which returns the
list of archivers. Defaults to an empty string.
.. versionadded:: 2.7
unregister_archive_format(name)~
Remove the archive format {name} from the list of supported formats.
.. versionadded:: 2.7
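Registering and removing a format can be sketched as follows. The format name
``listing`` and the archiver make_listing are hypothetical. The text above
does not spell out how the archiver is called; the sketch assumes the
convention ``function(base_name, base_dir, **kwargs)`` used for shutil's
built-in archivers.

```python
import os
import shutil

def make_listing(base_name, base_dir, **kwargs):
    """Hypothetical archiver: write the names of the files under base_dir
    to '<base_name>.lst' and return the name of that file."""
    listing = base_name + '.lst'
    with open(listing, 'w') as out:
        for dirpath, dirnames, filenames in os.walk(base_dir):
            for name in sorted(filenames):
                out.write(os.path.join(dirpath, name) + '\n')
    return listing

shutil.register_archive_format('listing', make_listing,
                               description='plain list of file names')
registered = dict(shutil.get_archive_formats())   # name -> description

shutil.unregister_archive_format('listing')
remaining = dict(shutil.get_archive_formats())
```

While the format is registered, passing ``'listing'`` as the {format} argument
of make_archive would select this archiver.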
Archiving example
:::::::::::::::::
In this example, we create a gzip'ed tar-file archive containing all files
found in the .ssh directory of the user: >
>>> from shutil import make_archive
>>> import os
>>> archive_name = os.path.expanduser(os.path.join('~', 'myarchive'))
>>> root_dir = os.path.expanduser(os.path.join('~', '.ssh'))
>>> make_archive(archive_name, 'gztar', root_dir)
'/Users/tarek/myarchive.tar.gz'
<
The resulting archive contains: >
    $ tar -tzvf /Users/tarek/myarchive.tar.gz
    drwx------ tarek/staff 0 2010-02-01 16:23:40 ./
    -rw-r--r-- tarek/staff 609 2008-06-09 13:26:54 ./authorized_keys
    -rwxr-xr-x tarek/staff 65 2008-06-09 13:26:54 ./config
    -rwx------ tarek/staff 668 2008-06-09 13:26:54 ./id_dsa
    -rwxr-xr-x tarek/staff 609 2008-06-09 13:26:54 ./id_dsa.pub
    -rw------- tarek/staff 1675 2008-06-09 13:26:54 ./id_rsa
    -rw-r--r-- tarek/staff 397 2008-06-09 13:26:54 ./id_rsa.pub
    -rw-r--r-- tarek/staff 37192 2010-02-06 18:23:10 ./known_hosts
<
==============================================================================
*py2stdlib-signal*
signal~
:synopsis: Set handlers for asynchronous events.
This module provides mechanisms to use signal handlers in Python. Some general
rules for working with signals and their handlers:
* A handler for a particular signal, once set, remains installed until it is
explicitly reset (Python emulates the BSD style interface regardless of the
underlying implementation), with the exception of the handler for
SIGCHLD, which follows the underlying implementation.
* There is no way to "block" signals temporarily from critical sections (since
this is not supported by all Unix flavors).
* Although Python signal handlers are called asynchronously as far as the Python
user is concerned, they can only occur between the "atomic" instructions of the
Python interpreter. This means that signals arriving during long calculations
implemented purely in C (such as regular expression matches on large bodies of
text) may be delayed for an arbitrary amount of time.
* When a signal arrives during an I/O operation, it is possible that the I/O
operation raises an exception after the signal handler returns. This is
dependent on the underlying Unix system's semantics regarding interrupted system
calls.
* Because the C signal handler always returns, it makes little sense to catch
synchronous errors like SIGFPE or SIGSEGV.
* Python installs a small number of signal handlers by default: SIGPIPE
is ignored (so write errors on pipes and sockets can be reported as ordinary
Python exceptions) and SIGINT is translated into a
KeyboardInterrupt exception. All of these can be overridden.
* Some care must be taken if both signals and threads are used in the same
program. The fundamental thing to remember in using signals and threads
simultaneously is: always perform signal (|py2stdlib-signal|) operations in the main thread
of execution. Any thread can perform an alarm, getsignal,
pause, setitimer or getitimer; only the main thread
can set a new signal handler, and the main thread will be the only one to
receive signals (this is enforced by the Python signal (|py2stdlib-signal|) module, even
if the underlying thread implementation supports sending signals to
individual threads). This means that signals can't be used as a means of
inter-thread communication. Use locks instead.
The variables defined in the signal (|py2stdlib-signal|) module are:
SIG_DFL~
This is one of two standard signal handling options; it will simply perform
the default function for the signal. For example, on most systems the
default action for SIGQUIT is to dump core and exit, while the
default action for SIGCHLD is to simply ignore it.
SIG_IGN~
This is another standard signal handler, which will simply ignore the given
signal.
SIG*~
All the signal numbers are defined symbolically. For example, the hangup signal
is defined as signal.SIGHUP; the variable names are identical to the
names used in C programs, as found in ``<signal.h>``. The Unix man page for
'signal' lists the existing signals (on some systems this is
signal(2), on others the list is in signal(7)). Note that
not all systems define the same set of signal names; only those names defined by
the system are defined by this module.
CTRL_C_EVENT~
The signal corresponding to the CTRL+C keystroke event.
Availability: Windows.
.. versionadded:: 2.7
CTRL_BREAK_EVENT~
The signal corresponding to the CTRL+BREAK keystroke event.
Availability: Windows.
.. versionadded:: 2.7
NSIG~
One more than the number of the highest signal number.
ITIMER_REAL~
Decrements interval timer in real time, and delivers SIGALRM upon expiration.
ITIMER_VIRTUAL~
Decrements interval timer only when the process is executing, and delivers
SIGVTALRM upon expiration.
ITIMER_PROF~
Decrements interval timer both when the process executes and when the
system is executing on behalf of the process. Coupled with ITIMER_VIRTUAL,
this timer is usually used to profile the time spent by the application
in user and kernel space. SIGPROF is delivered upon expiration.
The signal (|py2stdlib-signal|) module defines one exception:
ItimerError~
Raised to signal an error from the underlying setitimer or
getitimer implementation. Expect this error if an invalid
interval timer or a negative time is passed to setitimer.
This error is a subtype of IOError.
The signal (|py2stdlib-signal|) module defines the following functions:
alarm(time)~
If {time} is non-zero, this function requests that a SIGALRM signal be
sent to the process in {time} seconds. Any previously scheduled alarm is
canceled (only one alarm can be scheduled at any time). The returned value is
then the number of seconds before any previously set alarm was to have been
delivered. If {time} is zero, no alarm is scheduled, and any scheduled alarm is
canceled. If the return value is zero, no alarm is currently scheduled. (See
the Unix man page alarm(2).) Availability: Unix.
getsignal(signalnum)~
Return the current signal handler for the signal {signalnum}. The returned value
may be a callable Python object, or one of the special values
signal.SIG_IGN, signal.SIG_DFL or None. Here,
signal.SIG_IGN means that the signal was previously ignored,
signal.SIG_DFL means that the default way of handling the signal was
previously in use, and ``None`` means that the previous signal handler was not
installed from Python.
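The possible return values can be told apart as in this sketch. The helper
name describe_handler is made up for illustration; the result for a given
signal depends on how the interpreter was started.

```python
import signal

def describe_handler(signalnum):
    """Summarize what getsignal() reports for a signal number."""
    handler = signal.getsignal(signalnum)
    if handler is None:
        return 'not installed from Python'
    if handler == signal.SIG_IGN:
        return 'ignored'
    if handler == signal.SIG_DFL:
        return 'default action'
    return 'Python handler %r' % (handler,)

sigint_status = describe_handler(signal.SIGINT)
```

The ``None`` case is checked first because it is the only value that is
neither callable nor one of the two special constants.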
pause()~
Cause the process to sleep until a signal is received; the appropriate handler
will then be called. Returns nothing. Not on Windows. (See the Unix man page
signal(2).)
setitimer(which, seconds[, interval])~
Sets given interval timer (one of signal.ITIMER_REAL,
signal.ITIMER_VIRTUAL or signal.ITIMER_PROF) specified
by {which} to fire after {seconds} (float is accepted, different from
alarm) and after that every {interval} seconds. The interval
timer specified by {which} can be cleared by setting seconds to zero.
When an interval timer fires, a signal is sent to the process.
The signal sent is dependent on the timer being used;
signal.ITIMER_REAL will deliver SIGALRM,
signal.ITIMER_VIRTUAL sends SIGVTALRM,
and signal.ITIMER_PROF will deliver SIGPROF.
The old values are returned as a tuple: (delay, interval).
Attempting to pass an invalid interval timer will cause an
ItimerError. Availability: Unix.
.. versionadded:: 2.6
getitimer(which)~
Returns current value of a given interval timer specified by {which}.
Availability: Unix.
.. versionadded:: 2.6
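A sketch of a repeating timer built from setitimer and getitimer. It is
Unix-only and has to run in the main thread, because it installs a signal
handler. The handler name on_alarm and the 0.05 second period are arbitrary
choices for the illustration.

```python
import signal
import time

ticks = []

def on_alarm(signum, frame):
    ticks.append(signum)

signal.signal(signal.SIGALRM, on_alarm)

# First expiry after 0.05 s, then every 0.05 s (floats are accepted).
signal.setitimer(signal.ITIMER_REAL, 0.05, 0.05)
delay, interval = signal.getitimer(signal.ITIMER_REAL)

# Wait for three expirations, but give up after two seconds.
deadline = time.time() + 2.0
while len(ticks) < 3 and time.time() < deadline:
    time.sleep(0.01)

signal.setitimer(signal.ITIMER_REAL, 0)         # zero seconds clears the timer
signal.signal(signal.SIGALRM, signal.SIG_DFL)   # back to the default action
```

The timer is cleared before the default action is restored; otherwise a late
SIGALRM could terminate the process.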
set_wakeup_fd(fd)~
Set the wakeup fd to {fd}. When a signal is received, a ``'\0'`` byte is
written to the fd. This can be used by a library to wake up a poll or select
call, allowing the signal to be fully processed.
The old wakeup fd is returned. {fd} must be non-blocking. It is up to the
library to remove any bytes before calling poll or select again.
When threads are enabled, this function can only be called from the main thread;
attempting to call it from other threads will cause a ValueError
exception to be raised.
siginterrupt(signalnum, flag)~
Change system call restart behaviour: if {flag} is False, system
calls will be restarted when interrupted by signal {signalnum}, otherwise
system calls will be interrupted. Returns nothing. Availability: Unix (see
the man page siginterrupt(3) for further information).
Note that installing a signal handler with signal (|py2stdlib-signal|) will reset the
restart behaviour to interruptible by implicitly calling
siginterrupt with a true {flag} value for the given signal.
.. versionadded:: 2.6
signal(signalnum, handler)~
Set the handler for signal {signalnum} to the function {handler}. {handler} can
be a callable Python object taking two arguments (see below), or one of the
special values signal.SIG_IGN or signal.SIG_DFL. The previous
signal handler will be returned (see the description of getsignal
above). (See the Unix man page signal(2).)
When threads are enabled, this function can only be called from the main thread;
attempting to call it from other threads will cause a ValueError
exception to be raised.
The {handler} is called with two arguments: the signal number and the current
stack frame (``None`` or a frame object; for a description of frame objects,
see the description in the type hierarchy or see the
attribute descriptions in the inspect (|py2stdlib-inspect|) module).
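The two handler arguments can be observed with a sketch like the one below.
It is Unix-only (it uses SIGUSR1 and os.kill to send the signal to the
current process) and must run in the main thread. The handler name on_usr1
is made up.

```python
import os
import signal
import time

received = []

def on_usr1(signum, frame):
    # frame is None or the frame that was executing when the signal arrived
    where = frame.f_lineno if frame is not None else None
    received.append((signum, where))

signal.signal(signal.SIGUSR1, on_usr1)   # returns the previous handler
os.kill(os.getpid(), signal.SIGUSR1)     # send the signal to ourselves
time.sleep(0.05)                         # give the handler a chance to run
signal.signal(signal.SIGUSR1, signal.SIG_DFL)
```

The line number taken from the frame is that of the interrupted code, not of
the handler itself.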
Example
-------
Here is a minimal example program. It uses the alarm function to limit
the time spent waiting to open a file; this is useful if the file is for a
serial device that may not be turned on, which would normally cause the
os.open to hang indefinitely. The solution is to set a 5-second alarm
before opening the file; if the operation takes too long, the alarm signal will
be sent, and the handler raises an exception. >
    import signal, os
    def handler(signum, frame):
        print 'Signal handler called with signal', signum
        raise IOError("Couldn't open device!")
    # Set the signal handler and a 5-second alarm
    signal.signal(signal.SIGALRM, handler)
    signal.alarm(5)
    # This open() may hang indefinitely
    fd = os.open('/dev/ttyS0', os.O_RDWR)
    signal.alarm(0) # Disable the alarm
<
==============================================================================
*py2stdlib-simplehttpserver*
SimpleHTTPServer~
:synopsis: This module provides a basic request handler for HTTP servers.
.. note::
The SimpleHTTPServer (|py2stdlib-simplehttpserver|) module has been merged into http.server in
Python 3.0. The 2to3 tool will automatically adapt imports when
converting your sources to 3.0.
The SimpleHTTPServer (|py2stdlib-simplehttpserver|) module defines a single class,
SimpleHTTPRequestHandler, which is interface-compatible with
BaseHTTPServer.BaseHTTPRequestHandler.
The SimpleHTTPServer (|py2stdlib-simplehttpserver|) module defines the following class:
SimpleHTTPRequestHandler(request, client_address, server)~
This class serves files from the current directory and below, directly
mapping the directory structure to HTTP requests.
A lot of the work, such as parsing the request, is done by the base class
BaseHTTPServer.BaseHTTPRequestHandler. This class implements the
do_GET and do_HEAD functions.
The following are defined as class-level attributes of
SimpleHTTPRequestHandler:
server_version~
This will be ``"SimpleHTTP/" + __version__``, where ``__version__`` is
defined at the module level.
extensions_map~
A dictionary mapping suffixes into MIME types. The default is
signified by an empty string, and is considered to be
``application/octet-stream``. The mapping is used case-insensitively,
and so should contain only lower-cased keys.
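One way to extend the mapping without touching the base class is sketched
below. The class name DocsRequestHandler and the ``.note`` and ``.changes``
suffixes are hypothetical. The import falls back to http.server so that the
sketch also runs on Python 3, where the module was merged.

```python
try:
    from SimpleHTTPServer import SimpleHTTPRequestHandler   # Python 2
except ImportError:
    from http.server import SimpleHTTPRequestHandler        # Python 3 location

class DocsRequestHandler(SimpleHTTPRequestHandler):
    # Copy the inherited mapping so the base class is left untouched,
    # then add lower-cased suffixes (the lookup is case-insensitive).
    extensions_map = dict(SimpleHTTPRequestHandler.extensions_map)
    extensions_map.update({
        '.note': 'text/plain',
        '.changes': 'text/plain',
    })
```

A server created with DocsRequestHandler as its handler class would then
serve files with these suffixes as plain text.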
The SimpleHTTPRequestHandler class defines the following methods:
do_HEAD()~
This method serves the ``'HEAD'`` request type: it sends the headers it
would send for the equivalent ``GET`` request. See the do_GET
method for a more complete explanation of the possible headers.
do_GET()~
The request is mapped to a local file by interpreting the request as a
path relative to the current working directory.
If the request was mapped to a directory, the directory is checked for a
file named ``index.html`` or ``index.htm`` (in that order). If found, the
file's contents are returned; otherwise a directory listing is generated
by calling the list_directory method. This method uses
os.listdir to scan the directory, and returns a ``404`` error
response if the listdir fails.
If the request was mapped to a file, it is opened and the contents are
returned. Any IOError exception in opening the requested file is
mapped to a ``404``, ``'File not found'`` error. Otherwise, the content
type is guessed by calling the guess_type method, which in turn
uses the {extensions_map} variable.
A ``'Content-type:'`` header with the guessed content type is output,
followed by a ``'Content-Length:'`` header with the file's size and a
``'Last-Modified:'`` header with the file's modification time.
Then follows a blank line signifying the end of the headers, and then the
contents of the file are output. If the file's MIME type starts with
``text/`` the file is opened in text mode; otherwise binary mode is used.
The test function in the SimpleHTTPServer (|py2stdlib-simplehttpserver|) module is an
example which creates a server using SimpleHTTPRequestHandler
as the handler.
.. versionadded:: 2.5
The ``'Last-Modified'`` header.
The SimpleHTTPServer (|py2stdlib-simplehttpserver|) module can be used in the following manner in order
to set up a very basic web server serving files relative to the current
directory. :: >
import SimpleHTTPServer
import SocketServer
PORT = 8000
Handler = SimpleHTTPServer.SimpleHTTPRequestHandler
httpd = SocketServer.TCPServer(("", PORT), Handler)
print "serving at port", PORT
httpd.serve_forever()
<
The SimpleHTTPServer (|py2stdlib-simplehttpserver|) module can also be invoked directly using the
-m switch of the interpreter with a ``port number`` argument.
Similar to the previous example, this serves the files relative to the
current directory. :: >
python -m SimpleHTTPServer 8000
<
.. seealso::
Module BaseHTTPServer (|py2stdlib-basehttpserver|)
Base class implementation for Web server and request handler.
==============================================================================
*py2stdlib-simplexmlrpcserver*
SimpleXMLRPCServer~
:synopsis: Basic XML-RPC server implementation.
.. note::
The SimpleXMLRPCServer (|py2stdlib-simplexmlrpcserver|) module has been merged into
xmlrpc.server in Python 3.0. The 2to3 tool will automatically
adapt imports when converting your sources to 3.0.
.. versionadded:: 2.2
The SimpleXMLRPCServer (|py2stdlib-simplexmlrpcserver|) module provides a basic server framework for
XML-RPC servers written in Python. Servers can either be free standing, using
SimpleXMLRPCServer (|py2stdlib-simplexmlrpcserver|), or embedded in a CGI environment, using
CGIXMLRPCRequestHandler.
SimpleXMLRPCServer(addr[, requestHandler[, logRequests[, allow_none[, encoding[, bind_and_activate]]]]])~
Create a new server instance. This class provides methods for registration of
functions that can be called by the XML-RPC protocol. The {requestHandler}
parameter should be a factory for request handler instances; it defaults to
SimpleXMLRPCRequestHandler. The {addr} and {requestHandler} parameters
are passed to the SocketServer.TCPServer constructor. If {logRequests}
is true (the default), requests will be logged; setting this parameter to false
will turn off logging. The {allow_none} and {encoding} parameters are passed
on to xmlrpclib (|py2stdlib-xmlrpclib|) and control the XML-RPC responses that will be returned
from the server. The {bind_and_activate} parameter controls whether
server_bind and server_activate are called immediately by the
constructor; it defaults to true. Setting it to false allows code to manipulate
the {allow_reuse_address} class variable before the address is bound.
.. versionchanged:: 2.5
The {allow_none} and {encoding} parameters were added.
.. versionchanged:: 2.6
The {bind_and_activate} parameter was added.
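The deferred binding described above can be sketched as follows. This is only
an illustration under stated assumptions: the loopback address, port ``0``
(which asks the operating system for any free port) and the choice to switch
address reuse off are picked for the example, and the import fallback covers the
module's Python 3 name.

```python
try:
    from SimpleXMLRPCServer import SimpleXMLRPCServer  # Python 2
except ImportError:
    from xmlrpc.server import SimpleXMLRPCServer  # Python 3 location

# Create the server object without binding its socket yet.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False,
                            bind_and_activate=False)

# Adjust the setting while the address is still unbound.
server.allow_reuse_address = False

# Perform the two steps the constructor skipped.
server.server_bind()
server.server_activate()

bound_port = server.server_address[1]  # the port actually chosen
server.server_close()
```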
CGIXMLRPCRequestHandler([allow_none[, encoding]])~
Create a new instance to handle XML-RPC requests in a CGI environment. The
{allow_none} and {encoding} parameters are passed on to xmlrpclib (|py2stdlib-xmlrpclib|) and
control the XML-RPC responses that will be returned from the server.
.. versionadded:: 2.3
.. versionchanged:: 2.5
The {allow_none} and {encoding} parameters were added.
SimpleXMLRPCRequestHandler()~
Create a new request handler instance. This request handler supports ``POST``
requests and modifies logging so that the {logRequests} parameter to the
SimpleXMLRPCServer (|py2stdlib-simplexmlrpcserver|) constructor is honored.
SimpleXMLRPCServer Objects
--------------------------
The SimpleXMLRPCServer (|py2stdlib-simplexmlrpcserver|) class is based on
SocketServer.TCPServer and provides a means of creating simple, stand
alone XML-RPC servers.
SimpleXMLRPCServer.register_function(function[, name])~
Register a function that can respond to XML-RPC requests. If {name} is given,
it will be the method name associated with {function}, otherwise
``function.__name__`` will be used. {name} can be either a normal or Unicode
string, and may contain characters not legal in Python identifiers, including
the period character.
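A small sketch of both naming rules, run without any network traffic. It
assumes that SimpleXMLRPCDispatcher, the dispatcher base class shipped in the
same module, is an acceptable stand-in for a full server here, and it uses
``system_listMethods`` only to show which names ended up published. The name
``math.add`` is an arbitrary example of a name containing a period.

```python
try:
    from SimpleXMLRPCServer import SimpleXMLRPCDispatcher  # Python 2
except ImportError:
    from xmlrpc.server import SimpleXMLRPCDispatcher  # Python 3 location


def adder_function(x, y):
    return x + y


dispatcher = SimpleXMLRPCDispatcher(allow_none=False, encoding=None)

# No name given: the function is published under pow.__name__.
dispatcher.register_function(pow)

# Explicit name: it may contain a period even though that is not a
# legal Python identifier.
dispatcher.register_function(adder_function, 'math.add')

published = dispatcher.system_listMethods()
```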
SimpleXMLRPCServer.register_instance(instance[, allow_dotted_names])~
Register an object which is used to expose method names which have not been
registered using register_function. If {instance} contains a
_dispatch method, it is called with the requested method name and the
parameters from the request. Its API is ``def _dispatch(self, method, params)``
(note that {params} does not represent a variable argument list). If it calls
an underlying function to perform its task, that function is called as
``func(*params)``, expanding the parameter list. The return value from
_dispatch is returned to the client as the result. If {instance} does
not have a _dispatch method, it is searched for an attribute matching
the name of the requested method.
If the optional {allow_dotted_names} argument is true and the instance does not
have a _dispatch method, then if the requested method name contains
periods, each component of the method name is searched for individually, with
the effect that a simple hierarchical search is performed. The value found from
this search is then called with the parameters from the request, and the return
value is passed back to the client.
.. warning:: >
Enabling the {allow_dotted_names} option allows intruders to access your
module's global variables and may allow intruders to execute arbitrary code on
your machine. Only use this option on a secure, closed network.
<
.. versionchanged:: 2.3.5, 2.4.1
{allow_dotted_names} was added to plug a security hole; prior versions are
insecure.
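The _dispatch protocol can be sketched like this. The Calculator class is
hypothetical, and calling the dispatcher's own ``_dispatch`` directly is an
assumption made to stand in for an incoming XML-RPC request, so that the
example needs no network connection.

```python
try:
    from SimpleXMLRPCServer import SimpleXMLRPCDispatcher  # Python 2
except ImportError:
    from xmlrpc.server import SimpleXMLRPCDispatcher  # Python 3 location


class Calculator(object):
    """Hypothetical instance that decides itself which names to serve."""

    def _dispatch(self, method, params):
        # params arrives as a single sequence, so it is expanded with *
        # when handing it to the underlying function.
        if method == 'add':
            return self.add(*params)
        raise Exception('method "%s" is not supported' % method)

    def add(self, x, y):
        return x + y


dispatcher = SimpleXMLRPCDispatcher(allow_none=False, encoding=None)
dispatcher.register_instance(Calculator())

# Stand-in for a client calling add(2, 3).
result = dispatcher._dispatch('add', (2, 3))
```

Because the instance defines _dispatch, only the names it handles explicitly
are reachable, which is one way to avoid relying on {allow_dotted_names}.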
SimpleXMLRPCServer.register_introspection_functions()~
Registers the XML-RPC introspection functions ``system.listMethods``,
``system.methodHelp`` and ``system.methodSignature``.
.. versionadded:: 2.3
SimpleXMLRPCServer.register_multicall_functions()~
Registers the XML-RPC multicall function system.multicall.
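A hedged end-to-end sketch of what the multicall function enables: a client
batches two calls into one request using ``MultiCall`` from the client library.
The loopback address, port ``0`` (any free port) and the background thread that
serves exactly one request are assumptions chosen so the example is
self-contained.

```python
import threading

try:
    from SimpleXMLRPCServer import SimpleXMLRPCServer  # Python 2
    import xmlrpclib
except ImportError:
    from xmlrpc.server import SimpleXMLRPCServer  # Python 3 location
    import xmlrpc.client as xmlrpclib

server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(pow)
server.register_multicall_functions()
port = server.server_address[1]

# Serve exactly one HTTP request in the background, then stop.
worker = threading.Thread(target=server.handle_request)
worker.daemon = True
worker.start()

proxy = xmlrpclib.ServerProxy('http://127.0.0.1:%d' % port)
batch = xmlrpclib.MultiCall(proxy)
batch.pow(2, 3)   # queued, not sent yet
batch.pow(3, 2)   # queued, not sent yet
results = list(batch())  # one system.multicall request carries both calls

worker.join()
server.server_close()
```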
SimpleXMLRPCRequestHandler.rpc_paths~
An attribute value that must be a tuple listing valid path portions of the URL
for receiving XML-RPC requests. Requests posted to other paths will result in a
404 "no such page" HTTP error. If this tuple is empty, all paths will be
considered valid. The default value is ``('/', '/RPC2')``.
.. versionadded:: 2.5
SimpleXMLRPCRequestHandler.encode_threshold~
If this attribute is not ``None``, responses larger than this value
will be encoded using the {gzip} transfer encoding, if permitted by
the client. The default is ``1400`` which corresponds roughly
to a single TCP packet.
.. versionadded:: 2.7
SimpleXMLRPCServer Example
^^^^^^^^^^^^^^^^^^^^^^^^^^
Server code:: >
from SimpleXMLRPCServer import SimpleXMLRPCServer
from SimpleXMLRPCServer import SimpleXMLRPCRequestHandler
# Restrict to a particular path.
class RequestHandler(SimpleXMLRPCRequestHandler):
    rpc_paths = ('/RPC2',)
# Create server
server = SimpleXMLRPCServer(("localhost", 8000),
                            requestHandler=RequestHandler)
server.register_introspection_functions()
# Register pow() function; this will use the value of
# pow.__name__ as the name, which is just 'pow'.
server.register_function(pow)
# Register a function under a different name
def adder_function(x,y):
    return x + y
server.register_function(adder_function, 'add')
# Register an instance; all the methods of the instance are
# published as XML-RPC methods (in this case, just 'div').
class MyFuncs:
    def div(self, x, y):
        return x // y
server.register_instance(MyFuncs())
# Run the server's main loop
server.serve_forever()
<
The following client code will call the methods made available by the preceding
server:: >
import xmlrpclib
s = xmlrpclib.ServerProxy('http://localhost:8000')
print s.pow(2,3) # Returns 2**3 = 8
print s.add(2,3) # Returns 5
print s.div(5,2) # Returns 5//2 = 2
# Print list of available methods
print s.system.listMethods()
<
CGIXMLRPCRequestHandler
-----------------------
The CGIXMLRPCRequestHandler class can be used to handle XML-RPC
requests sent to Python CGI scripts.
CGIXMLRPCRequestHandler.register_function(function[, name])~
Register a function that can respond to XML-RPC requests. If {name} is given,
it will be the method name associated with {function}, otherwise
``function.__name__`` will be used. {name} can be either a normal or Unicode
string, and may contain characters not legal in Python identifiers, including
the period character.
CGIXMLRPCRequestHandler.register_instance(instance)~
Register an object which is used to expose method names which have not been
registered using register_function. If instance contains a
_dispatch method, it is called with the requested method name and the
parameters from the request; the return value is returned to the client as the
result. If instance does not have a _dispatch method, it is searched
for an attribute matching the name of the requested method; if the requested
method name contains periods, each component of the method name is searched for
individually, with the effect that a simple hierarchical search is performed.
The value found from this search is then called with the parameters from the
request, and the return value is passed back to the client.
CGIXMLRPCRequestHandler.register_introspection_functions()~
Register the XML-RPC introspection functions ``system.listMethods``,
``system.methodHelp`` and ``system.methodSignature``.
CGIXMLRPCRequestHandler.register_multicall_functions()~
Register the XML-RPC multicall function ``system.multicall``.
CGIXMLRPCRequestHandler.handle_request([request_text = None])~
Handle an XML-RPC request. If {request_text} is given, it should be the POST
data provided by the HTTP server, otherwise the contents of stdin will be used.
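Example:: >
from SimpleXMLRPCServer import CGIXMLRPCRequestHandler
class MyFuncs:
    def div(self, x, y):
        return x // y
handler = CGIXMLRPCRequestHandler()
# Published as 'pow' (taken from pow.__name__)
handler.register_function(pow)
# Published under the explicit name 'add'
handler.register_function(lambda x, y: x + y, 'add')
handler.register_introspection_functions()
handler.register_instance(MyFuncs())
# Reads the POST data from stdin when no request_text is passed
handler.handle_request()
<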
==============================================================================
*py2stdlib-site*
site~
:synopsis: A standard way to reference site-specific modules.
This module is automatically imported during initialization. The automatic
import can be suppressed using the interpreter's -S option.
Importing this module will append site-specific paths to the module search path.
It starts by constructing up to four directories from a head and a tail part.
For the head part, it uses ``sys.prefix`` and ``sys.exec_prefix``; empty heads
are skipped. For the tail part, it uses the empty string and then
lib/site-packages (on Windows) or
lib/pythonX.Y/site-packages and then lib/site-python (on
Unix and Macintosh). For each of the distinct head-tail combinations, it sees
if it refers to an existing directory, and if so, adds it to ``sys.path`` and
also inspects the newly added path for configuration files.
A path configuration file is a file whose name has the form package.pth
and exists in one of the four directories mentioned above; its contents are
additional items (one per line) to be added to ``sys.path``. Non-existing items
are never added to ``sys.path``, but no check is made that the item refers to a
directory (rather than a file). No item is added to ``sys.path`` more than
once. Blank lines and lines beginning with ``#`` are skipped. Lines starting
with ``import`` (followed by space or tab) are executed.
.. versionchanged:: 2.6
A space or tab is now required after the import keyword.
For example, suppose ``sys.prefix`` and ``sys.exec_prefix`` are set to
/usr/local. The Python X.Y library is then installed in
/usr/local/lib/python{X.Y} (where only the first three characters of
``sys.version`` are used to form the installation path name). Suppose this has
a subdirectory /usr/local/lib/python{X.Y}/site-packages with three
subsubdirectories, foo, bar and spam, and two path
configuration files, foo.pth and bar.pth. Assume
foo.pth contains the following:: >
# foo package configuration
foo
bar
bletch
<
and bar.pth contains:: >
# bar package configuration
bar
<
Then the following version-specific directories are added to
``sys.path``, in this order:: >
/usr/local/lib/pythonX.Y/site-packages/bar
/usr/local/lib/pythonX.Y/site-packages/foo
<
Note that bletch is omitted because it doesn't exist; the bar
directory precedes the foo directory because bar.pth comes
alphabetically before foo.pth; and spam is omitted because it is
not mentioned in either path configuration file.
After these path manipulations, an attempt is made to import a module named
sitecustomize, which can perform arbitrary site-specific customizations.
If this import fails with an ImportError exception, it is silently
ignored.
Note that for some non-Unix systems, ``sys.prefix`` and ``sys.exec_prefix`` are
empty, and the path manipulations are skipped; however the import of
sitecustomize is still attempted.
PREFIXES~
A list of prefixes for site package directories.
.. versionadded:: 2.6
ENABLE_USER_SITE~
Flag showing the status of the user site directory. True means the
user site directory is enabled and added to sys.path. When the flag
is None the user site directory is disabled for security reasons.
.. versionadded:: 2.6
USER_SITE~
Path to the user site directory for the current Python version, or None.
.. versionadded:: 2.6
USER_BASE~
Path to the base directory for user site directories.
.. versionadded:: 2.6
.. envvar:: PYTHONNOUSERSITE
.. versionadded:: 2.6
.. envvar:: PYTHONUSERBASE
.. versionadded:: 2.6
addsitedir(sitedir, known_paths=None)~
Adds a directory to sys.path and processes its pth files.
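The path configuration rules described earlier can be tried out on a throwaway
directory. This is a sketch under assumptions: the temporary directory stands
in for a real site-packages directory, and the names ``foo`` and ``bletch`` are
borrowed from the worked example above.

```python
import os
import site
import sys
import tempfile

# Throwaway stand-in for a site-packages directory.
sitedir = tempfile.mkdtemp()

# 'foo' exists, so its entry in the .pth file will be honoured.
os.mkdir(os.path.join(sitedir, 'foo'))

with open(os.path.join(sitedir, 'foo.pth'), 'w') as pth:
    pth.write('# foo package configuration\n')  # comment line, skipped
    pth.write('foo\n')                          # existing item, added
    pth.write('bletch\n')                       # missing item, not added

site.addsitedir(sitedir)

foo_added = os.path.join(sitedir, 'foo') in sys.path
bletch_added = os.path.join(sitedir, 'bletch') in sys.path
```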
getsitepackages()~
Returns a list containing all global site-packages directories
(and possibly site-python).
.. versionadded:: 2.7
getuserbase()~
Returns the "user base" directory path.
The "user base" directory can be used to store data. If the global
variable ``USER_BASE`` is not initialized yet, this function will also set
it.
.. versionadded:: 2.7
getusersitepackages()~
Returns the user-specific site-packages directory path.
If the global variable ``USER_SITE`` is not initialized yet, this
function will also set it.
.. versionadded:: 2.7
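A short sketch of the two user-directory helpers. The printed values depend on
the platform and on environment variables such as PYTHONUSERBASE, so no
particular output is shown.

```python
import site

# Each call returns the path and, if needed, initialises the matching
# module-level variable as a side effect.
user_base = site.getuserbase()
user_site = site.getusersitepackages()

print(user_base)
print(user_site)
```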
.. XXX Update documentation
.. XXX document python -m site --user-base --user-site
==============================================================================
*py2stdlib-smtpd*
smtpd~
:synopsis: An SMTP server implementation in Python.
This module offers several classes to implement SMTP servers. One is a generic
do-nothing implementation, which can be overridden, while the other two offer
specific mail-sending strategies.
SMTPServer Objects
------------------
SMTPServer(localaddr, remoteaddr)~
Create a new SMTPServer object, which binds to local address
{localaddr}. It will treat {remoteaddr} as an upstream SMTP relayer. It
inherits from asyncore.dispatcher, and so will insert itself into
asyncore (|py2stdlib-asyncore|)'s event loop on instantiation.
process_message(peer, mailfrom, rcpttos, data)~
Raise NotImplementedError exception. Override this in subclasses to
do something useful with this message. Whatever was passed in the
constructor as {remoteaddr} will be available as the _remoteaddr
attribute. {peer} is the remote host's address, {mailfrom} is the envelope
originator, {rcpttos} are the envelope recipients and {data} is a string
containing the contents of the e-mail (which should be in RFC 2822
format).
DebuggingServer Objects
-----------------------
DebuggingServer(localaddr, remoteaddr)~
Create a new debugging server. Arguments are as per SMTPServer.
Messages will be discarded, and printed on stdout.
PureProxy Objects
-----------------
PureProxy(localaddr, remoteaddr)~
Create a new pure proxy server. Arguments are as per SMTPServer.
Everything will be relayed to {remoteaddr}. Note that running this has a good
chance to make you into an open relay, so please be careful.
MailmanProxy Objects
--------------------
MailmanProxy(localaddr, remoteaddr)~
Create a new pure proxy server. Arguments are as per SMTPServer.
Everything will be relayed to {remoteaddr}, unless the local mailman configuration
knows about an address, in which case it will be handled via mailman. Note that
running this has a good chance to make you into an open relay, so please be
careful.
==============================================================================
*py2stdlib-smtplib*
smtplib~
:synopsis: SMTP protocol client (requires sockets).
The smtplib (|py2stdlib-smtplib|) module defines an SMTP client session object that can be used
to send mail to any Internet machine with an SMTP or ESMTP listener daemon. For
details of SMTP and ESMTP operation, consult RFC 821 (Simple Mail Transfer
Protocol) and RFC 1869 (SMTP Service Extensions).
SMTP([host[, port[, local_hostname[, timeout]]]])~
An SMTP instance encapsulates an SMTP connection. It has methods
that support a full repertoire of SMTP and ESMTP operations. If the optional
host and port parameters are given, the SMTP connect method is called
with those parameters during initialization. An SMTPConnectError is
raised if the specified host doesn't respond correctly. The optional
{timeout} parameter specifies a timeout in seconds for blocking operations
like the connection attempt (if not specified, the global default timeout
setting will be used).
For normal use, you should only require the initialization/connect,
sendmail, and quit methods. An example is included below.
.. versionchanged:: 2.6
{timeout} was added.
SMTP_SSL([host[, port[, local_hostname[, keyfile[, certfile[, timeout]]]]]])~
An SMTP_SSL instance behaves exactly the same as instances of
SMTP. SMTP_SSL should be used for situations where SSL is
required from the beginning of the connection and using starttls is
not appropriate. If {host} is not specified, the local host is used. If
{port} is omitted, the standard SMTP-over-SSL port (465) is used. {keyfile}
and {certfile} are also optional, and can contain a PEM formatted private key
and certificate chain file for the SSL connection. The optional {timeout}
parameter specifies a timeout in seconds for blocking operations like the
connection attempt (if not specified, the global default timeout setting
will be used).
.. versionchanged:: 2.6
{timeout} was added.
LMTP([host[, port[, local_hostname]]])~
The LMTP protocol, which is very similar to ESMTP, is heavily based on the
standard SMTP client. It's common to use Unix sockets for LMTP, so our connect
method must support that as well as a regular host:port server. To specify a
Unix socket, you must use an absolute path for {host}, starting with a '/'.
Authentication is supported, using the regular SMTP mechanism. When using a Unix
socket, LMTP generally doesn't support or require any authentication, but your
mileage might vary.
.. versionadded:: 2.6
A nice selection of exceptions is defined as well:
SMTPException~
Base exception class for all exceptions raised by this module.
SMTPServerDisconnected~
This exception is raised when the server unexpectedly disconnects, or when an
attempt is made to use the SMTP instance before connecting it to a
server.
SMTPResponseException~
Base class for all exceptions that include an SMTP error code. These exceptions
are generated in some instances when the SMTP server returns an error code. The
error code is stored in the smtp_code attribute of the error, and the
smtp_error attribute is set to the error message.
SMTPSenderRefused~
Sender address refused. In addition to the attributes set on all
SMTPResponseException exceptions, this sets 'sender' to the string that
the SMTP server refused.
SMTPRecipientsRefused~
All recipient addresses refused. The errors for each recipient are accessible
through the attribute recipients, which is a dictionary of exactly the
same sort as SMTP.sendmail returns.
SMTPDataError~
The SMTP server refused to accept the message data.
SMTPConnectError~
Error occurred during establishment of a connection with the server.
SMTPHeloError~
The server refused our ``HELO`` message.
SMTPAuthenticationError~
SMTP authentication went wrong. Most probably the server didn't accept the
username/password combination provided.
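The exception hierarchy above lends itself to layered handling. The following
sketch builds exception objects by hand instead of talking to a server; the
``classify`` helper, the address and the message texts are illustrative
assumptions.

```python
import smtplib


def classify(exc):
    """Describe an exception using the class hierarchy of this module."""
    if isinstance(exc, smtplib.SMTPResponseException):
        # These carry the server's numeric code and message.
        return 'server replied %d: %s' % (exc.smtp_code, exc.smtp_error)
    if isinstance(exc, smtplib.SMTPException):
        return 'other SMTP failure'
    return 'not an SMTP error'


refused = smtplib.SMTPSenderRefused(550, 'Sender address rejected',
                                    'spam@example.com')
dropped = smtplib.SMTPServerDisconnected('connection lost')

refused_text = classify(refused)
dropped_text = classify(dropped)
```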
.. seealso::
RFC 821 - Simple Mail Transfer Protocol
Protocol definition for SMTP. This document covers the model, operating
procedure, and protocol details for SMTP.
RFC 1869 - SMTP Service Extensions
Definition of the ESMTP extensions for SMTP. This describes a framework for
extending SMTP with new commands, supporting dynamic discovery of the commands
provided by the server, and defines a few additional commands.
SMTP Objects
------------
An SMTP instance has the following methods:
SMTP.set_debuglevel(level)~
Set the debug output level. A true value for {level} results in debug messages
for connection and for all messages sent to and received from the server.
SMTP.connect([host[, port]])~
Connect to a host on a given port. The defaults are to connect to the local
host at the standard SMTP port (25). If the hostname ends with a colon (``':'``)
followed by a number, that suffix will be stripped off and the number
interpreted as the port number to use. This method is automatically invoked by
the constructor if a host is specified during instantiation.
SMTP.docmd(cmd[, argstring])~
Send a command {cmd} to the server. The optional argument {argstring} is simply
concatenated to the command, separated by a space.
This returns a 2-tuple composed of a numeric response code and the actual
response line (multiline responses are joined into one long line.)
In normal operation it should not be necessary to call this method explicitly.
It is used to implement other methods and may be useful for testing private
extensions.
If the connection to the server is lost while waiting for the reply,
SMTPServerDisconnected will be raised.
SMTP.helo([hostname])~
Identify yourself to the SMTP server using ``HELO``. The hostname argument
defaults to the fully qualified domain name of the local host.
The message returned by the server is stored as the helo_resp attribute
of the object.
In normal operation it should not be necessary to call this method explicitly.
It will be implicitly called by sendmail when necessary.
SMTP.ehlo([hostname])~
Identify yourself to an ESMTP server using ``EHLO``. The hostname argument
defaults to the fully qualified domain name of the local host. Examine the
response for ESMTP options and store them for use by has_extn.
Also sets several informational attributes: the message returned by
the server is stored as the ehlo_resp attribute, does_esmtp
is set to true or false depending on whether the server supports ESMTP, and
esmtp_features will be a dictionary containing the names of the
SMTP service extensions this server supports, and their
parameters (if any).
Unless you wish to use has_extn before sending mail, it should not be
necessary to call this method explicitly. It will be implicitly called by
sendmail when necessary.
SMTP.ehlo_or_helo_if_needed()~
This method calls ehlo and/or helo if there has been no
previous ``EHLO`` or ``HELO`` command this session. It tries ESMTP ``EHLO``
first.
SMTPHeloError
The server didn't reply properly to the ``HELO`` greeting.
.. versionadded:: 2.6
SMTP.has_extn(name)~
Return True if {name} is in the set of SMTP service extensions returned
by the server, False otherwise. Case is ignored.
SMTP.verify(address)~
Check the validity of an address on this server using SMTP ``VRFY``. Returns a
tuple consisting of code 250 and a full RFC 822 address (including human
name) if the user address is valid. Otherwise returns an SMTP error code of 400
or greater and an error string.
.. note:: >
Many sites disable SMTP ``VRFY`` in order to foil spammers.
<
SMTP.login(user, password)~
Log in on an SMTP server that requires authentication. The arguments are the
username and the password to authenticate with. If there has been no previous
``EHLO`` or ``HELO`` command this session, this method tries ESMTP ``EHLO``
first. This method will return normally if the authentication was successful, or
may raise the following exceptions:
SMTPHeloError
The server didn't reply properly to the ``HELO`` greeting.
SMTPAuthenticationError
The server didn't accept the username/password combination.
SMTPException
No suitable authentication method was found.
SMTP.starttls([keyfile[, certfile]])~
Put the SMTP connection in TLS (Transport Layer Security) mode. All SMTP
commands that follow will be encrypted. You should then call ehlo
again.
If {keyfile} and {certfile} are provided, these are passed to the socket (|py2stdlib-socket|)
module's ssl (|py2stdlib-ssl|) function.
If there has been no previous ``EHLO`` or ``HELO`` command this session,
this method tries ESMTP ``EHLO`` first.
.. versionchanged:: 2.6
SMTPHeloError
The server didn't reply properly to the ``HELO`` greeting.
SMTPException
The server does not support the STARTTLS extension.
.. versionchanged:: 2.6
RuntimeError
SSL/TLS support is not available to your Python interpreter.
SMTP.sendmail(from_addr, to_addrs, msg[, mail_options, rcpt_options])~
Send mail. The required arguments are an RFC 822 from-address string, a list
of RFC 822 to-address strings (a bare string will be treated as a list with 1
address), and a message string. The caller may pass a list of ESMTP options
(such as ``8bitmime``) to be used in ``MAIL FROM`` commands as {mail_options}.
ESMTP options (such as ``DSN`` commands) that should be used with all ``RCPT``
commands can be passed as {rcpt_options}. (If you need to use different ESMTP
options to different recipients you have to use the low-level methods such as
mail, rcpt and data to send the message.)
.. note:: >
The {from_addr} and {to_addrs} parameters are used to construct the message
envelope used by the transport agents. The SMTP class does not modify the
message headers in any way.
<
If there has been no previous ``EHLO`` or ``HELO`` command this session, this
method tries ESMTP ``EHLO`` first. If the server does ESMTP, message size and
each of the specified options will be passed to it (if the option is in the
feature set the server advertises). If ``EHLO`` fails, ``HELO`` will be tried
and ESMTP options suppressed.
This method will return normally if the mail is accepted for at least one
recipient. Otherwise it will throw an exception. That is, if this method does
not throw an exception, then someone should get your mail. If this method does
not throw an exception, it returns a dictionary, with one entry for each
recipient that was refused. Each entry contains a tuple of the SMTP error code
and the accompanying error message sent by the server.
This method may raise the following exceptions:
SMTPRecipientsRefused
All recipients were refused. Nobody got the mail. The recipients
attribute of the exception object is a dictionary with information about the
refused recipients (like the one returned when at least one recipient was
accepted).
SMTPHeloError
The server didn't reply properly to the ``HELO`` greeting.
SMTPSenderRefused
The server didn't accept the {from_addr}.
SMTPDataError
The server replied with an unexpected error code (other than a refusal of a
recipient).
Unless otherwise noted, the connection will be open even after an exception is
raised.
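The two ways sendmail reports refused recipients (a returned dictionary, or
the recipients attribute of SMTPRecipientsRefused) can be handled together. A
minimal sketch, assuming a hypothetical helper ``send_and_report`` that
receives an already connected SMTP object:

```python
import smtplib


def send_and_report(server, from_addr, to_addrs, msg):
    """Send msg and return (delivered_to_someone, refused_dict)."""
    try:
        # Normal return: at least one recipient was accepted; the
        # dictionary lists only those that were refused.
        refused = server.sendmail(from_addr, to_addrs, msg)
    except smtplib.SMTPRecipientsRefused as exc:
        # Nobody got the mail; the exception carries the same kind of
        # {address: (code, message)} dictionary.
        return False, exc.recipients
    return True, refused
```

A caller might log each entry of the returned dictionary, for instance as
``address: code message``, whichever path was taken.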
SMTP.quit()~
Terminate the SMTP session and close the connection. Return the result of
the SMTP ``QUIT`` command.
.. versionchanged:: 2.6
Return a value.
Low-level methods corresponding to the standard SMTP/ESMTP commands ``HELP``,
``RSET``, ``NOOP``, ``MAIL``, ``RCPT``, and ``DATA`` are also supported.
Normally these do not need to be called directly, so they are not documented
here. For details, consult the module code.
SMTP Example
------------
This example prompts the user for addresses needed in the message envelope ('To'
and 'From' addresses), and the message to be delivered. Note that the headers
to be included with the message must be included in the message as entered; this
example doesn't do any processing of the RFC 822 headers. In particular, the
'To' and 'From' addresses must be included in the message headers explicitly. :: >
import smtplib
def prompt(prompt):
    return raw_input(prompt).strip()
fromaddr = prompt("From: ")
toaddrs = prompt("To: ").split()
print "Enter message, end with ^D (Unix) or ^Z (Windows):"
# Add the From: and To: headers at the start!
msg = ("From: %s\r\nTo: %s\r\n\r\n"
% (fromaddr, ", ".join(toaddrs)))
while 1:
    try:
        line = raw_input()
    except EOFError:
        break
    if not line:
        break
    msg = msg + line
print "Message length is " + repr(len(msg))
server = smtplib.SMTP('localhost')
server.set_debuglevel(1)
server.sendmail(fromaddr, toaddrs, msg)
server.quit()
<
.. note::
In general, you will want to use the email (|py2stdlib-email|) package's features to
construct an email message, which you can then convert to a string and send
via sendmail; see email-examples.
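One possible shape of that approach is sketched below. The helper names
``build_message`` and ``send``, the addresses and the default host are
assumptions for illustration; ``send`` is defined but not called, since it
needs a reachable SMTP server.

```python
import smtplib
from email.mime.text import MIMEText


def build_message(fromaddr, toaddrs, subject, body):
    """Return a complete message, headers included, as a string."""
    msg = MIMEText(body)
    msg['From'] = fromaddr
    msg['To'] = ', '.join(toaddrs)
    msg['Subject'] = subject
    return msg.as_string()


def send(fromaddr, toaddrs, text, host='localhost'):
    """Deliver an already formatted message through the given host."""
    server = smtplib.SMTP(host)
    try:
        # The envelope addresses are passed separately from the headers.
        return server.sendmail(fromaddr, toaddrs, text)
    finally:
        server.quit()


text = build_message('a@example.com',
                     ['b@example.com', 'c@example.com'],
                     'Greetings', 'Hello')
```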
==============================================================================
*py2stdlib-sndhdr*
sndhdr~
:synopsis: Determine type of a sound file.
.. Based on comments in the module source file.
The sndhdr (|py2stdlib-sndhdr|) module provides utility functions which attempt to determine the type
of sound data which is in a file. When these functions are able to determine
what type of sound data is stored in a file, they return a tuple ``(type,
sampling_rate, channels, frames, bits_per_sample)``. The value for {type}
indicates the data type and will be one of the strings ``'aifc'``, ``'aiff'``,
``'au'``, ``'hcom'``, ``'sndr'``, ``'sndt'``, ``'voc'``, ``'wav'``, ``'8svx'``,
``'sb'``, ``'ub'``, or ``'ul'``. The {sampling_rate} will be either the actual
value or ``0`` if unknown or difficult to decode. Similarly, {channels} will be
either the number of channels or ``0`` if it cannot be determined or if the
value is difficult to decode. The value for {frames} will be either the number
of frames or ``-1``. The last item in the tuple, {bits_per_sample}, will either
be the sample size in bits or ``'A'`` for A-LAW or ``'U'`` for u-LAW.
what(filename)~
Determines the type of sound data stored in the file {filename} using
whathdr. If it succeeds, returns a tuple as described above, otherwise
``None`` is returned.
whathdr(filename)~
Determines the type of sound data stored in a file based on the file header.
The name of the file is given by {filename}. This function returns a tuple as
described above on success, or ``None``.
==============================================================================
*py2stdlib-socket*
socket~
:synopsis: Low-level networking interface.
This module provides access to the BSD {socket} interface. It is available on
all modern Unix systems, Windows, Mac OS X, BeOS, OS/2, and probably additional
platforms.
.. note::
Some behavior may be platform dependent, since calls are made to the operating
system socket APIs.
For an introduction to socket programming (in C), see the following papers: An
Introductory 4.3BSD Interprocess Communication Tutorial, by Stuart Sechrest and
An Advanced 4.3BSD Interprocess Communication Tutorial, by Samuel J. Leffler et
al, both in the UNIX Programmer's Manual, Supplementary Documents 1 (sections
PS1:7 and PS1:8). The platform-specific reference material for the various
socket-related system calls is also a valuable source of information on the
details of socket semantics. For Unix, refer to the manual pages; for Windows,
see the WinSock (or Winsock 2) specification. For IPv6-ready APIs, readers may
want to refer to RFC 3493, titled Basic Socket Interface Extensions for IPv6.
.. index:: object: socket
The Python interface is a straightforward transliteration of the Unix system
call and library interface for sockets to Python's object-oriented style: the
socket (|py2stdlib-socket|) function returns a socket object whose methods implement
the various socket system calls. Parameter types are somewhat higher-level than
in the C interface: as with read and write operations on Python
files, buffer allocation on receive operations is automatic, and buffer length
is implicit on send operations.
Socket addresses are represented as follows: A single string is used for the
AF_UNIX address family. A pair ``(host, port)`` is used for the
AF_INET address family, where {host} is a string representing either a
hostname in Internet domain notation like ``'daring.cwi.nl'`` or an IPv4 address
like ``'100.50.200.5'``, and {port} is an integral port number. For the
AF_INET6 address family, a four-tuple ``(host, port, flowinfo,
scopeid)`` is used, where {flowinfo} and {scopeid} represent the ``sin6_flowinfo``
and ``sin6_scope_id`` members of struct sockaddr_in6 in C. For
socket (|py2stdlib-socket|) module methods, {flowinfo} and {scopeid} can be omitted just for
backward compatibility. Note, however, that omitting {scopeid} can cause problems
in manipulating scoped IPv6 addresses. Other address families are currently not
supported. The address format required by a particular socket object is
automatically selected based on the address family specified when the socket
object was created.
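The AF_INET address format described above can be seen with a small sketch.
It assumes the loopback address ``'127.0.0.1'`` is available and uses port 0,
which asks the operating system to pick any free port; both choices are
illustrative and not part of the original text.

```python
import socket

# Create an IPv4 TCP socket and bind it to the loopback interface.
# Port 0 lets the operating system choose a free port (assumption of
# this sketch, so that it does not collide with a running service).
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('127.0.0.1', 0))

# For AF_INET the address is a (host, port) pair.
host, port = s.getsockname()
print('bound to %s port %d' % (host, port))
s.close()
```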
For IPv4 addresses, two special forms are accepted instead of a host address:
the empty string represents INADDR_ANY, and the string
``'<broadcast>'`` represents INADDR_BROADCAST. These forms are not
compatible with IPv6, so you may want to avoid
them if you intend to support IPv6 with your Python programs.
If you use a hostname in the {host} portion of an IPv4/v6 socket address, the
program may behave nondeterministically, as Python uses the first address
returned from the DNS resolution. The socket address will be resolved
differently into an actual IPv4/v6 address, depending on the results from DNS
resolution and/or the host configuration. For deterministic behavior use a
numeric address in the {host} portion.
.. versionadded:: 2.5
AF_NETLINK sockets are represented as pairs ``(pid, groups)``.
.. versionadded:: 2.6
Linux-only support for TIPC is also available using the AF_TIPC
address family. TIPC is an open, non-IP based network protocol designed
for use in clustered computer environments. Addresses are represented by a
tuple, and the fields depend on the address type. The general tuple form is
``(addr_type, v1, v2, v3 [, scope])``, where:
- {addr_type} is one of TIPC_ADDR_NAMESEQ, TIPC_ADDR_NAME, or
TIPC_ADDR_ID.
- {scope} is one of TIPC_ZONE_SCOPE, TIPC_CLUSTER_SCOPE, and
TIPC_NODE_SCOPE.
- If {addr_type} is TIPC_ADDR_NAME, then {v1} is the server type, {v2} is
the port identifier, and {v3} should be 0.
If {addr_type} is TIPC_ADDR_NAMESEQ, then {v1} is the server type, {v2}
is the lower port number, and {v3} is the upper port number.
If {addr_type} is TIPC_ADDR_ID, then {v1} is the node, {v2} is the
reference, and {v3} should be set to 0.
All errors raise exceptions. The normal exceptions for invalid argument types
and out-of-memory conditions can be raised; errors related to socket or address
semantics raise the error socket.error.
Non-blocking mode is supported through socket.setblocking. A
generalization of this based on timeouts is supported through
socket.settimeout.
The module socket (|py2stdlib-socket|) exports the following constants and functions:
error~
.. index:: module: errno
This exception is raised for socket-related errors. The accompanying value is
either a string telling what went wrong or a pair ``(errno, string)``
representing an error returned by a system call, similar to the value
accompanying os.error. See the module errno (|py2stdlib-errno|), which contains names
for the error codes defined by the underlying operating system.
.. versionchanged:: 2.6
socket.error is now a child class of IOError.
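A minimal sketch of handling socket.error and reading the error code from
it. It provokes a failure on purpose by binding two sockets to the same
loopback address; the address and the variable names are assumptions made
for the illustration.

```python
import errno
import socket

# The first socket takes a free port on the loopback interface.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(('127.0.0.1', 0))
first.listen(1)
address = first.getsockname()

# Binding a second socket to the same address fails with socket.error.
second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
code = None
try:
    second.bind(address)
except socket.error as exc:
    # The errno attribute holds the code from the failed system call.
    code = exc.errno
    print('bind failed: %s' % errno.errorcode.get(code, code))
finally:
    second.close()
    first.close()
```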
herror~
This exception is raised for address-related errors, i.e. for functions that use
{h_errno} in the C API, including gethostbyname_ex and
gethostbyaddr.
The accompanying value is a pair ``(h_errno, string)`` representing an error
returned by a library call. {string} represents the description of {h_errno}, as
returned by the hstrerror C function.
gaierror~
This exception is raised for address-related errors, for getaddrinfo and
getnameinfo. The accompanying value is a pair ``(error, string)``
representing an error returned by a library call. {string} represents the
description of {error}, as returned by the gai_strerror C function. The
{error} value will match one of the ``EAI_*`` constants defined in this
module.
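The following sketch shows one way gaierror can be caught. It passes
AI_NUMERICHOST so that no DNS lookup is attempted; the host name
``'host.invalid'`` is a made-up placeholder.

```python
import socket

caught = None
try:
    # AI_NUMERICHOST forbids name resolution, so a non-numeric host
    # is rejected immediately with gaierror.
    socket.getaddrinfo('host.invalid', 80, 0, 0, 0, socket.AI_NUMERICHOST)
except socket.gaierror as exc:
    caught = exc
    # exc.args is the (error, string) pair described above.
    print('lookup failed: %r' % (exc.args,))
```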
timeout~
This exception is raised when a timeout occurs on a socket which has had
timeouts enabled via a prior call to settimeout. The accompanying value
is a string whose value is currently always "timed out".
.. versionadded:: 2.3
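As an illustration, the timeout exception can be provoked by waiting for a
datagram that never arrives. The loopback address and the 0.2 second timeout
are arbitrary choices for this sketch.

```python
import socket

# A UDP socket on the loopback interface; nothing sends to it.
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(('127.0.0.1', 0))
s.settimeout(0.2)

timed_out = False
try:
    s.recvfrom(1024)
except socket.timeout:
    # Raised because no data arrived within the timeout period.
    timed_out = True
finally:
    s.close()
```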
AF_UNIX~
AF_INET
AF_INET6
These constants represent the address (and protocol) families, used for the
first argument to socket (|py2stdlib-socket|). If the AF_UNIX constant is not
defined then this protocol is unsupported.
SOCK_STREAM~
SOCK_DGRAM
SOCK_RAW
SOCK_RDM
SOCK_SEQPACKET
These constants represent the socket types, used for the second argument to
socket (|py2stdlib-socket|). (Only SOCK_STREAM and SOCK_DGRAM appear to be
generally useful.)
SO_*~
SOMAXCONN
MSG_*
SOL_*
IPPROTO_*
IPPORT_*
INADDR_*
IP_*
IPV6_*
EAI_*
AI_*
NI_*
TCP_*
Many constants of these forms, documented in the Unix documentation on sockets
and/or the IP protocol, are also defined in the socket module. They are
generally used in arguments to the setsockopt and getsockopt
methods of socket objects. In most cases, only those symbols that are defined
in the Unix header files are defined; for a few symbols, default values are
provided.
SIO_*~
RCVALL_*
Constants for Windows' WSAIoctl(). The constants are used as arguments to the
ioctl method of socket objects.
.. versionadded:: 2.6
TIPC_*~
TIPC related constants, matching the ones exported by the C socket API. See
the TIPC documentation for more information.
.. versionadded:: 2.6
has_ipv6~
This constant contains a boolean value which indicates if IPv6 is supported on
this platform.
.. versionadded:: 2.3
create_connection(address[, timeout[, source_address]])~
Convenience function. Connect to {address} (a 2-tuple ``(host, port)``),
and return the socket object. Passing the optional {timeout} parameter will
set the timeout on the socket instance before attempting to connect. If no
{timeout} is supplied, the global default timeout setting returned by
getdefaulttimeout is used.
If supplied, {source_address} must be a 2-tuple ``(host, port)`` for the
socket to bind to as its source address before connecting. If host or port
are '' or 0 respectively the OS default behavior will be used.
.. versionadded:: 2.6
.. versionchanged:: 2.7
{source_address} was added.
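A small sketch of create_connection against a listener created in the same
process. The loopback address, the OS-chosen port and the 2.0 second timeout
are assumptions made for the example.

```python
import socket

# A throwaway listener on the loopback interface.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)
server_address = listener.getsockname()

# Connect with a 2.0 second timeout; the timeout stays set on the socket.
client = socket.create_connection(server_address, 2.0)
conn, peer = listener.accept()

client_timeout = client.gettimeout()
same_endpoint = (peer == client.getsockname())

conn.close()
client.close()
listener.close()
```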
getaddrinfo(host, port, family=0, socktype=0, proto=0, flags=0)~
Translate the {host}/{port} argument into a sequence of 5-tuples that contain
all the necessary arguments for creating a socket connected to that service.
{host} is a domain name, a string representation of an IPv4/v6 address
or ``None``. {port} is a string service name such as ``'http'``, a numeric
port number or ``None``. By passing ``None`` as the value of {host}
and {port}, you can pass ``NULL`` to the underlying C API.
The {family}, {socktype} and {proto} arguments can be optionally specified
in order to narrow the list of addresses returned. Passing zero as a
value for each of these arguments selects the full range of results.
The {flags} argument can be one or several of the ``AI_*`` constants,
and will influence how results are computed and returned.
For example, AI_NUMERICHOST will disable domain name resolution
and will raise an error if {host} is a domain name.
The function returns a list of 5-tuples with the following structure:
``(family, socktype, proto, canonname, sockaddr)``
In these tuples, {family}, {socktype}, {proto} are all integers and are
meant to be passed to the socket (|py2stdlib-socket|) function. {canonname} will be
a string representing the canonical name of the {host} if
AI_CANONNAME is part of the {flags} argument; else {canonname}
will be empty. {sockaddr} is a tuple describing a socket address, whose
format depends on the returned {family} (a ``(address, port)`` 2-tuple for
AF_INET, a ``(address, port, flow info, scope id)`` 4-tuple for
AF_INET6), and is meant to be passed to the socket.connect
method.
The following example fetches address information for a hypothetical TCP
connection to ``www.python.org`` on port 80 (results may differ on your
system if IPv6 isn't enabled):: >
>>> socket.getaddrinfo("www.python.org", 80, 0, 0, socket.SOL_TCP)
[(2, 1, 6, '', ('82.94.164.162', 80)),
(10, 1, 6, '', ('2001:888:2000:d::a2', 80, 0, 0))]
<
.. versionadded:: 2.2
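For a result that does not depend on DNS, a numeric address can be passed
together with AI_NUMERICHOST. The sketch below unpacks the returned 5-tuples;
the address and port are illustrative.

```python
import socket

results = socket.getaddrinfo('127.0.0.1', 80, socket.AF_INET,
                             socket.SOCK_STREAM, 0, socket.AI_NUMERICHOST)

# Each entry is (family, socktype, proto, canonname, sockaddr).
for family, socktype, proto, canonname, sockaddr in results:
    print('family=%d sockaddr=%r' % (family, sockaddr))
```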
getfqdn([name])~
Return a fully qualified domain name for {name}. If {name} is omitted or empty,
it is interpreted as the local host. To find the fully qualified name, the
hostname returned by gethostbyaddr is checked, followed by aliases for the
host, if available. The first name which includes a period is selected. In
case no fully qualified domain name is available, the hostname as returned by
gethostname is returned.
.. versionadded:: 2.0
gethostbyname(hostname)~
Translate a host name to IPv4 address format. The IPv4 address is returned as a
string, such as ``'100.50.200.5'``. If the host name is an IPv4 address itself
it is returned unchanged. See gethostbyname_ex for a more complete
interface. gethostbyname does not support IPv6 name resolution, and
getaddrinfo should be used instead for IPv4/v6 dual stack support.
gethostbyname_ex(hostname)~
Translate a host name to IPv4 address format, extended interface. Return a
triple ``(hostname, aliaslist, ipaddrlist)`` where {hostname} is the primary
host name of the requested host, {aliaslist} is a (possibly
empty) list of alternative host names for the same address, and {ipaddrlist} is
a list of IPv4 addresses for the same interface on the same host (often but not
always a single address). gethostbyname_ex does not support IPv6 name
resolution, and getaddrinfo should be used instead for IPv4/v6 dual
stack support.
gethostname()~
Return a string containing the hostname of the machine where the Python
interpreter is currently executing.
If you want to know the current machine's IP address, you may want to use
``gethostbyname(gethostname())``. This operation assumes that there is a
valid address-to-host mapping for the host, and the assumption does not
always hold.
Note: gethostname doesn't always return the fully qualified domain
name; use ``getfqdn()`` (see above).
gethostbyaddr(ip_address)~
Return a triple ``(hostname, aliaslist, ipaddrlist)`` where {hostname} is the
primary host name responding to the given {ip_address}, {aliaslist} is a
(possibly empty) list of alternative host names for the same address, and
{ipaddrlist} is a list of IPv4/v6 addresses for the same interface on the same
host (most likely containing only a single address). To find the fully qualified
domain name, use the function getfqdn. gethostbyaddr supports
both IPv4 and IPv6.
getnameinfo(sockaddr, flags)~
Translate a socket address {sockaddr} into a 2-tuple ``(host, port)``. Depending
on the settings of {flags}, the result can contain a fully-qualified domain name
or numeric address representation in {host}. Similarly, {port} can contain a
string port name or a numeric port number.
.. versionadded:: 2.2
getprotobyname(protocolname)~
Translate an Internet protocol name (for example, ``'icmp'``) to a constant
suitable for passing as the (optional) third argument to the socket (|py2stdlib-socket|)
function. This is usually only needed for sockets opened in "raw" mode
(SOCK_RAW); for the normal socket modes, the correct protocol is chosen
automatically if the protocol is omitted or zero.
getservbyname(servicename[, protocolname])~
Translate an Internet service name and protocol name to a port number for that
service. The optional protocol name, if given, should be ``'tcp'`` or
``'udp'``, otherwise any protocol will match.
getservbyport(port[, protocolname])~
Translate an Internet port number and protocol name to a service name for that
service. The optional protocol name, if given, should be ``'tcp'`` or
``'udp'``, otherwise any protocol will match.
socket([family[, type[, proto]]])~
Create a new socket using the given address family, socket type and protocol
number. The address family should be AF_INET (the default),
AF_INET6 or AF_UNIX. The socket type should be
SOCK_STREAM (the default), SOCK_DGRAM or perhaps one of the
other ``SOCK_`` constants. The protocol number is usually zero and may be
omitted in that case.
socketpair([family[, type[, proto]]])~
Build a pair of connected socket objects using the given address family, socket
type, and protocol number. Address family, socket type, and protocol number are
as for the socket (|py2stdlib-socket|) function above. The default family is AF_UNIX
if defined on the platform; otherwise, the default is AF_INET.
Availability: Unix.
.. versionadded:: 2.4
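A brief sketch of socketpair, assuming a platform where it is available.
Both ends live in the same process here, which is enough to show that the
two sockets are connected to each other.

```python
import socket

# a and b are two ends of the same connection.
a, b = socket.socketpair()

a.sendall(b'ping')
received = b.recv(4)

b.sendall(b'pong')
reply = a.recv(4)

a.close()
b.close()
```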
fromfd(fd, family, type[, proto])~
Duplicate the file descriptor {fd} (an integer as returned by a file object's
fileno method) and build a socket object from the result. Address
family, socket type and protocol number are as for the socket (|py2stdlib-socket|) function
above. The file descriptor should refer to a socket, but this is not checked ---
subsequent operations on the object may fail if the file descriptor is invalid.
This function is rarely needed, but can be used to get or set socket options on
a socket passed to a program as standard input or output (such as a server
started by the Unix inet daemon). The socket is assumed to be in blocking mode.
Availability: Unix.
ntohl(x)~
Convert 32-bit positive integers from network to host byte order. On machines
where the host byte order is the same as network byte order, this is a no-op;
otherwise, it performs a 4-byte swap operation.
ntohs(x)~
Convert 16-bit positive integers from network to host byte order. On machines
where the host byte order is the same as network byte order, this is a no-op;
otherwise, it performs a 2-byte swap operation.
htonl(x)~
Convert 32-bit positive integers from host to network byte order. On machines
where the host byte order is the same as network byte order, this is a no-op;
otherwise, it performs a 4-byte swap operation.
htons(x)~
Convert 16-bit positive integers from host to network byte order. On machines
where the host byte order is the same as network byte order, this is a no-op;
otherwise, it performs a 2-byte swap operation.
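The byte-order functions can be cross-checked against the struct module, as
in this sketch. The value 0x1234 is arbitrary; ``'!H'`` packs in network
order and ``'=H'`` in native order.

```python
import socket
import struct

port = 0x1234

# Convert to network byte order, then pack using the native byte order.
network_order = socket.htons(port)
packed_by_htons = struct.pack('=H', network_order)

# struct can pack in network byte order directly.
packed_by_struct = struct.pack('!H', port)

# ntohs reverses htons; the same holds for the 32-bit pair.
round_trip = socket.ntohs(network_order)
round_trip_32 = socket.ntohl(socket.htonl(0x01020304))
```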
inet_aton(ip_string)~
Convert an IPv4 address from dotted-quad string format (for example,
'123.45.67.89') to 32-bit packed binary format, as a string four characters in
length. This is useful when conversing with a program that uses the standard C
library and needs objects of type struct in_addr, which is the C type
for the 32-bit packed binary this function returns.
inet_aton also accepts strings with fewer than three dots; see the
Unix manual page inet(3) for details.
If the IPv4 address string passed to this function is invalid,
socket.error will be raised. Note that exactly what is valid depends on
the underlying C implementation of inet_aton.
inet_aton does not support IPv6, and inet_pton should be used
instead for IPv4/v6 dual stack support.
inet_ntoa(packed_ip)~
Convert a 32-bit packed IPv4 address (a string four characters in length) to its
standard dotted-quad string representation (for example, '123.45.67.89'). This
is useful when conversing with a program that uses the standard C library and
needs objects of type struct in_addr, which is the C type for the
32-bit packed binary data this function takes as an argument.
If the string passed to this function is not exactly 4 bytes in length,
socket.error will be raised. inet_ntoa does not support IPv6, and
inet_ntop should be used instead for IPv4/v6 dual stack support.
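A short sketch of converting between the dotted-quad and packed forms, using
the example address from the text above. The three-byte input at the end is
deliberately invalid.

```python
import socket

# Dotted-quad text to 4 packed bytes, and back again.
packed = socket.inet_aton('123.45.67.89')
text = socket.inet_ntoa(packed)

# A packed value that is not exactly 4 bytes long is rejected.
rejected = False
try:
    socket.inet_ntoa(b'\x01\x02\x03')
except socket.error:
    rejected = True
```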
inet_pton(address_family, ip_string)~
Convert an IP address from its family-specific string format to a packed, binary
format. inet_pton is useful when a library or network protocol calls for
an object of type struct in_addr (similar to inet_aton) or
struct in6_addr.
Supported values for {address_family} are currently AF_INET and
AF_INET6. If the IP address string {ip_string} is invalid,
socket.error will be raised. Note that exactly what is valid depends on
both the value of {address_family} and the underlying implementation of
inet_pton.
Availability: Unix (maybe not all platforms).
.. versionadded:: 2.3
inet_ntop(address_family, packed_ip)~
Convert a packed IP address (a string of some number of characters) to its
standard, family-specific string representation (for example, ``'7.10.0.5'`` or
``'5aef:2b::8'``). inet_ntop is useful when a library or network protocol
returns an object of type struct in_addr (similar to inet_ntoa)
or struct in6_addr.
Supported values for {address_family} are currently AF_INET and
AF_INET6. If the string {packed_ip} is not the correct length for the
specified address family, ValueError will be raised. A
socket.error is raised for errors from the call to inet_ntop.
Availability: Unix (maybe not all platforms).
.. versionadded:: 2.3
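The family-aware conversions can be tried with the two example addresses
from the text above. This sketch assumes a platform where inet_pton and
inet_ntop are available.

```python
import socket

# IPv6 text to 16 packed bytes, and back to text.
packed6 = socket.inet_pton(socket.AF_INET6, '5aef:2b::8')
text6 = socket.inet_ntop(socket.AF_INET6, packed6)

# The same functions handle IPv4 when given AF_INET.
packed4 = socket.inet_pton(socket.AF_INET, '7.10.0.5')
text4 = socket.inet_ntop(socket.AF_INET, packed4)
```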
getdefaulttimeout()~
Return the default timeout in seconds (as a float) for new socket objects. A value
of ``None`` indicates that new socket objects have no timeout. When the socket
module is first imported, the default is ``None``.
.. versionadded:: 2.3
setdefaulttimeout(timeout)~
Set the default timeout in seconds (as a float) for new socket objects. A value of
``None`` indicates that new socket objects have no timeout. When the socket
module is first imported, the default is ``None``.
.. versionadded:: 2.3
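A sketch of how the default timeout is inherited by new sockets. The value
5.0 is arbitrary, and the previous default is restored afterwards because
the setting is global to the module.

```python
import socket

previous = socket.getdefaulttimeout()
socket.setdefaulttimeout(5.0)
try:
    # Sockets created from now on start with the default timeout.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    inherited = s.gettimeout()
    s.close()
finally:
    # Restore the old default so other code is not affected.
    socket.setdefaulttimeout(previous)

restored = socket.getdefaulttimeout()
```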
SocketType~
This is a Python type object that represents the socket object type. It is the
same as ``type(socket(...))``.
.. seealso::
Module SocketServer (|py2stdlib-socketserver|)
Classes that simplify writing network servers.
Socket Objects
--------------
Socket objects have the following methods. Except for makefile these
correspond to Unix system calls applicable to sockets.
socket.accept()~
Accept a connection. The socket must be bound to an address and listening for
connections. The return value is a pair ``(conn, address)`` where {conn} is a
{new} socket object usable to send and receive data on the connection, and
{address} is the address bound to the socket on the other end of the connection.
socket.bind(address)~
Bind the socket to {address}. The socket must not already be bound. (The format
of {address} depends on the address family --- see above.)
.. note:: >
This method has historically accepted a pair of parameters for AF_INET
addresses instead of only a tuple. This was never intentional and is no longer
available in Python 2.0 and later.
<
socket.close()~
Close the socket. All future operations on the socket object will fail. The
remote end will receive no more data (after queued data is flushed). Sockets are
automatically closed when they are garbage-collected.
socket.connect(address)~
Connect to a remote socket at {address}. (The format of {address} depends on the
address family --- see above.)
.. note:: >
This method has historically accepted a pair of parameters for AF_INET
addresses instead of only a tuple. This was never intentional and is no longer
available in Python 2.0 and later.
<
socket.connect_ex(address)~
Like ``connect(address)``, but return an error indicator instead of raising an
exception for errors returned by the C-level connect call (other
problems, such as "host not found," can still raise exceptions). The error
indicator is ``0`` if the operation succeeded, otherwise the value of the
errno (|py2stdlib-errno|) variable. This is useful to support, for example, asynchronous
connects.
.. note:: >
This method has historically accepted a pair of parameters for AF_INET
addresses instead of only a tuple. This was never intentional and is no longer
available in Python 2.0 and later.
<
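As a sketch, connect_ex can be checked for a zero return value instead of
wrapping the call in a try statement. The listener on the loopback interface
exists only to give the example something to connect to.

```python
import os
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Returns 0 on success, otherwise an errno value.
status = client.connect_ex(listener.getsockname())
if status == 0:
    print('connected')
else:
    print('connect failed: %s' % os.strerror(status))

client.close()
listener.close()
```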
socket.fileno()~
Return the socket's file descriptor (a small integer). This is useful with
select.select.
Under Windows the small integer returned by this method cannot be used where a
file descriptor can be used (such as os.fdopen). Unix does not have
this limitation.
socket.getpeername()~
Return the remote address to which the socket is connected. This is useful to
find out the port number of a remote IPv4/v6 socket, for instance. (The format
of the address returned depends on the address family --- see above.) On some
systems this function is not supported.
socket.getsockname()~
Return the socket's own address. This is useful to find out the port number of
an IPv4/v6 socket, for instance. (The format of the address returned depends on
the address family --- see above.)
socket.getsockopt(level, optname[, buflen])~
Return the value of the given socket option (see the Unix man page
getsockopt(2)). The needed symbolic constants (``SO_*`` etc.)
are defined in this module. If {buflen} is absent, an integer option is assumed
and its integer value is returned by the function. If {buflen} is present, it
specifies the maximum length of the buffer used to receive the option in, and
this buffer is returned as a string. It is up to the caller to decode the
contents of the buffer (see the optional built-in module struct (|py2stdlib-struct|) for a way
to decode C structures encoded as strings).
socket.ioctl(control, option)~
:platform: Windows
The ioctl method is a limited interface to the WSAIoctl system
interface. Please refer to the `Win32 documentation
<http://msdn.microsoft.com/en-us/library/ms741621%28VS.85%29.aspx>`_ for more
information.
On other platforms, the generic fcntl.fcntl and fcntl.ioctl
functions may be used; they accept a socket object as their first argument.
.. versionadded:: 2.6
socket.listen(backlog)~
Listen for connections made to the socket. The {backlog} argument specifies the
maximum number of queued connections and should be at least 1; the maximum value
is system-dependent (usually 5).
socket.makefile([mode[, bufsize]])~
.. index:: single: I/O control; buffering
Return a file object associated with the socket. (File objects are
described in the section on file objects in the built-in types
documentation.) The file object
references a dup()ped version of the socket file descriptor, so the
file object and socket object may be closed or garbage-collected independently.
The socket must be in blocking mode (it cannot have a timeout). The optional
{mode} and {bufsize} arguments are interpreted the same way as by the built-in
file function.
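A minimal sketch of reading lines through makefile. It uses socketpair
(Unix) so that both ends are in one process; the sending end is closed first
so that the reader sees end-of-file.

```python
import socket

a, b = socket.socketpair()
a.sendall(b'first line\nsecond line\n')
a.close()

# Wrap the receiving socket in a binary file object and read by line.
reader = b.makefile('rb')
lines = reader.readlines()
reader.close()
b.close()
```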
socket.recv(bufsize[, flags])~
Receive data from the socket. The return value is a string representing the
data received. The maximum amount of data to be received at once is specified
by {bufsize}. See the Unix manual page recv(2) for the meaning of
the optional argument {flags}; it defaults to zero.
.. note:: >
For best match with hardware and network realities, the value of {bufsize}
should be a relatively small power of 2, for example, 4096.
<
socket.recvfrom(bufsize[, flags])~
Receive data from the socket. The return value is a pair ``(string, address)``
where {string} is a string representing the data received and {address} is the
address of the socket sending the data. See the Unix manual page
recv(2) for the meaning of the optional argument {flags}; it defaults
to zero. (The format of {address} depends on the address family --- see above.)
socket.recvfrom_into(buffer[, nbytes[, flags]])~
Receive data from the socket, writing it into {buffer} instead of creating a
new string. The return value is a pair ``(nbytes, address)`` where {nbytes} is
the number of bytes received and {address} is the address of the socket sending
the data. See the Unix manual page recv(2) for the meaning of the
optional argument {flags}; it defaults to zero. (The format of {address}
depends on the address family --- see above.)
.. versionadded:: 2.5
socket.recv_into(buffer[, nbytes[, flags]])~
Receive up to {nbytes} bytes from the socket, storing the data into a buffer
rather than creating a new string. If {nbytes} is not specified (or 0),
receive up to the size available in the given buffer. Returns the number of
bytes received. See the Unix manual page recv(2) for the meaning
of the optional argument {flags}; it defaults to zero.
.. versionadded:: 2.5
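A sketch of recv_into with a preallocated bytearray; the 16-byte buffer size
is arbitrary, and socketpair (Unix) supplies a connected pair for the
demonstration.

```python
import socket

a, b = socket.socketpair()
a.sendall(b'hello')

# Receive directly into an existing buffer instead of a new string.
buf = bytearray(16)
count = b.recv_into(buf)
payload = bytes(buf[:count])

a.close()
b.close()
```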
socket.send(string[, flags])~
Send data to the socket. The socket must be connected to a remote socket. The
optional {flags} argument has the same meaning as for recv above.
Returns the number of bytes sent. Applications are responsible for checking that
all data has been sent; if only some of the data was transmitted, the
application needs to attempt delivery of the remaining data.
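The retry logic described above might look like the following sketch.
``send_fully`` is a hypothetical helper written for this example, not part
of the socket module, and socketpair (Unix) is used only to have a peer.

```python
import socket

def send_fully(sock, data):
    """Hypothetical helper: call send() until all of data is written."""
    total = 0
    while total < len(data):
        sent = sock.send(data[total:])
        if sent == 0:
            raise RuntimeError('socket connection broken')
        total += sent
    return total

a, b = socket.socketpair()
message = b'x' * 1000
sent_total = send_fully(a, message)
a.close()

# Read until the peer's close is seen as an empty result.
chunks = []
while True:
    chunk = b.recv(4096)
    if not chunk:
        break
    chunks.append(chunk)
b.close()
received = b''.join(chunks)
```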
socket.sendall(string[, flags])~
Send data to the socket. The socket must be connected to a remote socket. The
optional {flags} argument has the same meaning as for recv above.
Unlike send, this method continues to send data from {string} until
either all data has been sent or an error occurs. ``None`` is returned on
success. On error, an exception is raised, and there is no way to determine how
much data, if any, was successfully sent.
socket.sendto(string[, flags], address)~
Send data to the socket. The socket should not be connected to a remote socket,
since the destination socket is specified by {address}. The optional {flags}
argument has the same meaning as for recv above. Return the number of
bytes sent. (The format of {address} depends on the address family --- see
above.)
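A sketch of sendto paired with recvfrom on two UDP sockets in one process.
The loopback address and the 2.0 second safety timeout are assumptions of
the example.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(('127.0.0.1', 0))
receiver.settimeout(2.0)

# The sender is not connected; the destination is given per call.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sent = sender.sendto(b'hello', receiver.getsockname())

# recvfrom returns the data together with the sender's address.
data, address = receiver.recvfrom(1024)
sender_address = sender.getsockname()

sender.close()
receiver.close()
```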
socket.setblocking(flag)~
Set blocking or non-blocking mode of the socket: if {flag} is 0, the socket is
set to non-blocking, else to blocking mode. Initially all sockets are in
blocking mode. In non-blocking mode, if a recv call doesn't find any
data, or if a send call can't immediately dispose of the data, an
error exception is raised; in blocking mode, the calls block until they
can proceed. ``s.setblocking(0)`` is equivalent to ``s.settimeout(0.0)``;
``s.setblocking(1)`` is equivalent to ``s.settimeout(None)``.
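The equivalences above, and the error raised in non-blocking mode, can be
observed with a sketch like this one. It uses a UDP socket on the loopback
interface that has no data waiting.

```python
import errno
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(('127.0.0.1', 0))

s.setblocking(0)
nonblocking_timeout = s.gettimeout()

code = None
try:
    # Nothing has been sent to this socket, so recv cannot complete.
    s.recv(1024)
except socket.error as exc:
    code = exc.errno

s.setblocking(1)
blocking_timeout = s.gettimeout()
s.close()
```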
socket.settimeout(value)~
Set a timeout on blocking socket operations. The {value} argument can be a
nonnegative float expressing seconds, or ``None``. If a float is given,
subsequent socket operations will raise a timeout exception if the
timeout period {value} has elapsed before the operation has completed. Setting
a timeout of ``None`` disables timeouts on socket operations.
``s.settimeout(0.0)`` is equivalent to ``s.setblocking(0)``;
``s.settimeout(None)`` is equivalent to ``s.setblocking(1)``.
.. versionadded:: 2.3
socket.gettimeout()~
Return the timeout in seconds (as a float) associated with socket operations, or
``None`` if no timeout is set. This reflects the last call to
setblocking or settimeout.
.. versionadded:: 2.3
Some notes on socket blocking and timeouts: A socket object can be in one of
three modes: blocking, non-blocking, or timeout. Sockets are always created in
blocking mode. In blocking mode, operations block until complete or
the system returns an error (such as connection timed out). In
non-blocking mode, operations fail (with an error that is unfortunately
system-dependent) if they cannot be completed immediately. In timeout mode,
operations fail if they cannot be completed within the timeout specified for the
socket or if the system returns an error. The socket.setblocking
method is simply a shorthand for certain socket.settimeout calls.
Timeout mode internally sets the socket in non-blocking mode. The blocking and
timeout modes are shared between file descriptors and socket objects that refer
to the same network endpoint. A consequence of this is that file objects
returned by the socket.makefile method must only be used when the
socket is in blocking mode; in timeout or non-blocking mode file operations
that cannot be completed immediately will fail.
Note that the socket.connect operation is subject to the timeout
setting, and in general it is recommended to call socket.settimeout
before calling socket.connect or pass a timeout parameter to
create_connection. The system network stack may return a connection
timeout error of its own regardless of any Python socket timeout setting.
socket.setsockopt(level, optname, value)~
.. index:: module: struct
Set the value of the given socket option (see the Unix manual page
setsockopt(2)). The needed symbolic constants are defined in the
socket (|py2stdlib-socket|) module (``SO_*`` etc.). The value can be an integer or a
string representing a buffer. In the latter case it is up to the caller to
ensure that the string contains the proper bits (see the optional built-in
module struct (|py2stdlib-struct|) for a way to encode C structures as strings).
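A sketch of setting and reading back an integer option. SO_REUSEADDR is
chosen here only because it is widely available; the exact nonzero value
reported after setting it may vary by platform.

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# The option is off on a new socket.
before = s.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)

# Integer options take a plain integer value.
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
after = s.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)

s.close()
```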
socket.shutdown(how)~
Shut down one or both halves of the connection. If {how} is SHUT_RD,
further receives are disallowed. If {how} is SHUT_WR, further sends
are disallowed. If {how} is SHUT_RDWR, further sends and receives are
disallowed.
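A sketch of a half-close with SHUT_WR: one end signals that it has finished
sending while it can still receive a reply. socketpair (Unix) is assumed
here to keep both ends in one process.

```python
import socket

a, b = socket.socketpair()

a.sendall(b'last words')
# No more sends from a; b will see end-of-file after the data.
a.shutdown(socket.SHUT_WR)

chunks = []
while True:
    chunk = b.recv(1024)
    if not chunk:
        break
    chunks.append(chunk)
message = b''.join(chunks)

# The other direction is still open.
b.sendall(b'ack')
reply = a.recv(1024)

a.close()
b.close()
```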
Note that there are no methods read or write; use
socket.recv and socket.send without {flags} argument instead.
Socket objects also have these (read-only) attributes that correspond to the
values given to the socket (|py2stdlib-socket|) constructor.
socket.family~
The socket family.
.. versionadded:: 2.5
socket.type~
The socket type.
.. versionadded:: 2.5
socket.proto~
The socket protocol.
.. versionadded:: 2.5
Example
-------
Here are four minimal example programs using the TCP/IP protocol: a server that
echoes all data that it receives back (servicing only one client), and a client
using it. Note that a server must perform the sequence socket (|py2stdlib-socket|),
socket.bind, socket.listen, socket.accept (possibly
repeating the socket.accept to service more than one client), while a
client only needs the sequence socket (|py2stdlib-socket|), socket.connect. Also
note that the server does not call socket.send/socket.recv on the
socket it is listening on but on the new socket returned by
socket.accept.
The first two examples support IPv4 only. :: >
# Echo server program
import socket
HOST = ''                 # Symbolic name meaning all available interfaces
PORT = 50007              # Arbitrary non-privileged port
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((HOST, PORT))
s.listen(1)
conn, addr = s.accept()
print 'Connected by', addr
while 1:
    data = conn.recv(1024)
    if not data: break
    conn.sendall(data)    # sendall keeps sending until all data is written
conn.close()
<
::
# Echo client program
import socket
HOST = 'daring.cwi.nl' # The remote host
PORT = 50007 # The same port as used by the server
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST, PORT))
s.sendall('Hello, world')
data = s.recv(1024)
s.close()
print 'Received', repr(data)
The next two examples are identical to the above two, but support both IPv4 and
IPv6. The server side will listen to the first address family available (it
should listen to both instead). On most IPv6-ready systems, IPv6 will take
precedence and the server may not accept IPv4 traffic. The client side will try
to connect to all the addresses returned as a result of the name resolution, and
send traffic to the first one connected successfully. :: >
# Echo server program
import socket
import sys
HOST = None               # Symbolic name meaning all available interfaces
PORT = 50007              # Arbitrary non-privileged port
s = None
for res in socket.getaddrinfo(HOST, PORT, socket.AF_UNSPEC,
                              socket.SOCK_STREAM, 0, socket.AI_PASSIVE):
    af, socktype, proto, canonname, sa = res
    try:
        s = socket.socket(af, socktype, proto)
    except socket.error as msg:
        s = None
        continue
    try:
        s.bind(sa)
        s.listen(1)
    except socket.error as msg:
        s.close()
        s = None
        continue
    break
if s is None:
    print 'could not open socket'
    sys.exit(1)
conn, addr = s.accept()
print 'Connected by', addr
while 1:
    data = conn.recv(1024)
    if not data: break
    conn.sendall(data)
conn.close()
<
:: >
# Echo client program
import socket
import sys
HOST = 'daring.cwi.nl' # The remote host
PORT = 50007 # The same port as used by the server
s = None
for res in socket.getaddrinfo(HOST, PORT, socket.AF_UNSPEC, socket.SOCK_STREAM):
    af, socktype, proto, canonname, sa = res
    try:
        s = socket.socket(af, socktype, proto)
    except socket.error, msg:
        s = None
        continue
    try:
        s.connect(sa)
    except socket.error, msg:
        s.close()
        s = None
        continue
    break
if s is None:
    print 'could not open socket'
    sys.exit(1)
s.send('Hello, world')
data = s.recv(1024)
s.close()
print 'Received', repr(data)
<
The last example shows how to write a very simple network sniffer with raw
sockets on Windows. The example requires administrator privileges to modify
the interface:: >
import socket
# the public network interface
HOST = socket.gethostbyname(socket.gethostname())
# create a raw socket and bind it to the public interface
s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_IP)
s.bind((HOST, 0))
# Include IP headers
s.setsockopt(socket.IPPROTO_IP, socket.IP_HDRINCL, 1)
# receive all packets
s.ioctl(socket.SIO_RCVALL, socket.RCVALL_ON)
# receive a packet
print s.recvfrom(65535)
# disable promiscuous mode
s.ioctl(socket.SIO_RCVALL, socket.RCVALL_OFF)
==============================================================================
*py2stdlib-socketserver*
SocketServer~
:synopsis: A framework for network servers.
.. note::
The SocketServer (|py2stdlib-socketserver|) module has been renamed to socketserver in
Python 3.0. The 2to3 tool will automatically adapt imports when
converting your sources to 3.0.
The SocketServer (|py2stdlib-socketserver|) module simplifies the task of writing network servers.
There are four basic server classes: TCPServer uses the Internet TCP
protocol, which provides for continuous streams of data between the client and
server. UDPServer uses datagrams, which are discrete packets of
information that may arrive out of order or be lost while in transit. The more
infrequently used UnixStreamServer and UnixDatagramServer
classes are similar, but use Unix domain sockets; they're not available on
non-Unix platforms. For more details on network programming, consult a book
such as W. Richard Stevens' UNIX Network Programming or Ralph Davis's Win32
Network Programming.
These four classes process requests synchronously; each request must be
completed before the next request can be started. This isn't suitable if each
request takes a long time to complete, because it requires a lot of computation,
or because it returns a lot of data which the client is slow to process. The
solution is to create a separate process or thread to handle each request; the
ForkingMixIn and ThreadingMixIn mix-in classes can be used to
support asynchronous behaviour.
Creating a server requires several steps. First, you must create a request
handler class by subclassing the BaseRequestHandler class and
overriding its handle method; this method will process incoming
requests. Second, you must instantiate one of the server classes, passing it
the server's address and the request handler class. Finally, call the
handle_request or serve_forever method of the server object to
process one or many requests.
When inheriting from ThreadingMixIn for threaded connection behavior,
you should explicitly declare how you want your threads to behave on an abrupt
shutdown. The ThreadingMixIn class defines an attribute
{daemon_threads}, which indicates whether or not the server should wait for
thread termination. You should set the flag explicitly if you would like threads
to behave autonomously; the default is False, meaning that Python will
not exit until all threads created by ThreadingMixIn have exited.
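As a minimal sketch of setting the flag explicitly (the class name
DaemonThreadedTCPServer is hypothetical, and the import fallback is only there
so that the snippet also runs under the Python 3 module name):

```python
try:
    import SocketServer                     # Python 2 module name
except ImportError:
    import socketserver as SocketServer     # renamed in Python 3


class DaemonThreadedTCPServer(SocketServer.ThreadingMixIn,
                              SocketServer.TCPServer):
    # Handler threads become daemon threads, so the interpreter does not
    # wait for them when the main thread exits.
    daemon_threads = True


print(DaemonThreadedTCPServer.daemon_threads)
```

Leaving the attribute at its default keeps the behavior described above: the
interpreter waits for every handler thread before exiting.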
Server classes have the same external methods and attributes, no matter what
network protocol they use.
Server Creation Notes
---------------------
There are five classes in an inheritance diagram, four of which represent
synchronous servers of four types:: >
      +------------+
      | BaseServer |
      +------------+
            |
            v
      +-----------+        +------------------+
      | TCPServer |------->| UnixStreamServer |
      +-----------+        +------------------+
            |
            v
      +-----------+        +--------------------+
      | UDPServer |------->| UnixDatagramServer |
      +-----------+        +--------------------+
<
Note that UnixDatagramServer derives from UDPServer, not from
UnixStreamServer --- the only difference between an IP and a Unix
stream server is the address family, which is simply repeated in both Unix
server classes.
Forking and threading versions of each type of server can be created using the
ForkingMixIn and ThreadingMixIn mix-in classes. For instance,
a threading UDP server class is created as follows:: >
class ThreadingUDPServer(ThreadingMixIn, UDPServer): pass
<
The mix-in class must come first, since it overrides a method defined in
UDPServer. Setting the various member variables also changes the
behavior of the underlying server mechanism.
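The effect of the base class order can be inspected directly. The following is
a small sketch that uses inspect.getmro purely for illustration; the variable
name lookup_order is an assumption of this example:

```python
import inspect

try:
    import SocketServer                     # Python 2 module name
except ImportError:
    import socketserver as SocketServer     # renamed in Python 3


class ThreadingUDPServer(SocketServer.ThreadingMixIn, SocketServer.UDPServer):
    pass


# Attribute lookup walks the classes in this order, so a method defined in
# ThreadingMixIn is found before the one defined further down in UDPServer.
lookup_order = [cls.__name__ for cls in inspect.getmro(ThreadingUDPServer)]
print(lookup_order)
```

Because ThreadingMixIn is listed before UDPServer, its process_request is the
one that gets called.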
To implement a service, you must derive a class from BaseRequestHandler
and redefine its handle method. You can then run various versions of
the service by combining one of the server classes with your request handler
class. The request handler class must be different for datagram or stream
services. This can be hidden by using the handler subclasses
StreamRequestHandler or DatagramRequestHandler.
Of course, you still have to use your head! For instance, it makes no sense to
use a forking server if the service contains state in memory that can be
modified by different requests, since the modifications in the child process
would never reach the initial state kept in the parent process and passed to
each child. In this case, you can use a threading server, but you will probably
have to use locks to protect the integrity of the shared data.
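A sketch of the threading variant with a lock around shared state follows. The
names CountingHandler, CountingServer and ask are hypothetical, and the
server-side counter merely stands in for whatever state a real service keeps:

```python
import socket
import threading

try:
    import SocketServer                     # Python 2 module name
except ImportError:
    import socketserver as SocketServer     # renamed in Python 3


class CountingHandler(SocketServer.BaseRequestHandler):
    def handle(self):
        # Only one handler thread at a time may update the shared counter
        with self.server.counter_lock:
            self.server.counter += 1
            count = self.server.counter
        self.request.sendall(str(count).encode("ascii"))


class CountingServer(SocketServer.ThreadingMixIn, SocketServer.TCPServer):
    daemon_threads = True

    def __init__(self, address, handler_class):
        SocketServer.TCPServer.__init__(self, address, handler_class)
        self.counter = 0
        self.counter_lock = threading.Lock()


def ask(address):
    # Connect, read the counter value sent by the handler, and disconnect
    sock = socket.create_connection(address)
    try:
        return int(sock.recv(64).decode("ascii"))
    finally:
        sock.close()


server = CountingServer(("127.0.0.1", 0), CountingHandler)
server_thread = threading.Thread(target=server.serve_forever)
server_thread.daemon = True
server_thread.start()
replies = [ask(server.server_address) for _ in range(3)]
server.shutdown()
server.server_close()
print(replies)
```

With a forking server the same handler would increment a private copy of the
counter in each child, and the parent's value would never change.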
On the other hand, if you are building an HTTP server where all data is stored
externally (for instance, in the file system), a synchronous class will
essentially render the service "deaf" while one request is being handled --
which may be for a very long time if a client is slow to receive all the data it
has requested. Here a threading or forking server is appropriate.
In some cases, it may be appropriate to process part of a request synchronously,
but to finish processing in a forked child depending on the request data. This
can be implemented by using a synchronous server and doing an explicit fork in
the request handler class handle method.
Another approach to handling multiple simultaneous requests in an environment
that supports neither threads nor fork (or where these are too expensive
or inappropriate for the service) is to maintain an explicit table of partially
finished requests and to use select (|py2stdlib-select|) to decide which request to work on
next (or whether to handle a new incoming request). This is particularly
important for stream services where each client can potentially be connected for
a long time (if threads or subprocesses cannot be used). See asyncore (|py2stdlib-asyncore|) for
another way to manage this.
Server Objects
--------------
BaseServer~
This is the superclass of all Server objects in the module. It defines the
interface, given below, but does not implement most of the methods, which is
done in subclasses.
BaseServer.fileno()~
Return an integer file descriptor for the socket on which the server is
listening. This function is most commonly passed to select.select, to
allow monitoring multiple servers in the same process.
BaseServer.handle_request()~
Process a single request. This function calls the following methods in
order: get_request, verify_request, and
process_request. If the user-provided handle method of the
handler class raises an exception, the server's handle_error method
will be called. If no request is received within self.timeout
seconds, handle_timeout will be called and handle_request
will return.
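The timeout path can be sketched as follows. PatientServer and its timed_out
flag are hypothetical names used only for this illustration; no client ever
connects, so handle_request gives up after the timeout:

```python
try:
    import SocketServer                     # Python 2 module name
except ImportError:
    import socketserver as SocketServer     # renamed in Python 3


class PatientServer(SocketServer.TCPServer):
    timeout = 0.1        # seconds handle_request() waits for a request
    timed_out = False

    def handle_timeout(self):
        # Called by handle_request() when no request arrived in time
        self.timed_out = True


server = PatientServer(("127.0.0.1", 0), SocketServer.BaseRequestHandler)
server.handle_request()  # returns after the timeout period
server.server_close()
print(server.timed_out)
```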
BaseServer.serve_forever(poll_interval=0.5)~
Handle requests until an explicit shutdown request. Polls for
shutdown every {poll_interval} seconds.
BaseServer.shutdown()~
Tells the serve_forever loop to stop and waits until it does.
.. versionadded:: 2.6
BaseServer.address_family~
The family of protocols to which the server's socket belongs.
Common examples are socket.AF_INET and socket.AF_UNIX.
BaseServer.RequestHandlerClass~
The user-provided request handler class; an instance of this class is created
for each request.
BaseServer.server_address~
The address on which the server is listening. The format of addresses varies
depending on the protocol family; see the documentation for the socket module
for details. For Internet protocols, this is a tuple containing a string giving
the address, and an integer port number: ``('127.0.0.1', 80)``, for example.
BaseServer.socket~
The socket object on which the server will listen for incoming requests.
The server classes support the following class variables:
BaseServer.allow_reuse_address~
Whether the server will allow the reuse of an address. This defaults to
False, and can be set in subclasses to change the policy.
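For example, a subclass can switch the policy on. This is a sketch in which
the class name ReusableTCPServer is hypothetical; the socket option is read
back only to show that the setting reached the listening socket:

```python
import socket

try:
    import SocketServer                     # Python 2 module name
except ImportError:
    import socketserver as SocketServer     # renamed in Python 3


class ReusableTCPServer(SocketServer.TCPServer):
    # Ask the server to set SO_REUSEADDR before binding
    allow_reuse_address = True


server = ReusableTCPServer(("127.0.0.1", 0), SocketServer.BaseRequestHandler)
reuse_enabled = server.socket.getsockopt(socket.SOL_SOCKET,
                                         socket.SO_REUSEADDR) != 0
server.server_close()
print(reuse_enabled)
```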
BaseServer.request_queue_size~
The size of the request queue. If it takes a long time to process a single
request, any requests that arrive while the server is busy are placed into a
queue, up to request_queue_size requests. Once the queue is full,
further requests from clients will get a "Connection denied" error. The default
value is usually 5, but this can be overridden by subclasses.
BaseServer.socket_type~
The type of socket used by the server; socket.SOCK_STREAM and
socket.SOCK_DGRAM are two common values.
BaseServer.timeout~
Timeout duration, measured in seconds, or None if no timeout is
desired. If handle_request receives no incoming requests within the
timeout period, the handle_timeout method is called.
There are various server methods that can be overridden by subclasses of base
server classes like TCPServer; these methods aren't useful to external
users of the server object.
BaseServer.finish_request()~
Actually processes the request by instantiating RequestHandlerClass and
calling its handle method.
BaseServer.get_request()~
Must accept a request from the socket, and return a 2-tuple containing the {new}
socket object to be used to communicate with the client, and the client's
address.
BaseServer.handle_error(request, client_address)~
This function is called if the RequestHandlerClass's handle
method raises an exception. The default action is to print the traceback to
standard output and continue handling further requests.
BaseServer.handle_timeout()~
This function is called when the timeout attribute has been set to a
value other than None and the timeout period has passed with no
requests being received. The default action for forking servers is
to collect the status of any child processes that have exited, while
in threading servers this method does nothing.
BaseServer.process_request(request, client_address)~
Calls finish_request to create an instance of the
RequestHandlerClass. If desired, this function can create a new process
or thread to handle the request; the ForkingMixIn and
ThreadingMixIn classes do this.
BaseServer.server_activate()~
Called by the server's constructor to activate the server. The default behavior
just calls listen on the server's socket. May be overridden.
BaseServer.server_bind()~
Called by the server's constructor to bind the socket to the desired address.
May be overridden.
BaseServer.verify_request(request, client_address)~
Must return a Boolean value; if the value is True, the request will be
processed, and if it's False, the request will be denied. This function
can be overridden to implement access controls for a server. The default
implementation always returns True.
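A simple access control could look like the following sketch. The class name
LoopbackOnlyTCPServer and the rule itself are assumptions made for
illustration, and the method is called by hand here instead of through a real
connection:

```python
try:
    import SocketServer                     # Python 2 module name
except ImportError:
    import socketserver as SocketServer     # renamed in Python 3


class LoopbackOnlyTCPServer(SocketServer.TCPServer):
    def verify_request(self, request, client_address):
        # For AF_INET servers client_address is a (host, port) tuple
        return client_address[0] == "127.0.0.1"


# bind_and_activate=False keeps the sketch from opening a listening port
server = LoopbackOnlyTCPServer(("127.0.0.1", 0),
                               SocketServer.BaseRequestHandler,
                               bind_and_activate=False)
allowed = server.verify_request(None, ("127.0.0.1", 50007))
denied = server.verify_request(None, ("192.0.2.10", 50007))
server.server_close()
print("%s %s" % (allowed, denied))
```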
RequestHandler Objects
----------------------
The request handler class must define a new handle method, and can
override any of the following methods. A new instance is created for each
request.
RequestHandler.finish()~
Called after the handle method to perform any clean-up actions
required. The default implementation does nothing. If setup or
handle raise an exception, this function will not be called.
RequestHandler.handle()~
This function must do all the work required to service a request. The
default implementation does nothing. Several instance attributes are
available to it; the request is available as self.request; the client
address as self.client_address; and the server instance as
self.server, in case it needs access to per-server information.
The type of self.request is different for datagram or stream
services. For stream services, self.request is a socket object; for
datagram services, self.request is a pair of string and socket.
However, this can be hidden by using the request handler subclasses
StreamRequestHandler or DatagramRequestHandler, which
override the setup and finish methods, and provide
self.rfile and self.wfile attributes. self.rfile and
self.wfile can be read or written, respectively, to get the request
data or return data to the client.
RequestHandler.setup()~
Called before the handle method to perform any initialization actions
required. The default implementation does nothing.
Examples
--------
SocketServer.TCPServer Example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is the server side:: >
import SocketServer
class MyTCPHandler(SocketServer.BaseRequestHandler):
    """
    The RequestHandler class for our server.
    It is instantiated once per connection to the server, and must
    override the handle() method to implement communication to the
    client.
    """
    def handle(self):
        # self.request is the TCP socket connected to the client
        self.data = self.request.recv(1024).strip()
        print "%s wrote:" % self.client_address[0]
        print self.data
        # just send back the same data, but upper-cased
        self.request.send(self.data.upper())
if __name__ == "__main__":
    HOST, PORT = "localhost", 9999
    # Create the server, binding to localhost on port 9999
    server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)
    # Activate the server; this will keep running until you
    # interrupt the program with Ctrl-C
    server.serve_forever()
<
An alternative request handler class that makes use of streams (file-like
objects that simplify communication by providing the standard file interface):: >
class MyTCPHandler(SocketServer.StreamRequestHandler):
    def handle(self):
        # self.rfile is a file-like object created by the handler;
        # we can now use e.g. readline() instead of raw recv() calls
        self.data = self.rfile.readline().strip()
        print "%s wrote:" % self.client_address[0]
        print self.data
        # Likewise, self.wfile is a file-like object used to write back
        # to the client
        self.wfile.write(self.data.upper())
<
The difference is that the ``readline()`` call in the second handler will call
``recv()`` multiple times until it encounters a newline character, while the
single ``recv()`` call in the first handler will just return what has been sent
from the client in one ``send()`` call.
This is the client side:: >
import socket
import sys
HOST, PORT = "localhost", 9999
data = " ".join(sys.argv[1:])
# Create a socket (SOCK_STREAM means a TCP socket)
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Connect to server and send data
sock.connect((HOST, PORT))
sock.send(data + "\n")
# Receive data from the server and shut down
received = sock.recv(1024)
sock.close()
print "Sent: %s" % data
print "Received: %s" % received
<
The output of the example should look something like this:
Server:: >
$ python TCPServer.py
127.0.0.1 wrote:
hello world with TCP
127.0.0.1 wrote:
python is nice
<
Client:: >
$ python TCPClient.py hello world with TCP
Sent: hello world with TCP
Received: HELLO WORLD WITH TCP
$ python TCPClient.py python is nice
Sent: python is nice
Received: PYTHON IS NICE
<
SocketServer.UDPServer Example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is the server side:: >
import SocketServer
class MyUDPHandler(SocketServer.BaseRequestHandler):
    """
    This class works similarly to the TCP handler class, except that
    self.request consists of a pair of data and client socket, and since
    there is no connection the client address must be given explicitly
    when sending data back via sendto().
    """
    def handle(self):
        data = self.request[0].strip()
        socket = self.request[1]
        print "%s wrote:" % self.client_address[0]
        print data
        socket.sendto(data.upper(), self.client_address)
if __name__ == "__main__":
    HOST, PORT = "localhost", 9999
    server = SocketServer.UDPServer((HOST, PORT), MyUDPHandler)
    server.serve_forever()
<
This is the client side:: >
import socket
import sys
HOST, PORT = "localhost", 9999
data = " ".join(sys.argv[1:])
# SOCK_DGRAM is the socket type to use for UDP sockets
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# As you can see, there is no connect() call; UDP has no connections.
# Instead, data is directly sent to the recipient via sendto().
sock.sendto(data + "\n", (HOST, PORT))
received = sock.recv(1024)
print "Sent: %s" % data
print "Received: %s" % received
<
The output of the example should look exactly like that of the TCP server example.
Asynchronous Mixins
~~~~~~~~~~~~~~~~~~~
To build asynchronous handlers, use the ThreadingMixIn and
ForkingMixIn classes.
An example for the ThreadingMixIn class:: >
import socket
import threading
import SocketServer
class ThreadedTCPRequestHandler(SocketServer.BaseRequestHandler):
    def handle(self):
        data = self.request.recv(1024)
        cur_thread = threading.currentThread()
        response = "%s: %s" % (cur_thread.getName(), data)
        self.request.send(response)
class ThreadedTCPServer(SocketServer.ThreadingMixIn, SocketServer.TCPServer):
    pass
def client(ip, port, message):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((ip, port))
    sock.send(message)
    response = sock.recv(1024)
    print "Received: %s" % response
    sock.close()
if __name__ == "__main__":
    # Port 0 means to select an arbitrary unused port
    HOST, PORT = "localhost", 0
    server = ThreadedTCPServer((HOST, PORT), ThreadedTCPRequestHandler)
    ip, port = server.server_address
    # Start a thread with the server -- that thread will then start one
    # more thread for each request
    server_thread = threading.Thread(target=server.serve_forever)
    # Exit the server thread when the main thread terminates
    server_thread.setDaemon(True)
    server_thread.start()
    print "Server loop running in thread:", server_thread.getName()
    client(ip, port, "Hello World 1")
    client(ip, port, "Hello World 2")
    client(ip, port, "Hello World 3")
    server.shutdown()
<
The output of the example should look something like this:: >
$ python ThreadedTCPServer.py
Server loop running in thread: Thread-1
Received: Thread-2: Hello World 1
Received: Thread-3: Hello World 2
Received: Thread-4: Hello World 3
<
The ForkingMixIn class is used in the same way, except that the server
will spawn a new process for each request.
==============================================================================
*py2stdlib-spwd*
spwd~
:platform: Unix
:synopsis: The shadow password database (getspnam() and friends).
.. versionadded:: 2.5
This module provides access to the Unix shadow password database. It is
available on various Unix versions.
You must have enough privileges to access the shadow password database (this
usually means you have to be root).
Shadow password database entries are reported as a tuple-like object, whose
attributes correspond to the members of the ``spwd`` structure (Attribute field
below, see ``<shadow.h>``):
+-------+---------------+---------------------------------+
| Index | Attribute | Meaning |
+=======+===============+=================================+
| 0 | ``sp_nam`` | Login name |
+-------+---------------+---------------------------------+
| 1 | ``sp_pwd`` | Encrypted password |
+-------+---------------+---------------------------------+
| 2 | ``sp_lstchg`` | Date of last change |
+-------+---------------+---------------------------------+
| 3 | ``sp_min`` | Minimal number of days between |
| | | changes |
+-------+---------------+---------------------------------+
| 4 | ``sp_max`` | Maximum number of days between |
| | | changes |
+-------+---------------+---------------------------------+
| 5 | ``sp_warn`` | Number of days before password |
| | | expires to warn user about it |
+-------+---------------+---------------------------------+
| 6 | ``sp_inact`` | Number of days after password |
| | | expires until account is |
| | | blocked |
+-------+---------------+---------------------------------+
| 7 | ``sp_expire`` | Number of days since 1970-01-01 |
| | | until account is disabled |
+-------+---------------+---------------------------------+
| 8 | ``sp_flag`` | Reserved |
+-------+---------------+---------------------------------+
The sp_nam and sp_pwd items are strings, all others are integers.
KeyError is raised if the entry asked for cannot be found.
It defines the following items:
getspnam(name)~
Return the shadow password database entry for the given user name.
getspall()~
Return a list of all available shadow password database entries, in arbitrary
order.
.. seealso::
Module grp (|py2stdlib-grp|)
An interface to the group database, similar to this.
Module pwd (|py2stdlib-pwd|)
An interface to the normal password database, similar to this.
==============================================================================
*py2stdlib-sqlite3*
sqlite3~
:synopsis: A DB-API 2.0 implementation using SQLite 3.x.
.. versionadded:: 2.5
SQLite is a C library that provides a lightweight disk-based database that
doesn't require a separate server process and allows accessing the database
using a nonstandard variant of the SQL query language. Some applications can use
SQLite for internal data storage. It's also possible to prototype an
application using SQLite and then port the code to a larger database such as
PostgreSQL or Oracle.
sqlite3 was written by Gerhard Häring and provides a SQL interface compliant
with the DB-API 2.0 specification described by PEP 249.
To use the module, you must first create a Connection object that
represents the database. Here the data will be stored in the
/tmp/example file:: >
conn = sqlite3.connect('/tmp/example')
<
You can also supply the special name ``:memory:`` to create a database in RAM.
Once you have a Connection, you can create a Cursor object
and call its Cursor.execute method to perform SQL commands:: >
c = conn.cursor()
# Create table
c.execute('''create table stocks
(date text, trans text, symbol text,
qty real, price real)''')
# Insert a row of data
c.execute("""insert into stocks
values ('2006-01-05','BUY','RHAT',100,35.14)""")
# Save (commit) the changes
conn.commit()
# We can also close the cursor if we are done with it
c.close()
<
Usually your SQL operations will need to use values from Python variables. You
shouldn't assemble your query using Python's string operations because doing so
is insecure; it makes your program vulnerable to an SQL injection attack.
Instead, use the DB-API's parameter substitution. Put ``?`` as a placeholder
wherever you want to use a value, and then provide a tuple of values as the
second argument to the cursor's Cursor.execute method. (Other database
modules may use a different placeholder, such as ``%s`` or ``:1``.) For
example:: >
# Never do this -- insecure!
symbol = 'IBM'
c.execute("... where symbol = '%s'" % symbol)
# Do this instead
t = (symbol,)
c.execute('select * from stocks where symbol=?', t)
# Larger example
for t in [('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
          ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
          ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
         ]:
    c.execute('insert into stocks values (?,?,?,?,?)', t)
<
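To make the difference concrete, here is a small self-contained sketch that
runs both variants against an in-memory database. The hostile input string and
the variable names are assumptions made for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("create table stocks "
          "(date text, trans text, symbol text, qty real, price real)")
c.execute("insert into stocks values (?,?,?,?,?)",
          ('2006-01-05', 'BUY', 'RHAT', 100, 35.14))

# Input that closes the quoted string and adds an always-true condition
hostile = "x' or '1'='1"

# Insecure: the input becomes part of the SQL text itself
c.execute("select count(*) from stocks where symbol = '%s'" % hostile)
unsafe_count = c.fetchone()[0]

# Safe: the input is passed as a value and only compared with the column
c.execute("select count(*) from stocks where symbol = ?", (hostile,))
safe_count = c.fetchone()[0]

conn.close()
print("%d %d" % (unsafe_count, safe_count))
```

The string-formatted query matches every row in the table, while the
parameterized query matches none, because no symbol equals the hostile string.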
To retrieve data after executing a SELECT statement, you can either treat the
cursor as an iterator, call the cursor's Cursor.fetchone method to
retrieve a single matching row, or call Cursor.fetchall to get a list of the
matching rows.
This example uses the iterator form:: >
>>> c = conn.cursor()
>>> c.execute('select * from stocks order by price')
>>> for row in c:
...     print row
...
(u'2006-01-05', u'BUY', u'RHAT', 100, 35.14)
(u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)
(u'2006-04-06', u'SELL', u'IBM', 500, 53.0)
(u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0)
>>>
<
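The other two retrieval styles can be sketched in the same way. The reduced
two-column table is an assumption made to keep the example short:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("create table stocks (symbol text, price real)")
c.executemany("insert into stocks values (?,?)",
              [("RHAT", 35.14), ("IBM", 45.0), ("MSOFT", 72.0)])

c.execute("select symbol, price from stocks order by price")
first = c.fetchone()     # a single row
rest = c.fetchall()      # a list with all remaining rows
nothing = c.fetchone()   # None once the result set is exhausted

conn.close()
print(first)
print(rest)
```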
.. seealso::
http://code.google.com/p/pysqlite/
The pysqlite web page -- sqlite3 is developed externally under the name
"pysqlite".
http://www.sqlite.org
The SQLite web page; the documentation describes the syntax and the
available data types for the supported SQL dialect.
PEP 249 - Database API Specification 2.0
PEP written by Marc-André Lemburg.
Module functions and constants
------------------------------
PARSE_DECLTYPES~
This constant is meant to be used with the {detect_types} parameter of the
connect function.
Setting it makes the sqlite3 (|py2stdlib-sqlite3|) module parse the declared type for each
column it returns. It will parse out the first word of the declared type,
i. e. for "integer primary key", it will parse out "integer", or for
"number(10)" it will parse out "number". Then for that column, it will look
into the converters dictionary and use the converter function registered for
that type there.
PARSE_COLNAMES~
This constant is meant to be used with the {detect_types} parameter of the
connect function.
Setting this makes the SQLite interface parse the column name for each column it
returns. It will look for a string formed [mytype] in there, and then decide
that 'mytype' is the type of the column. It will try to find an entry of
'mytype' in the converters dictionary and then use the converter function found
there to return the value. The column name found in Cursor.description
is only the first word of the column name, i. e. if you use something like
``'as "x [datetime]"'`` in your SQL, then we will parse out everything until the
first blank for the column name: the column name would simply be "x".
connect(database[, timeout, isolation_level, detect_types, factory])~
Opens a connection to the SQLite database file {database}. You can use
``":memory:"`` to open a database connection to a database that resides in RAM
instead of on disk.
When a database is accessed by multiple connections, and one of the processes
modifies the database, the SQLite database is locked until that transaction is
committed. The {timeout} parameter specifies how long the connection should wait
for the lock to go away until raising an exception. The default for the timeout
parameter is 5.0 (five seconds).
For the {isolation_level} parameter, please see the
Connection.isolation_level property of Connection objects.
SQLite natively supports only the types TEXT, INTEGER, REAL, BLOB and NULL. If
you want to use other types you must add support for them yourself. The
{detect_types} parameter and custom {converters} registered with the
module-level register_converter function allow you to easily do that.
{detect_types} defaults to 0 (i.e. off, no type detection); you can set it to
any combination of PARSE_DECLTYPES and PARSE_COLNAMES to turn
type detection on.
By default, the sqlite3 (|py2stdlib-sqlite3|) module uses its Connection class for the
connect call. You can, however, subclass the Connection class and make
connect use your class instead by providing your class for the {factory}
parameter.
Consult the section sqlite3-types of this manual for details.
The sqlite3 (|py2stdlib-sqlite3|) module internally uses a statement cache to avoid SQL parsing
overhead. If you want to explicitly set the number of statements that are cached
for the connection, you can set the {cached_statements} parameter. The currently
implemented default is to cache 100 statements.
register_converter(typename, callable)~
Registers a callable to convert a bytestring from the database into a custom
Python type. The callable will be invoked for all database values that are of
the type {typename}. See the {detect_types} parameter of the connect
function for how the type detection works. Note that the case of {typename} and
the name of the type in your query must match!
register_adapter(type, callable)~
Registers a callable to convert the custom Python type {type} into one of
SQLite's supported types. The callable {callable} accepts the Python value as
its single parameter and must return a value of one of the following types:
int, long, float, str (UTF-8 encoded), unicode or buffer.
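The two registration functions are typically used as a pair. The following
sketch round-trips a hypothetical Point class through a column declared as
"point"; the class, the "x;y" text encoding and the function names are
assumptions made for illustration:

```python
import sqlite3


class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y


def adapt_point(point):
    # Python value -> a type SQLite can store (here: text "x;y")
    return "%f;%f" % (point.x, point.y)


def convert_point(raw):
    # Bytestring from the database -> Python value
    x, y = raw.decode("ascii").split(";")
    return Point(float(x), float(y))


sqlite3.register_adapter(Point, adapt_point)
sqlite3.register_converter("point", convert_point)

# PARSE_DECLTYPES selects the converter from the declared column type
conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
cur = conn.cursor()
cur.execute("create table test(p point)")
cur.execute("insert into test(p) values (?)", (Point(4.0, -3.2),))
cur.execute("select p from test")
restored = cur.fetchone()[0]
conn.close()
print("%f %f" % (restored.x, restored.y))
```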
complete_statement(sql)~
Returns True if the string {sql} contains one or more complete SQL
statements terminated by semicolons. It does not verify that the SQL is
syntactically correct, only that there are no unclosed string literals and the
statement is terminated by a semicolon.
This can be used to build a shell for SQLite.
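A minimal sketch of such a shell loop follows. To keep it runnable without a
terminal, the lines come from a fixed list instead of an interactive prompt;
the list typed_lines and the simple SELECT check are assumptions of this
example:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Stand-in for lines a user would type at a prompt
typed_lines = ["create table t(x);",
               "insert into t",
               "values (42);",
               "select x from t;"]

buffer = ""
results = []
for line in typed_lines:
    buffer += line + "\n"
    # Keep collecting input until the buffer holds a complete statement
    if sqlite3.complete_statement(buffer):
        try:
            cur.execute(buffer.strip())
            if buffer.lstrip().upper().startswith("SELECT"):
                results.append(cur.fetchall())
        except sqlite3.Error as e:
            print("An error occurred: %s" % e.args[0])
        buffer = ""

con.close()
print(results)
```

The insert statement is spread over two input lines and is only executed once
the terminating semicolon has arrived.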
enable_callback_tracebacks(flag)~
By default you will not get any tracebacks in user-defined functions,
aggregates, converters, authorizer callbacks etc. If you want to debug them, you
can call this function with {flag} as True. Afterwards, you will get tracebacks
from callbacks on ``sys.stderr``. Use False to disable the feature
again.
Connection Objects
------------------
Connection~
A SQLite database connection has the following attributes and methods:
Connection.isolation_level~
Get or set the current isolation level. None for autocommit mode or
one of "DEFERRED", "IMMEDIATE" or "EXCLUSIVE". See section
sqlite3-controlling-transactions for a more detailed explanation.
Connection.cursor([cursorClass])~
The cursor method accepts a single optional parameter {cursorClass}. If
supplied, this must be a custom cursor class that extends
sqlite3.Cursor.
Connection.commit()~
This method commits the current transaction. If you don't call this method,
anything you did since the last call to ``commit()`` is not visible from
other database connections. If you wonder why you don't see the data you've
written to the database, please check you didn't forget to call this method.
Connection.rollback()~
This method rolls back any changes to the database since the last call to
commit.
Connection.close()~
This closes the database connection. Note that this does not automatically
call commit. If you just close your database connection without
calling commit first, your changes will be lost!
Connection.execute(sql, [parameters])~
This is a nonstandard shortcut that creates an intermediate cursor object by
calling the cursor method, then calls the cursor's
Cursor.execute method with the parameters given.
Connection.executemany(sql, [parameters])~
This is a nonstandard shortcut that creates an intermediate cursor object by
calling the cursor method, then calls the cursor's
Cursor.executemany method with the parameters given.
Connection.executescript(sql_script)~
This is a nonstandard shortcut that creates an intermediate cursor object by
calling the cursor method, then calls the cursor's
Cursor.executescript method with the parameters given.
Connection.create_function(name, num_params, func)~
Creates a user-defined function that you can later use from within SQL
statements under the function name {name}. {num_params} is the number of
parameters the function accepts, and {func} is a Python callable that is called
as the SQL function.
The function can return any of the types supported by SQLite: unicode, str, int,
long, float, buffer and None.
Example:
.. literalinclude:: ../includes/sqlite3/md5func.py
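A minimal sketch of registering a Python callable as an SQL function, assuming
an in-memory database. The helper name md5sum is chosen for illustration, and
the text argument is encoded as UTF-8 before hashing.

```python
import hashlib
import sqlite3

def md5sum(text):
    # Hash the UTF-8 encoding of the text and return the hex digest.
    return hashlib.md5(text.encode("utf-8")).hexdigest()

con = sqlite3.connect(":memory:")
# Register md5sum under the SQL name "md5"; it takes one parameter.
con.create_function("md5", 1, md5sum)
digest = con.execute("select md5(?)", (u"foo",)).fetchone()[0]
```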
Connection.create_aggregate(name, num_params, aggregate_class)~
Creates a user-defined aggregate function.
The aggregate class must implement a ``step`` method, which accepts the number
of parameters {num_params}, and a ``finalize`` method which will return the
final result of the aggregate.
The ``finalize`` method can return any of the types supported by SQLite:
unicode, str, int, long, float, buffer and None.
Example:
.. literalinclude:: ../includes/sqlite3/mysumaggr.py
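A minimal sketch of an aggregate class with the ``step`` and ``finalize``
methods described above. The class name MySum and the table test are
illustrative assumptions.

```python
import sqlite3

class MySum(object):
    def __init__(self):
        self.total = 0

    def step(self, value):
        # Called once per row with num_params (here: 1) arguments.
        self.total += value

    def finalize(self):
        # Called once at the end; returns the aggregate result.
        return self.total

con = sqlite3.connect(":memory:")
con.create_aggregate("mysum", 1, MySum)
con.execute("create table test (i integer)")
con.executemany("insert into test (i) values (?)", [(1,), (2,), (3,)])
total = con.execute("select mysum(i) from test").fetchone()[0]
```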
Connection.create_collation(name, callable)~
Creates a collation with the specified {name} and {callable}. The callable will
be passed two string arguments. It should return -1 if the first is ordered
lower than the second, 0 if they are ordered equal and 1 if the first is ordered
higher than the second. Note that this controls sorting (ORDER BY in SQL) so
your comparisons don't affect other SQL operations.
Note that the callable will get its parameters as Python bytestrings, which will
normally be encoded in UTF-8.
The following example shows a custom collation that sorts "the wrong way":
.. literalinclude:: ../includes/sqlite3/collation_reverse.py
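A collation that sorts "the wrong way" might look like the sketch below. The
function name collate_reverse and the table test are assumptions made for the
example.

```python
import sqlite3

def collate_reverse(a, b):
    # Invert the usual ordering: 1 if a < b, -1 if a > b, 0 if equal.
    return (a < b) - (a > b)

con = sqlite3.connect(":memory:")
con.create_collation("reverse", collate_reverse)
con.execute("create table test (x text)")
con.executemany("insert into test (x) values (?)", [("a",), ("b",), ("c",)])
cur = con.execute("select x from test order by x collate reverse")
ordered = [row[0] for row in cur]
```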
To remove a collation, call ``create_collation`` with None as callable:: >
con.create_collation("reverse", None)
<
Connection.interrupt()~
You can call this method from a different thread to abort any queries that might
be executing on the connection. The query will then abort and the caller will
get an exception.
Connection.set_authorizer(authorizer_callback)~
This routine registers a callback. The callback is invoked for each attempt to
access a column of a table in the database. The callback should return
SQLITE_OK if access is allowed, SQLITE_DENY if the entire SQL
statement should be aborted with an error and SQLITE_IGNORE if the
column should be treated as a NULL value. These constants are available in the
sqlite3 (|py2stdlib-sqlite3|) module.
The first argument to the callback signifies what kind of operation is to be
authorized. The second and third arguments will be arguments or None
depending on the first argument. The 4th argument is the name of the database
("main", "temp", etc.) if applicable. The 5th argument is the name of the
inner-most trigger or view that is responsible for the access attempt or
None if this access attempt is directly from input SQL code.
Please consult the SQLite documentation about the possible values for the first
argument and the meaning of the second and third argument depending on the first
one. All necessary constants are available in the sqlite3 (|py2stdlib-sqlite3|) module.
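As a hedged sketch of an authorizer callback, the example below hides one
column by returning SQLITE_IGNORE for read access to it. The table users, its
columns, and the callback name are illustrative assumptions; the meaning of
the second and third arguments for SQLITE_READ (table name and column name)
comes from the SQLite documentation.

```python
import sqlite3

def hide_password(action, arg1, arg2, db_name, source):
    # For SQLITE_READ, arg1 is the table name and arg2 the column name.
    if action == sqlite3.SQLITE_READ and arg1 == "users" and arg2 == "password":
        return sqlite3.SQLITE_IGNORE  # column is read as NULL
    return sqlite3.SQLITE_OK

con = sqlite3.connect(":memory:")
con.execute("create table users (name text, password text)")
con.execute("insert into users values ('alice', 'secret')")
con.set_authorizer(hide_password)
row = con.execute("select name, password from users").fetchone()
```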
Connection.set_progress_handler(handler, n)~
.. versionadded:: 2.6
This routine registers a callback. The callback is invoked for every {n}
instructions of the SQLite virtual machine. This is useful if you want to
get called from SQLite during long-running operations, for example to update
a GUI.
If you want to clear any previously installed progress handler, call the
method with None for {handler}.
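A small sketch of installing and clearing a progress handler. The handler here
only counts its invocations; the names on_progress and calls are made up for
the example, and the handler returns nothing.

```python
import sqlite3

calls = []

def on_progress():
    # Invoked by SQLite every n virtual machine instructions.
    calls.append(1)

con = sqlite3.connect(":memory:")
con.set_progress_handler(on_progress, 1)
con.execute("create table t (x integer)")
con.executemany("insert into t values (?)", [(i,) for i in range(100)])
con.execute("select sum(x) from t").fetchone()
calls_while_installed = len(calls)

# Clear the handler; later statements no longer invoke it.
con.set_progress_handler(None, 1)
con.execute("select sum(x) from t").fetchone()
calls_after_clearing = len(calls)
```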
Connection.enable_load_extension(enabled)~
.. versionadded:: 2.7
This routine allows/disallows the SQLite engine to load SQLite extensions
from shared libraries. SQLite extensions can define new functions,
aggregates or whole new virtual table implementations. One well-known
extension is the fulltext-search extension distributed with SQLite.
.. literalinclude:: ../includes/sqlite3/load_extension.py
Connection.load_extension(path)~
.. versionadded:: 2.7
This routine loads a SQLite extension from a shared library. You have to
enable extension loading with ``enable_load_extension`` before you can use
this routine.
Connection.row_factory~
You can change this attribute to a callable that accepts the cursor and the
original row as a tuple and will return the real result row. This way, you can
implement more advanced ways of returning results, such as returning an object
that can also access columns by name.
Example:
.. literalinclude:: ../includes/sqlite3/row_factory.py
If returning a tuple doesn't suffice and you want name-based access to
columns, you should consider setting row_factory to the
highly-optimized sqlite3.Row type. Row provides both
index-based and case-insensitive name-based access to columns with almost no
memory overhead. It will probably be better than your own custom
dictionary-based approach or even a db_row based solution.
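A minimal sketch of a custom row factory that returns each row as a
dictionary keyed by column name. The function name dict_factory is an
illustrative choice.

```python
import sqlite3

def dict_factory(cursor, row):
    # cursor.description holds one 7-tuple per column; item 0 is the name.
    return dict((col[0], row[idx]) for idx, col in enumerate(cursor.description))

con = sqlite3.connect(":memory:")
con.row_factory = dict_factory
row = con.execute("select 1 as a, 2 as b").fetchone()
```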
Connection.text_factory~
Using this attribute you can control what objects are returned for the ``TEXT``
data type. By default, this attribute is set to unicode and the
sqlite3 (|py2stdlib-sqlite3|) module will return Unicode objects for ``TEXT``. If you want to
return bytestrings instead, you can set it to str.
For efficiency reasons, there's also a way to return Unicode objects only for
non-ASCII data, and bytestrings otherwise. To activate it, set this attribute to
sqlite3.OptimizedUnicode.
You can also set it to any other callable that accepts a single bytestring
parameter and returns the resulting object.
See the following example code for illustration:
.. literalinclude:: ../includes/sqlite3/text_factory.py
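The "any other callable" case can be sketched as follows. The lambda is an
arbitrary example: it decodes the bytestring as UTF-8 and upper-cases it, just
to make the effect of the factory visible.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# The callable receives the raw bytestring stored for a TEXT value.
con.text_factory = lambda raw: raw.decode("utf-8").upper()
value = con.execute("select ?", (u"abc",)).fetchone()[0]
```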
Connection.total_changes~
Returns the total number of database rows that have been modified, inserted, or
deleted since the database connection was opened.
Connection.iterdump~
Returns an iterator to dump the database in an SQL text format. Useful when
saving an in-memory database for later restoration. This function provides
the same capabilities as the .dump command in the sqlite3 command-line
shell.
.. versionadded:: 2.6
Example:: >
    # Convert file existing_db.db to SQL dump file dump.sql
    import sqlite3
    con = sqlite3.connect('existing_db.db')
    with open('dump.sql', 'w') as f:
        for line in con.iterdump():
            f.write('%s\n' % line)
<
Cursor Objects
--------------
A Cursor instance has the following attributes and methods:
Cursor.execute(sql, [parameters])~
Executes an SQL statement. The SQL statement may be parametrized (i. e.
placeholders instead of SQL literals). The sqlite3 (|py2stdlib-sqlite3|) module supports two
kinds of placeholders: question marks (qmark style) and named placeholders
(named style).
This example shows how to use parameters with qmark style:
.. literalinclude:: ../includes/sqlite3/execute_1.py
This example shows how to use the named style:
.. literalinclude:: ../includes/sqlite3/execute_2.py
execute will only execute a single SQL statement. If you try to execute
more than one statement with it, it will raise a Warning. Use
executescript if you want to execute multiple SQL statements with one
call.
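Both placeholder styles can be sketched together as below, assuming an
in-memory database; the table people and its columns are illustrative.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table people (name_last text, age integer)")

# qmark style: values are given as a sequence, matched by position.
cur.execute("insert into people values (?, ?)", ("Yeltsin", 72))

# named style: values are given as a mapping, matched by key.
cur.execute("select name_last, age from people "
            "where name_last = :who and age = :age",
            {"who": "Yeltsin", "age": 72})
row = cur.fetchone()
```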
Cursor.executemany(sql, seq_of_parameters)~
Executes an SQL command against all parameter sequences or mappings found in
the sequence {seq_of_parameters}. The sqlite3 (|py2stdlib-sqlite3|) module also allows using an
iterator yielding parameters instead of a sequence.
.. literalinclude:: ../includes/sqlite3/executemany_1.py
Here's a shorter example using a generator:
.. literalinclude:: ../includes/sqlite3/executemany_2.py
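A short sketch of executemany fed by a generator instead of a sequence. The
generator name char_generator and the table characters are assumptions made
for the example.

```python
import sqlite3

def char_generator():
    # Yield one parameter tuple per row to insert.
    for c in "abc":
        yield (c,)

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table characters (c text)")
cur.executemany("insert into characters (c) values (?)", char_generator())
inserted = cur.rowcount  # modifications are summed over all parameter sets

cur.execute("select c from characters order by c")
chars = [row[0] for row in cur.fetchall()]
```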
Cursor.executescript(sql_script)~
This is a nonstandard convenience method for executing multiple SQL statements
at once. It issues a ``COMMIT`` statement first, then executes the SQL script it
gets as a parameter.
{sql_script} can be a bytestring or a Unicode string.
Example:
.. literalinclude:: ../includes/sqlite3/executescript.py
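A minimal sketch of running several statements in one call; the tables and
values are invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    create table person (firstname text, lastname text);
    create table book (title text, author text);
    insert into book (title, author) values ('Example Title', 'Some Author');
""")
tables = [row[0] for row in cur.execute(
    "select name from sqlite_master where type = 'table' order by name")]
book_count = cur.execute("select count(*) from book").fetchone()[0]
```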
Cursor.fetchone()~
Fetches the next row of a query result set, returning a single sequence,
or None when no more data is available.
Cursor.fetchmany([size=cursor.arraysize])~
Fetches the next set of rows of a query result, returning a list. An empty
list is returned when no more rows are available.
The number of rows to fetch per call is specified by the {size} parameter.
If it is not given, the cursor's arraysize determines the number of rows
to be fetched. The method should try to fetch as many rows as indicated by
the size parameter. If this is not possible due to the specified number of
rows not being available, fewer rows may be returned.
Note there are performance considerations involved with the {size} parameter.
For optimal performance, it is usually best to use the arraysize attribute.
If the {size} parameter is used, then it is best for it to retain the same
value from one fetchmany call to the next.
Cursor.fetchall()~
Fetches all (remaining) rows of a query result, returning a list. Note that
the cursor's arraysize attribute can affect the performance of this operation.
An empty list is returned when no rows are available.
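The three fetch methods can be compared on one result set, as in this sketch.
The table t with five integer rows is an assumption for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table t (x integer)")
cur.executemany("insert into t values (?)", [(i,) for i in range(5)])

cur.execute("select x from t order by x")
first = cur.fetchone()       # next single row
batch = cur.fetchmany(2)     # next two rows, as a list
rest = cur.fetchall()        # all remaining rows, as a list

# Once the result set is exhausted:
exhausted_one = cur.fetchone()
exhausted_many = cur.fetchmany(2)
exhausted_all = cur.fetchall()
```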
Cursor.rowcount~
Although the Cursor class of the sqlite3 (|py2stdlib-sqlite3|) module implements this
attribute, the database engine's own support for the determination of "rows
affected"/"rows selected" is quirky.
For ``DELETE`` statements, SQLite reports rowcount as 0 if you make a
``DELETE FROM table`` without any condition.
For executemany statements, the number of modifications is summed up
into rowcount.
As required by the Python DB API Spec, the rowcount attribute "is -1 in
case no ``executeXX()`` has been performed on the cursor or the rowcount of the
last operation is not determinable by the interface".
This includes ``SELECT`` statements because we cannot determine the number of
rows a query produced until all rows were fetched.
Cursor.lastrowid~
This read-only attribute provides the rowid of the last modified row. It is
only set if you issued an ``INSERT`` statement using the execute
method. For operations other than ``INSERT`` or when executemany is
called, lastrowid is set to None.
Cursor.description~
This read-only attribute provides the column names of the last query. To
remain compatible with the Python DB API, it returns a 7-tuple for each
column where the last six items of each tuple are None.
It is set for ``SELECT`` statements without any matching rows as well.
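The three attributes above can be inspected as in the following sketch; the
table t and its columns are illustrative assumptions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table t (id integer primary key, name text)")

cur.execute("insert into t (name) values (?)", ("a",))
insert_rowid = cur.lastrowid     # rowid of the row just inserted
insert_rowcount = cur.rowcount   # one row was modified

# A SELECT that matches no rows still sets description; rowcount is -1.
cur.execute("select id, name from t where id = 999")
select_rowcount = cur.rowcount
column_names = [d[0] for d in cur.description]
padding = cur.description[0][1:]  # the last six items of the 7-tuple
```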
Row Objects
-----------
Row~
A Row instance serves as a highly optimized
Connection.row_factory for Connection objects.
It tries to mimic a tuple in most of its features.
It supports mapping access by column name and index, iteration,
representation, equality testing and len.
If two Row objects have exactly the same columns and their
members are equal, they compare equal.
.. versionchanged:: 2.6
Added iteration and equality (hashability).
keys~
This method returns a list of column names. Immediately after a query,
it is the first member of each tuple in Cursor.description.
.. versionadded:: 2.6
Let's assume we initialize a table as in the example given above:: >
    import sqlite3
    conn = sqlite3.connect(":memory:")
    c = conn.cursor()
    c.execute('''create table stocks
                 (date text, trans text, symbol text,
                  qty real, price real)''')
    c.execute("""insert into stocks
                 values ('2006-01-05','BUY','RHAT',100,35.14)""")
    conn.commit()
    c.close()
<
Now we plug Row in:: >
    >>> conn.row_factory = sqlite3.Row
    >>> c = conn.cursor()
    >>> c.execute('select * from stocks')
    <sqlite3.Cursor object at 0x7f4e7dd8fa80>
    >>> r = c.fetchone()
    >>> type(r)
    <type 'sqlite3.Row'>
    >>> r
    (u'2006-01-05', u'BUY', u'RHAT', 100.0, 35.14)
    >>> len(r)
    5
    >>> r[2]
    u'RHAT'
    >>> r.keys()
    ['date', 'trans', 'symbol', 'qty', 'price']
    >>> r['qty']
    100.0
    >>> for member in r: print member
    ...
    2006-01-05
    BUY
    RHAT
    100.0
    35.14
<
SQLite and Python types
-----------------------
Introduction
^^^^^^^^^^^^
SQLite natively supports the following types: ``NULL``, ``INTEGER``,
``REAL``, ``TEXT``, ``BLOB``.
The following Python types can thus be sent to SQLite without any problem:
+-----------------------------+-------------+
| Python type | SQLite type |
+=============================+=============+
| None | ``NULL`` |
+-----------------------------+-------------+
| int | ``INTEGER`` |
+-----------------------------+-------------+
| long | ``INTEGER`` |
+-----------------------------+-------------+
| float | ``REAL`` |
+-----------------------------+-------------+
| str (UTF8-encoded) | ``TEXT`` |
+-----------------------------+-------------+
| unicode | ``TEXT`` |
+-----------------------------+-------------+
| buffer | ``BLOB`` |
+-----------------------------+-------------+
This is how SQLite types are converted to Python types by default:
+-------------+----------------------------------------------+
| SQLite type | Python type |
+=============+==============================================+
| ``NULL`` | None |
+-------------+----------------------------------------------+
| ``INTEGER`` | int or long, |
| | depending on size |
+-------------+----------------------------------------------+
| ``REAL`` | float |
+-------------+----------------------------------------------+
| ``TEXT`` | depends on Connection.text_factory, |
| | unicode by default |
+-------------+----------------------------------------------+
| ``BLOB`` | buffer |
+-------------+----------------------------------------------+
The type system of the sqlite3 (|py2stdlib-sqlite3|) module is extensible in two ways: you can
store additional Python types in a SQLite database via object adaptation, and
you can let the sqlite3 (|py2stdlib-sqlite3|) module convert SQLite types to different Python
types via converters.
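The default mapping in the first table can be checked with SQLite's
``typeof()`` function, as in this sketch. sqlite3.Binary is used here to
produce a ``BLOB`` value; the sample values are arbitrary.

```python
import sqlite3

con = sqlite3.connect(":memory:")
values = (None, 7, 1.5, u"text", sqlite3.Binary(b"\x00\x01"))

# Ask SQLite which storage class each bound Python value was given.
storage = con.execute(
    "select typeof(?), typeof(?), typeof(?), typeof(?), typeof(?)",
    values).fetchone()

# Values read back use the default SQLite-to-Python conversion.
back = con.execute("select ?, ?, ?, ?", values[:4]).fetchone()
```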
Using adapters to store additional Python types in SQLite databases
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
As described before, SQLite supports only a limited set of types natively. To
use other Python types with SQLite, you must adapt them to one of the
sqlite3 module's supported types for SQLite: one of NoneType, int, long, float,
str, unicode, buffer.
The sqlite3 (|py2stdlib-sqlite3|) module uses Python object adaptation, as described in
PEP 246, for this. The protocol to use is PrepareProtocol.
There are two ways to enable the sqlite3 (|py2stdlib-sqlite3|) module to adapt a custom Python
type to one of the supported ones.
Letting your object adapt itself
""""""""""""""""""""""""""""""""
This is a good approach if you write the class yourself. Let's suppose you have
a class like this:: >
    class Point(object):
        def __init__(self, x, y):
            self.x, self.y = x, y
<
Now you want to store the point in a single SQLite column. First you'll have to
choose one of the supported types to be used for representing the point.
Let's just use str and separate the coordinates using a semicolon. Then you need
to give your class a method ``__conform__(self, protocol)`` which must return
the converted value. The parameter {protocol} will be PrepareProtocol.
.. literalinclude:: ../includes/sqlite3/adapter_point_1.py
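A minimal sketch of the ``__conform__`` approach, using the semicolon-separated
string representation described above. The "%f;%f" format is one possible
choice, not the only one.

```python
import sqlite3

class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __conform__(self, protocol):
        # Return the SQLite-compatible representation when asked by sqlite3.
        if protocol is sqlite3.PrepareProtocol:
            return "%f;%f" % (self.x, self.y)

con = sqlite3.connect(":memory:")
stored = con.execute("select ?", (Point(4.0, -3.2),)).fetchone()[0]
```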
Registering an adapter callable
"""""""""""""""""""""""""""""""
The other possibility is to create a function that converts the type to the
string representation and register the function with register_adapter.
.. note::
The type/class to adapt must be a new-style class, i. e. it must have
object as one of its bases.
.. literalinclude:: ../includes/sqlite3/adapter_point_2.py
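The same conversion can be sketched with a registered adapter function
instead of a method on the class. The name adapt_point is an illustrative
assumption.

```python
import sqlite3

class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

def adapt_point(point):
    # Convert a Point to its string representation for storage.
    return "%f;%f" % (point.x, point.y)

# Tell sqlite3 to call adapt_point whenever a Point is bound as a parameter.
sqlite3.register_adapter(Point, adapt_point)

con = sqlite3.connect(":memory:")
stored = con.execute("select ?", (Point(1.0, 2.5),)).fetchone()[0]
```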
The sqlite3 (|py2stdlib-sqlite3|) module has two default adapters for Python's built-in
datetime.date and datetime.datetime types. Now let's suppose
we want to store datetime.datetime objects not in ISO representation,
but as a Unix timestamp.
.. literalinclude:: ../includes/sqlite3/adapter_datetime.py
Converting SQLite values to custom Python types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Writing an adapter lets you send custom Python types to SQLite. But to make it
really useful we need to make the Python to SQLite to Python roundtrip work.
Enter converters.
Let's go back to the Point class. We stored the x and y coordinates
separated via semicolons as strings in SQLite.
First, we'll define a converter function that accepts the string as a parameter
and constructs a Point object from it.
.. note::
Converter functions always get called with a string, no matter under which
data type you sent the value to SQLite.
:: >
    def convert_point(s):
        x, y = map(float, s.split(";"))
        return Point(x, y)
<
Now you need to make the sqlite3 (|py2stdlib-sqlite3|) module know that what you select from
the database is actually a point. There are two ways of doing this:
* Implicitly via the declared type
* Explicitly via the column name
Both ways are described in section sqlite3-module-contents, in the entries
for the constants PARSE_DECLTYPES and PARSE_COLNAMES.
The following example illustrates both approaches.
.. literalinclude:: ../includes/sqlite3/converter_point.py
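A sketch of the full round trip, covering both the declared-type and the
column-name approach. It assumes the converter receives a bytestring, so the
separator is written as ``b";"``; the table and function names are
illustrative.

```python
import sqlite3

class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

def adapt_point(point):
    return "%f;%f" % (point.x, point.y)

def convert_point(s):
    # s is the stored bytestring, e.g. "4.000000;-3.200000".
    x, y = map(float, s.split(b";"))
    return Point(x, y)

sqlite3.register_adapter(Point, adapt_point)
sqlite3.register_converter("point", convert_point)
p = Point(4.0, -3.2)

# 1) Implicitly via the declared column type.
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
con.execute("create table test (p point)")
con.execute("insert into test (p) values (?)", (p,))
by_decltype = con.execute("select p from test").fetchone()[0]

# 2) Explicitly via the column name, written as "name [type]".
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_COLNAMES)
con.execute("create table test (p)")
con.execute("insert into test (p) values (?)", (p,))
by_colname = con.execute('select p as "p [point]" from test').fetchone()[0]
```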
Default adapters and converters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
There are default adapters for the date and datetime types in the datetime
module. They will be sent as ISO dates/ISO timestamps to SQLite.
The default converters are registered under the name "date" for
datetime.date and under the name "timestamp" for
datetime.datetime.
This way, you can use date/timestamps from Python without any additional
fiddling in most cases. The format of the adapters is also compatible with the
experimental SQLite date/time functions.
The following example demonstrates this.
.. literalinclude:: ../includes/sqlite3/pysqlite_datetime.py
Controlling Transactions
------------------------
By default, the sqlite3 (|py2stdlib-sqlite3|) module opens transactions implicitly before a
Data Modification Language (DML) statement (i.e.
``INSERT``/``UPDATE``/``DELETE``/``REPLACE``), and commits transactions
implicitly before a non-DML, non-query statement (i. e.
anything other than ``SELECT`` or the aforementioned).
So if you are within a transaction and issue a command like ``CREATE TABLE
...``, ``VACUUM``, ``PRAGMA``, the sqlite3 (|py2stdlib-sqlite3|) module will commit implicitly
before executing that command. There are two reasons for doing that. The first
is that some of these commands don't work within transactions. The other reason
is that sqlite3 needs to keep track of the transaction state (if a transaction
is active or not).
You can control which kind of ``BEGIN`` statements sqlite3 implicitly executes
(or none at all) via the {isolation_level} parameter to the connect
call, or via the isolation_level property of connections.
If you want autocommit mode, then set isolation_level to None.
Otherwise leave it at its default, which will result in a plain "BEGIN"
statement, or set it to one of SQLite's supported isolation levels: "DEFERRED",
"IMMEDIATE" or "EXCLUSIVE".
Using sqlite3 (|py2stdlib-sqlite3|) efficiently
--------------------------------
Using shortcut methods
^^^^^^^^^^^^^^^^^^^^^^
Using the nonstandard execute, executemany and
executescript methods of the Connection object, your code can
be written more concisely because you don't have to create the (often
superfluous) Cursor objects explicitly. Instead, the Cursor
objects are created implicitly and these shortcut methods return the cursor
objects. This way, you can execute a ``SELECT`` statement and iterate over it
directly using only a single call on the Connection object.
.. literalinclude:: ../includes/sqlite3/shortcut_methods.py
Accessing columns by name instead of by index
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
One useful feature of the sqlite3 (|py2stdlib-sqlite3|) module is the built-in
sqlite3.Row class designed to be used as a row factory.
Rows wrapped with this class can be accessed both by index (like tuples) and
case-insensitively by name:
.. literalinclude:: ../includes/sqlite3/rowclass.py
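A brief sketch of index-based and case-insensitive name-based access with
sqlite3.Row; the column names and values are arbitrary.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.row_factory = sqlite3.Row
row = con.execute("select 'John' as name, 42 as age").fetchone()

by_index = row[0]          # like a tuple
by_name = row["name"]      # by column name
by_other_case = row["AGE"] # name lookup ignores case
```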
Using the connection as a context manager
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. versionadded:: 2.6
Connection objects can be used as context managers
that automatically commit or roll back transactions. In the event of an
exception, the transaction is rolled back; otherwise, the transaction is
committed:
.. literalinclude:: ../includes/sqlite3/ctx_manager.py
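A minimal sketch of both outcomes: a successful block is committed, and a
block that raises is rolled back. The table person and its unique constraint
are assumptions made for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table person "
            "(id integer primary key, firstname varchar unique)")

# Successful block: con.commit() is called automatically.
with con:
    con.execute("insert into person (firstname) values (?)", ("Joe",))

# Failing block: con.rollback() is called, then the exception propagates.
duplicate_rejected = False
try:
    with con:
        con.execute("insert into person (firstname) values (?)", ("Joe",))
except sqlite3.IntegrityError:
    duplicate_rejected = True

count = con.execute("select count(*) from person").fetchone()[0]
```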
==============================================================================
*py2stdlib-ssl*
ssl~
:synopsis: SSL wrapper for socket objects
.. versionadded:: 2.6
This module provides access to Transport Layer Security (often known as "Secure
Sockets Layer") encryption and peer authentication facilities for network
sockets, both client-side and server-side. This module uses the OpenSSL
library. It is available on all modern Unix systems, Windows, Mac OS X, and
probably additional platforms, as long as OpenSSL is installed on that platform.
.. note::
Some behavior may be platform dependent, since calls are made to the
operating system socket APIs. The installed version of OpenSSL may also
cause variations in behavior.
This section documents the objects and functions in the ``ssl`` module; for more
general information about TLS, SSL, and certificates, the reader is referred to
the documents in the "See Also" section at the bottom.
This module provides a class, ssl.SSLSocket, which is derived from the
socket.socket type, and provides a socket-like wrapper that also
encrypts and decrypts the data going over the socket with SSL. It supports
additional read and write methods, along with a method,
getpeercert, to retrieve the certificate of the other side of the
connection, and a method, cipher, to retrieve the cipher being used for
the secure connection.
Functions, Constants, and Exceptions
------------------------------------
SSLError~
Raised to signal an error from the underlying SSL implementation. This
signifies some problem in the higher-level encryption and authentication
layer that's superimposed on the underlying network connection. This error
is a subtype of socket.error, which in turn is a subtype of
IOError.
wrap_socket (sock, keyfile=None, certfile=None, server_side=False, cert_reqs=CERT_NONE, ssl_version={see docs}, ca_certs=None, do_handshake_on_connect=True, suppress_ragged_eofs=True, ciphers=None)~
Takes an instance ``sock`` of socket.socket, and returns an instance
of ssl.SSLSocket, a subtype of socket.socket, which wraps
the underlying socket in an SSL context. For client-side sockets, the
context construction is lazy; if the underlying socket isn't connected yet,
the context construction will be performed after connect is called on
the socket. For server-side sockets, if the socket has no remote peer, it is
assumed to be a listening socket, and the server-side SSL wrapping is
automatically performed on client connections accepted via the accept
method. wrap_socket may raise SSLError.
The ``keyfile`` and ``certfile`` parameters specify optional files which
contain a certificate to be used to identify the local side of the
connection. See the discussion of ssl-certificates for more
information on how the certificate is stored in the ``certfile``.
Often the private key is stored in the same file as the certificate; in this
case, only the ``certfile`` parameter need be passed. If the private key is
stored in a separate file, both parameters must be used. If the private key
is stored in the ``certfile``, it should come before the first certificate in
the certificate chain:: >
-----BEGIN RSA PRIVATE KEY-----
... (private key in base64 encoding) ...
-----END RSA PRIVATE KEY-----
-----BEGIN CERTIFICATE-----
... (certificate in base64 PEM encoding) ...
-----END CERTIFICATE-----
<
The parameter ``server_side`` is a boolean which identifies whether
server-side or client-side behavior is desired from this socket.
The parameter ``cert_reqs`` specifies whether a certificate is required from
the other side of the connection, and whether it will be validated if
provided. It must be one of the three values CERT_NONE
(certificates ignored), CERT_OPTIONAL (not required, but validated
if provided), or CERT_REQUIRED (required and validated). If the
value of this parameter is not CERT_NONE, then the ``ca_certs``
parameter must point to a file of CA certificates.
The ``ca_certs`` file contains a set of concatenated "certification
authority" certificates, which are used to validate certificates passed from
the other end of the connection. See the discussion of
ssl-certificates for more information about how to arrange the
certificates in this file.
The parameter ``ssl_version`` specifies which version of the SSL protocol to
use. Typically, the server chooses a particular protocol version, and the
client must adapt to the server's choice. Most of the versions are not
interoperable with the other versions. If not specified, for client-side
operation, the default SSL version is SSLv3; for server-side operation,
SSLv23. These version selections provide the most compatibility with other
versions.
Here's a table showing which versions in a client (down the side) can connect
to which versions in a server (along the top):
.. table:: >
    ======================== ========= ========= ========== =========
    client / server          SSLv2     SSLv3     SSLv23     TLSv1
    ------------------------ --------- --------- ---------- ---------
    SSLv2                    yes       no        yes        no
    SSLv3                    yes       yes       yes        no
    SSLv23                   yes       no        yes        no
    TLSv1                    no        no        yes        yes
    ======================== ========= ========= ========== =========
<
.. note::
Which connections succeed will vary depending on the version of
OpenSSL. For instance, in some older versions of OpenSSL (such
as 0.9.7l on OS X 10.4), an SSLv2 client could not connect to an
SSLv23 server. Another example: beginning with OpenSSL 1.0.0,
an SSLv23 client will not actually attempt SSLv2 connections
unless you explicitly enable SSLv2 ciphers; for example, you
might specify ``"ALL"`` or ``"SSLv2"`` as the {ciphers} parameter
to enable them.
The {ciphers} parameter sets the available ciphers for this SSL object.
It should be a string in the OpenSSL cipher list format
(http://www.openssl.org/docs/apps/ciphers.html#CIPHER_LIST_FORMAT).
The parameter ``do_handshake_on_connect`` specifies whether to do the SSL
handshake automatically after doing a socket.connect, or whether the
application program will call it explicitly, by invoking the
SSLSocket.do_handshake method. Calling
SSLSocket.do_handshake explicitly gives the program control over the
blocking behavior of the socket I/O involved in the handshake.
The parameter ``suppress_ragged_eofs`` specifies how the
SSLSocket.read method should signal unexpected EOF from the other end
of the connection. If specified as True (the default), it returns a
normal EOF in response to unexpected EOF errors raised from the underlying
socket; if False, it will raise the exceptions back to the caller.
.. versionchanged:: 2.7
New optional argument {ciphers}.
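As a rough sketch of client-side use, the helper below wraps a socket and
requires a validated server certificate. It targets the module-level
wrap_socket function as documented in this section for Python 2.7 (later
Python releases dropped that function). The helper name
open_verified_connection and its parameters are hypothetical, and it is only
defined here, not called, since it needs a reachable server and a CA file.

```python
import socket
import ssl

def open_verified_connection(host, port, ca_file):
    """Connect to (host, port) and require a certificate validated
    against the CA certificates in ca_file (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The socket is not connected yet, so the SSL setup and handshake
    # happen when connect() is called.
    tls_sock = ssl.wrap_socket(sock,
                               ca_certs=ca_file,
                               cert_reqs=ssl.CERT_REQUIRED,
                               ssl_version=ssl.PROTOCOL_TLSv1)
    tls_sock.connect((host, port))
    return tls_sock
```

A caller would typically inspect the result of getpeercert() on the returned
socket before sending data.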
RAND_status()~
Returns True if the SSL pseudo-random number generator has been seeded with
'enough' randomness, and False otherwise. You can use ssl.RAND_egd
and ssl.RAND_add to increase the randomness of the pseudo-random
number generator.
RAND_egd(path)~
If you are running an entropy-gathering daemon (EGD) somewhere, and ``path``
is the pathname of a socket connection open to it, this will read 256 bytes
of randomness from the socket, and add it to the SSL pseudo-random number
generator to increase the security of generated secret keys. This is
typically only necessary on systems without better sources of randomness.
See http://egd.sourceforge.net/ or http://prngd.sourceforge.net/ for sources
of entropy-gathering daemons.
RAND_add(bytes, entropy)~
Mixes the given ``bytes`` into the SSL pseudo-random number generator. The
parameter ``entropy`` (a float) is a lower bound on the entropy contained in
the string (so you can always use 0.0). See RFC 1750 for more
information on sources of entropy.
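A small sketch of mixing extra bytes into the generator and then checking its
status. Using os.urandom as the source and claiming 0.0 bits of entropy are
assumptions chosen to keep the example conservative.

```python
import os
import ssl

# Mix 32 bytes from the operating system into the SSL generator,
# claiming no entropy for them (the safe lower bound).
ssl.RAND_add(os.urandom(32), 0.0)

# Report whether the generator considers itself sufficiently seeded.
seeded = ssl.RAND_status()
```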
cert_time_to_seconds(timestring)~
Returns a floating-point value containing a normal seconds-after-the-epoch
time value, given the time-string representing the "notBefore" or "notAfter"
date from a certificate.
Here's an example:: >
>>> import ssl
>>> ssl.cert_time_to_seconds("May 9 00:00:00 2007 GMT")
1178694000.0
>>> import time
>>> time.ctime(ssl.cert_time_to_seconds("May 9 00:00:00 2007 GMT"))
'Wed May 9 00:00:00 2007'
>>>
<
get_server_certificate (addr, ssl_version=PROTOCOL_SSLv3, ca_certs=None)~
Given the address ``addr`` of an SSL-protected server, as a ({hostname},
{port-number}) pair, fetches the server's certificate, and returns it as a
PEM-encoded string. If ``ssl_version`` is specified, uses that version of
the SSL protocol to attempt to connect to the server. If ``ca_certs`` is
specified, it should be a file containing a list of root certificates, the
same format as used for the same parameter in wrap_socket. The call
will attempt to validate the server certificate against that set of root
certificates, and will fail if the validation attempt fails.
DER_cert_to_PEM_cert (DER_cert_bytes)~
Given a certificate as a DER-encoded blob of bytes, returns a PEM-encoded
string version of the same certificate.
PEM_cert_to_DER_cert (PEM_cert_string)~
Given a certificate as an ASCII PEM string, returns a DER-encoded sequence of
bytes for that same certificate.
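The two conversion functions are inverses of each other, which this sketch
demonstrates. The byte string below is an arbitrary placeholder, not a real
certificate; it is only assumed that the functions re-encode the bytes without
parsing them.

```python
import ssl

# Placeholder bytes standing in for a DER-encoded certificate.
der = b"\x30\x03\x02\x01\x01"

pem = ssl.DER_cert_to_PEM_cert(der)        # text with BEGIN/END markers
round_trip = ssl.PEM_cert_to_DER_cert(pem) # back to the original bytes
```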
CERT_NONE~
Value to pass to the ``cert_reqs`` parameter to wrap_socket when no
certificates will be required or validated from the other side of the socket
connection.
CERT_OPTIONAL~
Value to pass to the ``cert_reqs`` parameter to wrap_socket when no
certificates will be required from the other side of the socket connection,
but if they are provided, will be validated. Note that use of this setting
requires a valid certificate validation file also be passed as a value of the
``ca_certs`` parameter.
CERT_REQUIRED~
Value to pass to the ``cert_reqs`` parameter to wrap_socket when
certificates will be required from the other side of the socket connection.
Note that use of this setting requires a valid certificate validation file
also be passed as a value of the ``ca_certs`` parameter.
PROTOCOL_SSLv2~
Selects SSL version 2 as the channel encryption protocol.
.. warning:: >
SSL version 2 is insecure. Its use is highly discouraged.
<
PROTOCOL_SSLv23~
Selects SSL version 2 or 3 as the channel encryption protocol. This is a
setting to use with servers for maximum compatibility with the other end of
an SSL connection, but it may cause the specific ciphers chosen for the
encryption to be of fairly low quality.
PROTOCOL_SSLv3~
Selects SSL version 3 as the channel encryption protocol. For clients, this
is the maximally compatible SSL variant.
PROTOCOL_TLSv1~
Selects TLS version 1 as the channel encryption protocol. This is the most
modern version, and probably the best choice for maximum protection, if both
sides can speak it.
OPENSSL_VERSION~
The version string of the OpenSSL library loaded by the interpreter:: >
>>> ssl.OPENSSL_VERSION
'OpenSSL 0.9.8k 25 Mar 2009'
<
.. versionadded:: 2.7
OPENSSL_VERSION_INFO~
A tuple of five integers representing version information about the
OpenSSL library:: >
>>> ssl.OPENSSL_VERSION_INFO
(0, 9, 8, 11, 15)
<
.. versionadded:: 2.7
OPENSSL_VERSION_NUMBER~
The raw version number of the OpenSSL library, as a single integer:: >
>>> ssl.OPENSSL_VERSION_NUMBER
9470143L
>>> hex(ssl.OPENSSL_VERSION_NUMBER)
'0x9080bfL'
<
.. versionadded:: 2.7
SSLSocket Objects
-----------------
SSLSocket.read([nbytes=1024])~
Reads up to ``nbytes`` bytes from the SSL-encrypted channel and returns them.
SSLSocket.write(data)~
Writes the ``data`` to the other side of the connection, using the SSL
channel to encrypt. Returns the number of bytes written.
SSLSocket.getpeercert(binary_form=False)~
If there is no certificate for the peer on the other end of the connection,
returns ``None``.
If the parameter ``binary_form`` is False, and a certificate was
received from the peer, this method returns a dict instance. If the
certificate was not validated, the dict is empty. If the certificate was
validated, it returns a dict with the keys ``subject`` (the principal for
which the certificate was issued), and ``notAfter`` (the time after which the
certificate should not be trusted). The certificate was already validated,
so the ``notBefore`` and ``issuer`` fields are not returned. If a
certificate contains an instance of the {Subject Alternative Name} extension
(see RFC 3280), there will also be a ``subjectAltName`` key in the
dictionary.
The "subject" field is a tuple containing the sequence of relative
distinguished names (RDNs) given in the certificate's data structure for the
principal, and each RDN is a sequence of name-value pairs:: >
{'notAfter': 'Feb 16 16:54:50 2013 GMT',
 'subject': ((('countryName', u'US'),),
             (('stateOrProvinceName', u'Delaware'),),
             (('localityName', u'Wilmington'),),
             (('organizationName', u'Python Software Foundation'),),
             (('organizationalUnitName', u'SSL'),),
             (('commonName', u'somemachine.python.org'),))}
<
If the ``binary_form`` parameter is True, and a certificate was
provided, this method returns the DER-encoded form of the entire certificate
as a sequence of bytes, or None if the peer did not provide a
certificate. This return value is independent of validation; if validation
was required (CERT_OPTIONAL or CERT_REQUIRED), it will have
been validated, but if CERT_NONE was used to establish the
connection, the certificate, if present, will not have been validated.
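Because the ``subject`` value is a tuple of RDNs, each of which is itself a
tuple of name-value pairs, looking up a single attribute takes two nested
loops. The following is a minimal sketch; ``subject_values`` is a hypothetical
helper, and the dictionary is the sample certificate shown above rather than
the result of a real connection.

```python
def subject_values(cert, field):
    """Return every value stored under `field` in the cert's subject."""
    values = []
    for rdn in cert.get('subject', ()):
        # each RDN is a sequence of (name, value) pairs
        for name, value in rdn:
            if name == field:
                values.append(value)
    return values


# Sample dict in the shape returned by getpeercert() for a validated cert.
cert = {'notAfter': 'Feb 16 16:54:50 2013 GMT',
        'subject': ((('countryName', u'US'),),
                    (('stateOrProvinceName', u'Delaware'),),
                    (('localityName', u'Wilmington'),),
                    (('organizationName', u'Python Software Foundation'),),
                    (('organizationalUnitName', u'SSL'),),
                    (('commonName', u'somemachine.python.org'),))}

common_names = subject_values(cert, 'commonName')
print(common_names)
```

The helper returns a list because a field such as ``organizationalUnitName``
may legitimately appear more than once in a subject.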
SSLSocket.cipher()~
Returns a three-value tuple containing the name of the cipher being used, the
version of the SSL protocol that defines its use, and the number of secret
bits being used. If no connection has been established, returns ``None``.
SSLSocket.do_handshake()~
Perform a TLS/SSL handshake. If this is used with a non-blocking socket, it
may raise SSLError with an ``arg[0]`` of SSL_ERROR_WANT_READ
or SSL_ERROR_WANT_WRITE, in which case it must be called again until
it completes successfully. For example, to simulate the behavior of a
blocking socket, one might write:: >
while True:
    try:
        s.do_handshake()
        break
    except ssl.SSLError as err:
        if err.args[0] == ssl.SSL_ERROR_WANT_READ:
            # wait until the socket is readable, then retry
            select.select([s], [], [])
        elif err.args[0] == ssl.SSL_ERROR_WANT_WRITE:
            # wait until the socket is writable, then retry
            select.select([], [s], [])
        else:
            raise
<
SSLSocket.unwrap()~
Performs the SSL shutdown handshake, which removes the TLS layer from the
underlying socket, and returns the underlying socket object. This can be
used to go from encrypted operation over a connection to unencrypted. The
socket instance returned should always be used for further communication with
the other side of the connection, rather than the original socket instance
(which may not function properly after the unwrap).
.. index:: single: certificates
.. index:: single: X509 certificate
Certificates
------------
Certificates in general are part of a public-key / private-key system. In this
system, each {principal} (which may be a machine, a person, or an
organization) is assigned a unique two-part encryption key. One part of the key
is public, and is called the {public key}; the other part is kept secret, and is
called the {private key}. The two parts are related, in that if you encrypt a
message with one of the parts, you can decrypt it with the other part, and
{only} with the other part.
A certificate contains information about two principals. It contains the name
of a {subject}, and the subject's public key. It also contains a statement by a
second principal, the {issuer}, that the subject is who he claims to be, and
that this is indeed the subject's public key. The issuer's statement is signed
with the issuer's private key, which only the issuer knows. However, anyone can
verify the issuer's statement by finding the issuer's public key, decrypting the
statement with it, and comparing it to the other information in the certificate.
The certificate also contains information about the time period over which it is
valid. This is expressed as two fields, called "notBefore" and "notAfter".
In the Python use of certificates, a client or server can use a certificate to
prove who they are. The other side of a network connection can also be required
to produce a certificate, and that certificate can be validated to the
satisfaction of the client or server that requires such validation. The
connection attempt can be set to raise an exception if the validation fails.
Validation is done automatically, by the underlying OpenSSL framework; the
application need not concern itself with its mechanics. But the application
does usually need to provide sets of certificates to allow this process to take
place.
Python uses files to contain certificates. They should be formatted as "PEM"
(see RFC 1422), which is a base-64 encoded form wrapped with a header line
and a footer line:: >
-----BEGIN CERTIFICATE-----
... (certificate in base64 PEM encoding) ...
-----END CERTIFICATE-----
<
The Python files which contain certificates can contain a sequence of
certificates, sometimes called a {certificate chain}. This chain should start
with the specific certificate for the principal who "is" the client or server,
and then the certificate for the issuer of that certificate, and then the
certificate for the issuer of {that} certificate, and so on up the chain till
you get to a certificate which is {self-signed}, that is, a certificate which
has the same subject and issuer, sometimes called a {root certificate}. The
certificates should just be concatenated together in the certificate file. For
example, suppose we had a three-certificate chain, from our server certificate
to the certificate of the certification authority that signed our server
certificate, to the root certificate of the agency which issued the
certification authority's certificate:: >
-----BEGIN CERTIFICATE-----
... (certificate for your server)...
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
... (the certificate for the CA)...
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
... (the root certificate for the CA's issuer)...
-----END CERTIFICATE-----
<
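Since a chain file is nothing more than PEM blocks placed one after another,
it can be taken apart again with plain string handling. The sketch below is an
illustration only: ``split_pem_chain`` is a hypothetical helper, it looks only
at the header and footer lines described above, and it does not decode or
validate the certificates themselves.

```python
PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def split_pem_chain(text):
    """Return the individual PEM blocks found in a concatenated chain."""
    certs = []
    current = None              # lines of the block being collected
    for line in text.splitlines():
        line = line.strip()
        if line == PEM_HEADER:
            current = [line]
        elif line == PEM_FOOTER and current is not None:
            current.append(line)
            certs.append("\n".join(current) + "\n")
            current = None
        elif current is not None and line:
            current.append(line)
    return certs


# Placeholder bodies stand in for real base64 data.
chain_text = "\n".join([
    PEM_HEADER, "c2VydmVy", PEM_FOOTER,
    PEM_HEADER, "Y2E=", PEM_FOOTER,
    PEM_HEADER, "cm9vdA==", PEM_FOOTER,
]) + "\n"

blocks = split_pem_chain(chain_text)
print(len(blocks))
```

With the ordering described above, the first block would be the server's own
certificate and the last one the self-signed root.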
If you are going to require validation of the certificate presented by the other
side of the connection, you need to provide a "CA certs" file, filled with the certificate
chains for each issuer you are willing to trust. Again, this file just contains
these chains concatenated together. For validation, Python will use the first
chain it finds in the file which matches.
Some "standard" root certificates are available from various certification
authorities: `CACert.org <http://www.cacert.org/index.php?id=3>`_, `Thawte
<http://www.thawte.com/roots/>`_, `Verisign
<http://www.verisign.com/support/roots.html>`_, `Positive SSL
<http://www.PositiveSSL.com/ssl-certificate-support/cert_installation/UTN-USERFirst-Hardware.crt>`_
(used by python.org), `Equifax and GeoTrust
<http://www.geotrust.com/resources/root_certificates/index.asp>`_.
In general, if you are using SSL3 or TLS1, you don't need to put the full chain
in your "CA certs" file; you only need the root certificates, and the remote
peer is supposed to furnish the other certificates necessary to chain from its
certificate to a root certificate. See RFC 4158 for more discussion of the
way in which certification chains can be built.
If you are going to create a server that provides SSL-encrypted connection
services, you will need to acquire a certificate for that service. There are
many ways of acquiring appropriate certificates, such as buying one from a
certification authority. Another common practice is to generate a self-signed
certificate. The simplest way to do this is with the OpenSSL package, using
something like the following:: >
% openssl req -new -x509 -days 365 -nodes -out cert.pem -keyout cert.pem
Generating a 1024 bit RSA private key
.......++++++
.............................++++++
writing new private key to 'cert.pem'
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:MyState
Locality Name (eg, city) []:Some City
Organization Name (eg, company) [Internet Widgits Pty Ltd]:My Organization, Inc.
Organizational Unit Name (eg, section) []:My Group
Common Name (eg, YOUR name) []:myserver.mygroup.myorganization.com
Email Address []:ops@myserver.mygroup.myorganization.com
%
<
The disadvantage of a self-signed certificate is that it is its own root
certificate, and no one else will have it in their cache of known (and trusted)
root certificates.
Examples
--------
Testing for SSL support
^^^^^^^^^^^^^^^^^^^^^^^
To test for the presence of SSL support in a Python installation, user code
should use the following idiom:: >
try:
    import ssl
except ImportError:
    pass
else:
    pass  # do something that requires SSL support
<
Client-side operation
This example connects to an SSL server, prints the server's address and
certificate, sends some bytes, and reads part of the response:: >
import socket, ssl, pprint
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# require a certificate from the server
ssl_sock = ssl.wrap_socket(s,
                           ca_certs="/etc/ca_certs_file",
                           cert_reqs=ssl.CERT_REQUIRED)
ssl_sock.connect(('www.verisign.com', 443))
print repr(ssl_sock.getpeername())
print ssl_sock.cipher()
print pprint.pformat(ssl_sock.getpeercert())
# Send a simple HTTP request -- use httplib in actual code.
ssl_sock.write("""GET / HTTP/1.0\r
Host: www.verisign.com\r\n\r\n""")
# Read a chunk of data. Will not necessarily
# read all the data returned by the server.
data = ssl_sock.read()
# note that closing the SSLSocket will also close the underlying socket
ssl_sock.close()
<
As of September 6, 2007, the certificate printed by this program looked like
this:: >
{'notAfter': 'May 8 23:59:59 2009 GMT',
 'subject': ((('serialNumber', u'2497886'),),
             (('1.3.6.1.4.1.311.60.2.1.3', u'US'),),
             (('1.3.6.1.4.1.311.60.2.1.2', u'Delaware'),),
             (('countryName', u'US'),),
             (('postalCode', u'94043'),),
             (('stateOrProvinceName', u'California'),),
             (('localityName', u'Mountain View'),),
             (('streetAddress', u'487 East Middlefield Road'),),
             (('organizationName', u'VeriSign, Inc.'),),
             (('organizationalUnitName',
               u'Production Security Services'),),
             (('organizationalUnitName',
               u'Terms of use at www.verisign.com/rpa (c)06'),),
             (('commonName', u'www.verisign.com'),))}
<
which is a fairly poorly-formed ``subject`` field.
Server-side operation
^^^^^^^^^^^^^^^^^^^^^
For server operation, typically you'd need to have a server certificate, and
private key, each in a file. You'd open a socket, bind it to a port, call
listen on it, then start waiting for clients to connect:: >
import socket, ssl
bindsocket = socket.socket()
bindsocket.bind(('myaddr.mydomain.com', 10023))
bindsocket.listen(5)
<
When a client connects, you'd call accept on the socket to get the new socket
from the other end, and use wrap_socket to create a server-side SSL context
for it:: >
while True:
    newsocket, fromaddr = bindsocket.accept()
    connstream = ssl.wrap_socket(newsocket,
                                 server_side=True,
                                 certfile="mycertfile",
                                 keyfile="mykeyfile",
                                 ssl_version=ssl.PROTOCOL_TLSv1)
    deal_with_client(connstream)
<
Then you'd read data from the ``connstream`` and do something with it till you
are finished with the client (or the client is finished with you):: >
def deal_with_client(connstream):
    data = connstream.read()
    # null data means the client is finished with us
    while data:
        if not do_something(connstream, data):
            # we'll assume do_something returns False
            # when we're finished with client
            break
        data = connstream.read()
    # finished with client
    connstream.close()
<
And go back to listening for new client connections.
.. seealso::
Class socket.socket
Documentation of underlying socket (|py2stdlib-socket|) class
`Introducing SSL and Certificates using OpenSSL <http://old.pseudonym.org/ssl/wwwj-index.html>`_
Frederick J. Hirsch
`RFC 1422: Privacy Enhancement for Internet Electronic Mail: Part II: Certificate-Based Key Management <http://www.ietf.org/rfc/rfc1422>`_
Steve Kent
`RFC 1750: Randomness Recommendations for Security <http://www.ietf.org/rfc/rfc1750>`_
D. Eastlake et al.
`RFC 3280: Internet X.509 Public Key Infrastructure Certificate and CRL Profile <http://www.ietf.org/rfc/rfc3280>`_
Housley et al.
==============================================================================
*py2stdlib-stat*
stat~
:synopsis: Utilities for interpreting the results of os.stat(), os.lstat() and os.fstat().
The stat (|py2stdlib-stat|) module defines constants and functions for interpreting the
results of os.stat, os.fstat and os.lstat (if they
exist). For complete details about the stat (|py2stdlib-stat|), fstat and
lstat calls, consult the documentation for your system.
The stat (|py2stdlib-stat|) module defines the following functions to test for specific file
types:
S_ISDIR(mode)~
Return non-zero if the mode is from a directory.
S_ISCHR(mode)~
Return non-zero if the mode is from a character special device file.
S_ISBLK(mode)~
Return non-zero if the mode is from a block special device file.
S_ISREG(mode)~
Return non-zero if the mode is from a regular file.
S_ISFIFO(mode)~
Return non-zero if the mode is from a FIFO (named pipe).
S_ISLNK(mode)~
Return non-zero if the mode is from a symbolic link.
S_ISSOCK(mode)~
Return non-zero if the mode is from a socket.
Two additional functions are defined for more general manipulation of the file's
mode:
S_IMODE(mode)~
Return the portion of the file's mode that can be set by os.chmod; that is,
the file's permission bits, plus the sticky bit, set-group-id, and
set-user-id bits (on systems that support them).
S_IFMT(mode)~
Return the portion of the file's mode that describes the file type (used by the
S_IS* functions above).
Normally, you would use the os.path.is* functions for testing the type
of a file; the functions here are useful when you are doing multiple tests of
the same file and wish to avoid the overhead of the stat (|py2stdlib-stat|) system call
for each test. These are also useful when checking for information about a file
that isn't handled by os.path (|py2stdlib-os.path|), like the tests for block and character
devices.
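As an illustration of running several tests against one stat result, the
sketch below calls os.lstat once and then applies the S_IS* predicates,
S_IMODE and S_IFMT to the same mode value. ``classify`` is a hypothetical
helper, and the temporary directory exists only to give it something to
inspect.

```python
import os
import stat
import tempfile


def classify(path):
    """Describe a path using a single lstat() call."""
    mode = os.lstat(path).st_mode      # the only system call made here
    if stat.S_ISLNK(mode):
        kind = 'symbolic link'
    elif stat.S_ISDIR(mode):
        kind = 'directory'
    elif stat.S_ISREG(mode):
        kind = 'regular file'
    elif stat.S_ISCHR(mode) or stat.S_ISBLK(mode):
        kind = 'device'
    elif stat.S_ISFIFO(mode):
        kind = 'fifo'
    elif stat.S_ISSOCK(mode):
        kind = 'socket'
    else:
        kind = 'unknown'
    # permission bits and file-type bits, split out of the same mode
    return kind, stat.S_IMODE(mode), stat.S_IFMT(mode)


tmpdir = tempfile.mkdtemp()
try:
    dir_kind, dir_perms, dir_type = classify(tmpdir)
finally:
    os.rmdir(tmpdir)

print(dir_kind)
```

os.lstat is used rather than os.stat so that a symbolic link is reported as a
link instead of being followed.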
All the variables below are simply symbolic indexes into the 10-tuple returned
by os.stat, os.fstat or os.lstat.
ST_MODE~
Inode protection mode.
ST_INO~
Inode number.
ST_DEV~
Device inode resides on.
ST_NLINK~
Number of links to the inode.
ST_UID~
User id of the owner.
ST_GID~
Group id of the owner.
ST_SIZE~
Size in bytes of a plain file; amount of data waiting on some special files.
ST_ATIME~
Time of last access.
ST_MTIME~
Time of last modification.
ST_CTIME~
The "ctime" as reported by the operating system. On some systems (like Unix) it
is the time of the last metadata change, and on others (like Windows) it is the
creation time (see platform documentation for details).
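A short sketch of the index constants in use: the result of os.stat is indexed
with ST_SIZE and ST_MTIME and compared with the equivalent named attribute.
The temporary file and the variable names are illustrative only.

```python
import os
import stat
import tempfile

fd, path = tempfile.mkstemp()
try:
    try:
        os.write(fd, b'hello')          # five bytes of content
    finally:
        os.close(fd)
    result = os.stat(path)
    size_by_index = result[stat.ST_SIZE]    # symbolic index into the tuple
    mtime_by_index = result[stat.ST_MTIME]
    size_by_name = result.st_size           # same field, attribute access
finally:
    os.remove(path)

print(size_by_index)
```

Both spellings read the same field; the constants simply give names to the
tuple positions.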
The interpretation of "file size" changes according to the file type. For plain
files this is the size of the file in bytes. For FIFOs and sockets under most
flavors of Unix (including Linux in particular), the "size" is the number of
bytes waiting to be read at the time of the call to os.stat,
os.fstat, or os.lstat; this can sometimes be useful, especially
for polling one of these special files after a non-blocking open. The meaning
of the size field for other character and block devices varies more, depending
on the implementation of the underlying system call.
The variables below define the flags used in the ST_MODE field.
Use of the functions above is more portable than use of the first set of flags:
S_IFMT~
Bit mask for the file type bit fields.
S_IFSOCK~
Socket.
S_IFLNK~
Symbolic link.
S_IFREG~
Regular file.
S_IFBLK~
Block device.
S_IFDIR~
Directory.
S_IFCHR~
Character device.
S_IFIFO~
FIFO.
The following flags can also be used in the {mode} argument of os.chmod:
S_ISUID~
Set UID bit.
S_ISGID~
Set-group-ID bit. This bit has several special uses. For a directory
it indicates that BSD semantics is to be used for that directory:
files created there inherit their group ID from the directory, not
from the effective group ID of the creating process, and directories
created there will also get the S_ISGID bit set. For a
file that does not have the group execution bit (S_IXGRP)
set, the set-group-ID bit indicates mandatory file/record locking
(see also S_ENFMT).
S_ISVTX~
Sticky bit. When this bit is set on a directory it means that a file
in that directory can be renamed or deleted only by the owner of the
file, by the owner of the directory, or by a privileged process.
S_IRWXU~
Mask for file owner permissions.
S_IRUSR~
Owner has read permission.
S_IWUSR~
Owner has write permission.
S_IXUSR~
Owner has execute permission.
S_IRWXG~
Mask for group permissions.
S_IRGRP~
Group has read permission.
S_IWGRP~
Group has write permission.
S_IXGRP~
Group has execute permission.
S_IRWXO~
Mask for permissions for others (not in group).
S_IROTH~
Others have read permission.
S_IWOTH~
Others have write permission.
S_IXOTH~
Others have execute permission.
S_ENFMT~
System V file locking enforcement. This flag is shared with S_ISGID:
file/record locking is enforced on files that do not have the group
execution bit (S_IXGRP) set.
S_IREAD~
Unix V7 synonym for S_IRUSR.
S_IWRITE~
Unix V7 synonym for S_IWUSR.
S_IEXEC~
Unix V7 synonym for S_IXUSR.
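The permission flags are plain integers, so a mode for os.chmod is built by
OR-ing them together and tested by AND-ing. The following sketch shows both
directions; ``permission_string`` is a hypothetical helper that renders the
nine permission bits in the style of ``ls -l``.

```python
import stat

# Combine individual flags into a mode suitable for os.chmod().
private_mode = stat.S_IRUSR | stat.S_IWUSR                   # owner read/write
shared_mode = private_mode | stat.S_IRGRP | stat.S_IROTH     # plus read for all

_FLAGS = [
    (stat.S_IRUSR, 'r'), (stat.S_IWUSR, 'w'), (stat.S_IXUSR, 'x'),
    (stat.S_IRGRP, 'r'), (stat.S_IWGRP, 'w'), (stat.S_IXGRP, 'x'),
    (stat.S_IROTH, 'r'), (stat.S_IWOTH, 'w'), (stat.S_IXOTH, 'x'),
]


def permission_string(mode):
    """Render the nine permission bits of `mode` as e.g. 'rw-r--r--'."""
    return "".join(char if mode & flag else '-' for flag, char in _FLAGS)


print(permission_string(shared_mode))
```

The masks S_IRWXU, S_IRWXG and S_IRWXO are the OR of the three corresponding
read, write and execute flags.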
Example:: >
import os, sys
from stat import *
def walktree(top, callback):
    '''recursively descend the directory tree rooted at top,
       calling the callback function for each regular file'''
    for f in os.listdir(top):
        pathname = os.path.join(top, f)
        mode = os.stat(pathname)[ST_MODE]
        if S_ISDIR(mode):
            # It's a directory, recurse into it
            walktree(pathname, callback)
        elif S_ISREG(mode):
            # It's a file, call the callback function
            callback(pathname)
        else:
            # Unknown file type, print a message
            print 'Skipping %s' % pathname
def visitfile(file):
    print 'visiting', file
if __name__ == '__main__':
    walktree(sys.argv[1], visitfile)
==============================================================================
*py2stdlib-statvfs*
statvfs~
:synopsis: Constants for interpreting the result of os.statvfs().
:deprecated:
.. deprecated:: 2.6
The statvfs (|py2stdlib-statvfs|) module has been deprecated for removal in Python 3.0.
The statvfs (|py2stdlib-statvfs|) module defines constants so that the result of
os.statvfs, which returns a tuple, can be interpreted without remembering
"magic numbers." Each of the constants defined in this module is the {index} of
the entry in the tuple returned by os.statvfs that contains the
specified information.
F_BSIZE~
Preferred file system block size.
F_FRSIZE~
Fundamental file system block size.
F_BLOCKS~
Total number of blocks in the filesystem.
F_BFREE~
Total number of free blocks.
F_BAVAIL~
Free blocks available to non-super user.
F_FILES~
Total number of file nodes.
F_FFREE~
Total number of free file nodes.
F_FAVAIL~
Free nodes available to non-super user.
F_FLAG~
Flags. System dependent: see statvfs (|py2stdlib-statvfs|) man page.
F_NAMEMAX~
Maximum file name length.
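A minimal sketch of the constants in use, assuming a platform that provides
os.statvfs. Because the module is deprecated, the example falls back to
literal index values when the import fails; those fallback numbers are an
assumption that mirrors the positions of the corresponding fields in the
tuple, and ``free_bytes`` is a hypothetical helper.

```python
import os

try:
    import statvfs                       # available on Python 2 only
    F_FRSIZE = statvfs.F_FRSIZE
    F_BAVAIL = statvfs.F_BAVAIL
except ImportError:
    # Assumed fallback: positions of these fields in the result tuple.
    F_FRSIZE = 1
    F_BAVAIL = 4


def free_bytes(info):
    """Bytes available to a non-super user, from an os.statvfs() result."""
    return info[F_FRSIZE] * info[F_BAVAIL]


info = os.statvfs(os.getcwd())
available = free_bytes(info)
print(available >= 0)
```

The result object also exposes the same values as attributes (``f_frsize``,
``f_bavail`` and so on), which avoids the need for index constants.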
==============================================================================
*py2stdlib-string*
string~
:synopsis: Common string operations.
.. index:: module: re
The string (|py2stdlib-string|) module contains a number of useful constants and
classes, as well as some deprecated legacy functions that are also
available as methods on strings. In addition, Python's built-in string
classes support the sequence type methods described in the
typesseq section, and also the string-specific methods described
in the string-methods section. To output formatted strings use
template strings or the ``%`` operator described in the
string-formatting section. Also, see the re (|py2stdlib-re|) module for
string functions based on regular expressions.
String constants
----------------
The constants defined in this module are:
ascii_letters~
The concatenation of the ascii_lowercase and ascii_uppercase
constants described below. This value is not locale-dependent.
ascii_lowercase~
The lowercase letters ``'abcdefghijklmnopqrstuvwxyz'``. This value is not
locale-dependent and will not change.
ascii_uppercase~
The uppercase letters ``'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``. This value is not
locale-dependent and will not change.
digits~
The string ``'0123456789'``.
hexdigits~
The string ``'0123456789abcdefABCDEF'``.
letters~
The concatenation of the strings lowercase and uppercase
described below. The specific value is locale-dependent, and will be updated
when locale.setlocale is called.
lowercase~
A string containing all the characters that are considered lowercase letters.
On most systems this is the string ``'abcdefghijklmnopqrstuvwxyz'``. The
specific value is locale-dependent, and will be updated when
locale.setlocale is called.
octdigits~
The string ``'01234567'``.
punctuation~
String of ASCII characters which are considered punctuation characters in the
``C`` locale.
printable~
String of characters which are considered printable. This is a combination of
digits, letters, punctuation, and
whitespace.
uppercase~
A string containing all the characters that are considered uppercase letters.
On most systems this is the string ``'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``. The
specific value is locale-dependent, and will be updated when
locale.setlocale is called.
whitespace~
A string containing all characters that are considered whitespace. On most
systems this includes the characters space, tab, linefeed, return, formfeed, and
vertical tab.
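The constants are ordinary strings, so they are typically used for membership
tests. The sketch below uses only the locale-independent constants;
``strip_punctuation`` and ``is_hex`` are hypothetical helpers written for this
illustration.

```python
import string


def strip_punctuation(text):
    """Remove every character listed in string.punctuation."""
    return "".join(ch for ch in text if ch not in string.punctuation)


def is_hex(text):
    """True if text is non-empty and made only of hexadecimal digits."""
    return bool(text) and all(ch in string.hexdigits for ch in text)


cleaned = strip_punctuation("Hello, world!")
print(cleaned)
print(is_hex("9080bF"))
```

Note that ``hexdigits`` contains both cases of the letters but not the ``x``
of a ``0x`` prefix, so a prefixed literal is rejected by ``is_hex``.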
String Formatting
-----------------
.. versionadded:: 2.6
The built-in str and unicode classes provide the ability
to do complex variable substitutions and value formatting via the
str.format method described in PEP 3101. The Formatter
class in the string (|py2stdlib-string|) module allows you to create and customize your own
string formatting behaviors using the same implementation as the built-in
format method.
Formatter~
The Formatter class has the following public methods:
format(format_string, *args, **kwargs)~
format is the primary API method. It takes a format template
string, and an arbitrary set of positional and keyword arguments.
format is just a wrapper that calls vformat.
vformat(format_string, args, kwargs)~
This function does the actual work of formatting. It is exposed as a
separate function for cases where you want to pass in a predefined
dictionary of arguments, rather than unpacking and repacking the
dictionary as individual arguments using the ``*args`` and ``**kwds``
syntax. vformat does the work of breaking up the format template
string into character data and replacement fields. It calls the various
methods described below.
In addition, the Formatter defines a number of methods that are
intended to be replaced by subclasses:
parse(format_string)~
Loop over the format_string and return an iterable of tuples
({literal_text}, {field_name}, {format_spec}, {conversion}). This is used
by vformat to break the string into either literal text, or
replacement fields.
The values in the tuple conceptually represent a span of literal text
followed by a single replacement field. If there is no literal text
(which can happen if two replacement fields occur consecutively), then
{literal_text} will be a zero-length string. If there is no replacement
field, then the values of {field_name}, {format_spec} and {conversion}
will be ``None``.
get_field(field_name, args, kwargs)~
Given {field_name} as returned by parse (see above), convert it to
an object to be formatted. Returns a tuple (obj, used_key). The default
version takes strings of the form defined in PEP 3101, such as
"0[name]" or "label.title". {args} and {kwargs} are as passed in to
vformat. The return value {used_key} has the same meaning as the
{key} parameter to get_value.
get_value(key, args, kwargs)~
Retrieve a given field value. The {key} argument will be either an
integer or a string. If it is an integer, it represents the index of the
positional argument in {args}; if it is a string, then it represents a
named argument in {kwargs}.
The {args} parameter is set to the list of positional arguments to
vformat, and the {kwargs} parameter is set to the dictionary of
keyword arguments.
For compound field names, these functions are only called for the first
component of the field name; subsequent components are handled through
normal attribute and indexing operations.
So for example, the field expression '0.name' would cause
get_value to be called with a {key} argument of 0. The ``name``
attribute will be looked up after get_value returns by calling the
built-in getattr function.
If the index or keyword refers to an item that does not exist, then an
IndexError or KeyError should be raised.
check_unused_args(used_args, args, kwargs)~
Implement checking for unused arguments if desired. The arguments to this
function are the set of all argument keys that were actually referred to in
the format string (integers for positional arguments, and strings for
named arguments), and a reference to the {args} and {kwargs} that were
passed to vformat. The set of unused args can be calculated from these
parameters. check_unused_args is assumed to raise an exception if
the check fails.
format_field(value, format_spec)~
format_field simply calls the global format built-in. The
method is provided so that subclasses can override it.
convert_field(value, conversion)~
Converts the value (returned by get_field) given a conversion type
(as in the tuple returned by the parse method). The default
version understands 'r' (repr) and 's' (str) conversion types.
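To make the overridable methods more concrete, here is a minimal sketch of a
subclass that replaces get_value so that missing arguments are rendered as a
placeholder instead of raising an exception, followed by a look at what parse
returns. ``DefaultingFormatter`` and its ``default`` parameter are
hypothetical names chosen for this illustration.

```python
import string


class DefaultingFormatter(string.Formatter):
    """Formatter that substitutes a default for missing arguments."""

    def __init__(self, default='?'):
        string.Formatter.__init__(self)
        self.default = default

    def get_value(self, key, args, kwargs):
        try:
            # integer keys index into args, string keys into kwargs
            return string.Formatter.get_value(self, key, args, kwargs)
        except (IndexError, KeyError):
            return self.default


formatter = DefaultingFormatter()
rendered = formatter.format("{name} is {age}", name="Ann")
print(rendered)

# parse() yields (literal_text, field_name, format_spec, conversion) tuples.
pieces = list(string.Formatter().parse("Hello {name!r:>8}!"))
print(pieces)
```

Only the first component of a compound field name passes through get_value,
so a field such as ``{user.name}`` with no ``user`` argument would still fail
when the attribute lookup is applied to the placeholder string.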
Format String Syntax
--------------------
The str.format method and the Formatter class share the same
syntax for format strings (although in the case of Formatter,
subclasses can define their own format string syntax).
Format strings contain "replacement fields" surrounded by curly braces ``{}``.
Anything that is not contained in braces is considered literal text, which is
copied unchanged to the output. If you need to include a brace character in the
literal text, it can be escaped by doubling: ``{{`` and ``}}``.
The grammar for a replacement field is as follows:
.. productionlist:: sf
replacement_field: "{" [`field_name`] ["!" `conversion`] [":" `format_spec`] "}"
field_name: arg_name ("." `attribute_name` | "[" `element_index` "]")*
arg_name: [`identifier` | `integer`]
attribute_name: `identifier`
element_index: `integer` | `index_string`
index_string: <any source character except "]"> +
conversion: "r" | "s"
format_spec: <described in the next section>
In less formal terms, the replacement field can start with a {field_name} that specifies
the object whose value is to be formatted and inserted
into the output instead of the replacement field.
The {field_name} is optionally followed by a {conversion} field, which is
preceded by an exclamation point ``'!'``, and a {format_spec}, which is preceded
by a colon ``':'``. These specify a non-default format for the replacement value.
See also the formatspec section.
The {field_name} itself begins with an {arg_name} that is either a number or a
keyword. If it's a number, it refers to a positional argument, and if it's a keyword,
it refers to a named keyword argument. If the numerical arg_names in a format string
are 0, 1, 2, ... in sequence, they can all be omitted (not just some)
and the numbers 0, 1, 2, ... will be automatically inserted in that order.
The {arg_name} can be followed by any number of index or
attribute expressions. An expression of the form ``'.name'`` selects the named
attribute using getattr, while an expression of the form ``'[index]'``
does an index lookup using __getitem__.
.. versionchanged:: 2.7
The positional argument specifiers can be omitted, so ``'{} {}'`` is
equivalent to ``'{0} {1}'``.
Some simple format string examples:: >
"First, thou shalt count to {0}" # References first positional argument
"Bring me a {}" # Implicitly references the first positional argument
"From {} to {}" # Same as "From {0} to {1}"
"My quest is {name}" # References keyword argument 'name'
"Weight in tons {0.weight}" # 'weight' attribute of first positional arg
"Units destroyed: {players[0]}" # First element of keyword argument 'players'.
<
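The format strings above can be exercised with str.format as follows. The
``Cargo`` class and the argument values are made up for this illustration; the
automatically numbered form requires the behaviour added in version 2.7.

```python
class Cargo(object):
    """Tiny object with one attribute, used for the '.weight' lookup."""

    def __init__(self, weight):
        self.weight = weight


by_position = "First, thou shalt count to {0}".format(3)
auto_numbered = "From {} to {}".format("A", "B")
by_keyword = "My quest is {name}".format(name="the Grail")
by_attribute = "Weight in tons {0.weight}".format(Cargo(12))
by_index = "Units destroyed: {players[0]}".format(players=[7, 2])
escaped = "{{literal}} {0}".format("x")    # doubled braces are literal text

print(by_attribute)
print(by_index)
```

In the last line, ``{{`` and ``}}`` are emitted as single braces and only
``{0}`` is treated as a replacement field.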
The {conversion} field causes a type coercion before formatting. Normally, the
job of formatting a value is done by the __format__ method of the value
itself. However, in some cases it is desirable to force a type to be formatted
as a string, overriding its own definition of formatting. By converting the
value to a string before calling __format__, the normal formatting logic
is bypassed.
Two conversion flags are currently supported: ``'!s'`` which calls str
on the value, and ``'!r'`` which calls repr (|py2stdlib-repr|).
Some examples:: >
"Harold's a clever {0!s}" # Calls str() on the argument first
"Bring out the holy {name!r}" # Calls repr() on the argument first
<
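The effect of a conversion is easiest to see with a type that defines its own
__format__ method. In the sketch below, ``Money`` is a hypothetical class: by
default its __format__ is used, while ``!s`` converts the value with str first
so that the format specification is then applied to the resulting string.

```python
class Money(object):
    """Hypothetical value type with its own formatting rules."""

    def __init__(self, amount):
        self.amount = amount

    def __str__(self):
        return "Money(%d)" % self.amount

    def __format__(self, spec):
        # delegate the spec to the integer, then add a currency sign
        return "$" + format(self.amount, spec)


price = Money(5)
default_format = "{0}".format(price)          # uses Money.__format__
with_spec = "{0:>4}".format(price)            # spec passed to __format__
forced_str = "{0!s}".format(price)            # str() first, bypasses __format__
forced_str_padded = "{0!s:>10}".format(price) # spec applies to the string
forced_repr = "{0!r}".format("spam")          # repr() of the argument

print(default_format)
print(forced_str)
```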
The {format_spec} field contains a specification of how the value should be
presented, including such details as field width, alignment, padding, decimal
precision and so on. Each value type can define its own "formatting
mini-language" or interpretation of the {format_spec}.
Most built-in types support a common formatting mini-language, which is
described in the next section.
A {format_spec} field can also include nested replacement fields within it.
These nested replacement fields can contain only a field name; conversion flags
and format specifications are not allowed. The replacement fields within the
format_spec are substituted before the {format_spec} string is interpreted.
This allows the formatting of a value to be dynamically specified.
See the formatexamples section for some examples.
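As a small sketch of nested replacement fields, the width, alignment and
precision below are supplied as arguments and substituted into the format
specification before it is interpreted. ``pad`` is a hypothetical helper.

```python
def pad(value, width, align=">"):
    """Format value in a field whose alignment and width are arguments."""
    # {1} and {2} are replaced first, producing a spec such as ">5"
    return "{0:{1}{2}}".format(value, align, width)


right = pad("hi", 5)
left = pad("hi", 5, "<")
number = "{0:{width}.{precision}f}".format(3.14159, width=8, precision=2)

print(repr(right))
print(repr(number))
```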
Format Specification Mini-Language
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
"Format specifications" are used within replacement fields contained within a
format string to define how individual values are presented (see
formatstrings). They can also be passed directly to the built-in
format function. Each formattable type may define how the format
specification is to be interpreted.
Most built-in types implement the following options for format specifications,
although some of the formatting options are only supported by the numeric types.
A general convention is that an empty format string (``""``) produces
the same result as if you had called str on the value. A
non-empty format string typically modifies the result.
The general form of a {standard format specifier} is:
.. productionlist:: sf
format_spec: [[`fill`]`align`][`sign`][#][0][`width`][,][.`precision`][`type`]
fill: <a character other than '}'>
align: "<" | ">" | "=" | "^"
sign: "+" | "-" | " "
width: `integer`
precision: `integer`
type: "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"
The {fill} character can be any character other than '}' (which signifies the
end of the field). The presence of a fill character is signaled by the {next}
character, which must be one of the alignment options. If the second character
of {format_spec} is not a valid alignment option, then it is assumed that both
the fill character and the alignment option are absent.
The meaning of the various alignment options is as follows:
+---------+----------------------------------------------------------+
| Option | Meaning |
+=========+==========================================================+
| ``'<'`` | Forces the field to be left-aligned within the available |
| | space (this is the default). |
+---------+----------------------------------------------------------+
| ``'>'`` | Forces the field to be right-aligned within the |
| | available space. |
+---------+----------------------------------------------------------+
| ``'='`` | Forces the padding to be placed after the sign (if any) |
| | but before the digits. This is used for printing fields |
| | in the form '+000000120'. This alignment option is only |
| | valid for numeric types. |
+---------+----------------------------------------------------------+
| ``'^'`` | Forces the field to be centered within the available |
| | space. |
+---------+----------------------------------------------------------+
Note that unless a minimum field width is defined, the field width will always
be the same size as the data to fill it, so that the alignment option has no
meaning in this case.
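The alignment options can be compared side by side. The values below are
arbitrary; the last line shows that an alignment option without a width has no
visible effect.

```python
left = "[{0:<6}]".format("ab")        # left-aligned in six columns
right = "[{0:>6}]".format("ab")       # right-aligned
center = "[{0:^6}]".format("ab")      # centered
filled = "{0:*^6}".format("ab")       # '*' as the fill character
sign_aware = "{0:0=+10d}".format(120) # padding between sign and digits
no_width = "{0:>}".format("ab")       # no width given, nothing to align

print(left)
print(right)
print(center)
print(sign_aware)
```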
The {sign} option is only valid for number types, and can be one of the
following:
+---------+----------------------------------------------------------+
| Option | Meaning |
+=========+==========================================================+
| ``'+'`` | indicates that a sign should be used for both |
| | positive as well as negative numbers. |
+---------+----------------------------------------------------------+
| ``'-'`` | indicates that a sign should be used only for negative |
| | numbers (this is the default behavior). |
+---------+----------------------------------------------------------+
| space | indicates that a leading space should be used on |
| | positive numbers, and a minus sign on negative numbers. |
+---------+----------------------------------------------------------+
The ``'#'`` option is only valid for integers, and only for binary, octal, or
hexadecimal output. If present, it specifies that the output will be prefixed
by ``'0b'``, ``'0o'``, or ``'0x'``, respectively.
The ``','`` option signals the use of a comma for a thousands separator.
For a locale aware separator, use the ``'n'`` integer presentation type
instead.
.. versionchanged:: 2.7
Added the ``','`` option (see also PEP 378).
{width} is a decimal integer defining the minimum field width. If not
specified, then the field width will be determined by the content.
If the {width} field is preceded by a zero (``'0'``) character, this enables
zero-padding. This is equivalent to an {alignment} type of ``'='`` and a {fill}
character of ``'0'``.
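As a brief illustration of the equivalence described above, the following
sketch formats the same integer with the ``0`` shorthand and with an explicit
fill and alignment. The variable names are illustrative only.

```python
# Zero-padding shorthand versus an explicit '0' fill with '=' alignment.
value = -42
shorthand = '{0:08d}'.format(value)   # '0' directly before the width
explicit = '{0:0=8d}'.format(value)   # fill '0', alignment '=', width 8
print(shorthand)   # -0000042
print(explicit)    # -0000042

# '=' places the padding between the sign and the digits.
signed = '{0:0=+10d}'.format(120)
print(signed)      # +000000120
```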
The {precision} is a decimal number indicating how many digits should be
displayed after the decimal point for a floating point value formatted with
``'f'`` and ``'F'``, or before and after the decimal point for a floating point
value formatted with ``'g'`` or ``'G'``. For non-number types the field
indicates the maximum field size - in other words, how many characters will be
used from the field content. The {precision} is not allowed for integer values.
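The effect of {precision} on the different kinds of values can be sketched as
follows. The sample values are arbitrary, and the last step only checks that a
precision combined with an integer presentation type is rejected.

```python
pi = 3.14159265

fixed = '{0:.2f}'.format(pi)             # two digits after the decimal point
general = '{0:.3g}'.format(pi)           # three significant digits in total
clipped = '{0:.5}'.format('truncated')   # at most five characters are used
print(fixed)     # 3.14
print(general)   # 3.14
print(clipped)   # trunc

# A precision is not allowed together with an integer presentation type.
try:
    '{0:.2d}'.format(5)
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```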
Finally, the {type} determines how the data should be presented.
The available string presentation types are:
+---------+----------------------------------------------------------+
| Type | Meaning |
+=========+==========================================================+
| ``'s'`` | String format. This is the default type for strings and |
| | may be omitted. |
+---------+----------------------------------------------------------+
| None | The same as ``'s'``. |
+---------+----------------------------------------------------------+
The available integer presentation types are:
+---------+----------------------------------------------------------+
| Type | Meaning |
+=========+==========================================================+
| ``'b'`` | Binary format. Outputs the number in base 2. |
+---------+----------------------------------------------------------+
| ``'c'`` | Character. Converts the integer to the corresponding |
| | unicode character before printing. |
+---------+----------------------------------------------------------+
| ``'d'`` | Decimal Integer. Outputs the number in base 10. |
+---------+----------------------------------------------------------+
| ``'o'`` | Octal format. Outputs the number in base 8. |
+---------+----------------------------------------------------------+
| ``'x'`` | Hex format. Outputs the number in base 16, using lower- |
| | case letters for the digits above 9. |
+---------+----------------------------------------------------------+
| ``'X'`` | Hex format. Outputs the number in base 16, using upper- |
| | case letters for the digits above 9. |
+---------+----------------------------------------------------------+
| ``'n'`` | Number. This is the same as ``'d'``, except that it uses |
| | the current locale setting to insert the appropriate |
| | number separator characters. |
+---------+----------------------------------------------------------+
| None | The same as ``'d'``. |
+---------+----------------------------------------------------------+
In addition to the above presentation types, integers can be formatted
with the floating point presentation types listed below (except
``'n'`` and None). When doing so, float is used to convert the
integer to a floating point number before formatting.
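A short sketch of the two points above that are easy to overlook: the ``'c'``
type, and the use of floating point presentation types with an integer
argument. The values are arbitrary.

```python
number = 255

default = '{0}'.format(number)        # no type given: same as 'd'
decimal = '{0:d}'.format(number)
character = '{0:c}'.format(65)        # integer converted to a character
print(default == decimal)   # True
print(character)            # A

# Floating point presentation types accept integers as well; the integer is
# converted to a float before it is formatted.
as_fixed = '{0:.1f}'.format(number)
as_exponent = '{0:e}'.format(1000)
print(as_fixed)      # 255.0
print(as_exponent)   # 1.000000e+03
```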
The available presentation types for floating point and decimal values are:
+---------+----------------------------------------------------------+
| Type | Meaning |
+=========+==========================================================+
| ``'e'`` | Exponent notation. Prints the number in scientific |
| | notation using the letter 'e' to indicate the exponent. |
+---------+----------------------------------------------------------+
| ``'E'`` | Exponent notation. Same as ``'e'`` except it uses an |
| | upper case 'E' as the separator character. |
+---------+----------------------------------------------------------+
| ``'f'`` | Fixed point. Displays the number as a fixed-point |
| | number. |
+---------+----------------------------------------------------------+
| ``'F'`` | Fixed point. Same as ``'f'``. |
+---------+----------------------------------------------------------+
| ``'g'`` | General format. For a given precision ``p >= 1``, |
| | this rounds the number to ``p`` significant digits and |
| | then formats the result in either fixed-point format |
| | or in scientific notation, depending on its magnitude. |
| | |
| | The precise rules are as follows: suppose that the |
| | result formatted with presentation type ``'e'`` and |
| | precision ``p-1`` would have exponent ``exp``. Then |
| | if ``-4 <= exp < p``, the number is formatted |
| | with presentation type ``'f'`` and precision |
| | ``p-1-exp``. Otherwise, the number is formatted |
| | with presentation type ``'e'`` and precision ``p-1``. |
| | In both cases insignificant trailing zeros are removed |
| | from the significand, and the decimal point is also |
| | removed if there are no remaining digits following it. |
| | |
| | Positive and negative infinity, positive and negative |
| | zero, and nans, are formatted as ``inf``, ``-inf``, |
| | ``0``, ``-0`` and ``nan`` respectively, regardless of |
| | the precision. |
| | |
| | A precision of ``0`` is treated as equivalent to a |
| | precision of ``1``. |
+---------+----------------------------------------------------------+
| ``'G'`` | General format. Same as ``'g'`` except switches to |
| | ``'E'`` if the number gets too large. The |
| | representations of infinity and NaN are uppercased, too. |
+---------+----------------------------------------------------------+
| ``'n'`` | Number. This is the same as ``'g'``, except that it uses |
| | the current locale setting to insert the appropriate |
| | number separator characters. |
+---------+----------------------------------------------------------+
| ``'%'`` | Percentage. Multiplies the number by 100 and displays |
| | in fixed (``'f'``) format, followed by a percent sign. |
+---------+----------------------------------------------------------+
| None | The same as ``'g'``. |
+---------+----------------------------------------------------------+
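The switch between fixed-point and scientific notation made by ``'g'`` can be
observed with a precision of 6. This is a minimal sketch with arbitrarily
chosen sample values; the exponents of the four samples are -5, -4, 5 and 6.

```python
samples = [0.00001234, 0.0001234, 123456.0, 1234567.0]

# Fixed-point is used only while -4 <= exp < 6 holds.
general = ['{0:.6g}'.format(x) for x in samples]
print(general)
# ['1.234e-05', '0.0001234', '123456', '1.23457e+06']

percent = '{0:.1%}'.format(0.256)          # multiplied by 100, then 'f'
upper_inf = '{0:G}'.format(float('inf'))   # 'G' uppercases infinity and NaN
print(percent)     # 25.6%
print(upper_inf)   # INF
```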
Format examples
^^^^^^^^^^^^^^^
This section contains examples of the new format syntax and comparison with
the old ``%``-formatting.
In most of the cases the syntax is similar to the old ``%``-formatting, with the
addition of the ``{}`` and with ``:`` used instead of ``%``.
For example, ``'%03.2f'`` can be translated to ``'{:03.2f}'``.
The new format syntax also supports new and different options, shown in the
following examples.
Accessing arguments by position:: >
>>> '{0}, {1}, {2}'.format('a', 'b', 'c')
'a, b, c'
>>> '{}, {}, {}'.format('a', 'b', 'c') # 2.7+ only
'a, b, c'
>>> '{2}, {1}, {0}'.format('a', 'b', 'c')
'c, b, a'
>>> '{2}, {1}, {0}'.format(*'abc') # unpacking argument sequence
'c, b, a'
>>> '{0}{1}{0}'.format('abra', 'cad') # arguments' indices can be repeated
'abracadabra'
<
Accessing arguments by name::
>>> 'Coordinates: {latitude}, {longitude}'.format(latitude='37.24N', longitude='-115.81W')
'Coordinates: 37.24N, -115.81W'
>>> coord = {'latitude': '37.24N', 'longitude': '-115.81W'}
>>> 'Coordinates: {latitude}, {longitude}'.format(**coord)
'Coordinates: 37.24N, -115.81W'
Accessing arguments' attributes:: >
>>> c = 3-5j
>>> ('The complex number {0} is formed from the real part {0.real} '
... 'and the imaginary part {0.imag}.').format(c)
'The complex number (3-5j) is formed from the real part 3.0 and the imaginary part -5.0.'
>>> class Point(object):
...     def __init__(self, x, y):
...         self.x, self.y = x, y
...     def __str__(self):
...         return 'Point({self.x}, {self.y})'.format(self=self)
...
>>> str(Point(4, 2))
'Point(4, 2)'
<
Accessing arguments' items::
>>> coord = (3, 5)
>>> 'X: {0[0]}; Y: {0[1]}'.format(coord)
'X: 3; Y: 5'
Replacing ``%s`` and ``%r``:: >
>>> "repr() shows quotes: {!r}; str() doesn't: {!s}".format('test1', 'test2')
"repr() shows quotes: 'test1'; str() doesn't: test2"
<
Aligning the text and specifying a width::
>>> '{:<30}'.format('left aligned')
'left aligned                  '
>>> '{:>30}'.format('right aligned')
'                 right aligned'
>>> '{:^30}'.format('centered')
'           centered           '
>>> '{:*^30}'.format('centered')  # use '*' as a fill char
'***********centered***********'
Replacing ``%+f``, ``%-f``, and ``% f`` and specifying a sign:: >
>>> '{:+f}; {:+f}'.format(3.14, -3.14) # show it always
'+3.140000; -3.140000'
>>> '{: f}; {: f}'.format(3.14, -3.14) # show a space for positive numbers
' 3.140000; -3.140000'
>>> '{:-f}; {:-f}'.format(3.14, -3.14) # show only the minus -- same as '{:f}; {:f}'
'3.140000; -3.140000'
<
Replacing ``%x`` and ``%o`` and converting the value to different bases::
>>> # format also supports binary numbers
>>> "int: {0:d}; hex: {0:x}; oct: {0:o}; bin: {0:b}".format(42)
'int: 42; hex: 2a; oct: 52; bin: 101010'
>>> # with 0x, 0o, or 0b as prefix:
>>> "int: {0:d}; hex: {0:#x}; oct: {0:#o}; bin: {0:#b}".format(42)
'int: 42; hex: 0x2a; oct: 0o52; bin: 0b101010'
Using the comma as a thousands separator:: >
>>> '{:,}'.format(1234567890)
'1,234,567,890'
<
Expressing a percentage::
>>> points = 19.5
>>> total = 22
>>> 'Correct answers: {:.2%}.'.format(points/total)
'Correct answers: 88.64%.'
Using type-specific formatting:: >
>>> import datetime
>>> d = datetime.datetime(2010, 7, 4, 12, 15, 58)
>>> '{:%Y-%m-%d %H:%M:%S}'.format(d)
'2010-07-04 12:15:58'
<
Nesting arguments and more complex examples::
>>> for align, text in zip('<^>', ['left', 'center', 'right']):
...     '{0:{fill}{align}16}'.format(text, fill=align, align=align)
...
'left<<<<<<<<<<<<'
'^^^^^center^^^^^'
'>>>>>>>>>>>right'
>>>
>>> octets = [192, 168, 0, 1]
>>> '{:02X}{:02X}{:02X}{:02X}'.format(*octets)
'C0A80001'
>>> int(_, 16)
3232235521
>>>
>>> width = 5
>>> for num in range(5,12):
...     for base in 'dXob':
...         print '{0:{width}{base}}'.format(num, base=base, width=width),
...     print
...
    5     5     5   101
    6     6     6   110
    7     7     7   111
    8     8    10  1000
    9     9    11  1001
   10     A    12  1010
   11     B    13  1011
Template strings
----------------
.. versionadded:: 2.4
Templates provide simpler string substitutions as described in PEP 292.
Instead of the normal ``%``-based substitutions, Templates support ``$``-based
substitutions, using the following rules:
* ``$$`` is an escape; it is replaced with a single ``$``.
* ``$identifier`` names a substitution placeholder matching a mapping key of
``"identifier"``. By default, ``"identifier"`` must spell a Python
identifier. The first non-identifier character after the ``$`` character
terminates this placeholder specification.
* ``${identifier}`` is equivalent to ``$identifier``. It is required when valid
identifier characters follow the placeholder but are not part of the
placeholder, such as ``"${noun}ification"``.
Any other appearance of ``$`` in the string will result in a ValueError
being raised.
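The three rules can be seen together in one template string. This is a small
sketch using the Template class from the string module; the placeholder names
``noun`` and ``who`` are made up for the example.

```python
from string import Template

# ${noun} needs braces because identifier characters follow it,
# $$ is the escape for a literal dollar sign, and $who is a plain placeholder.
template = Template('${noun}ification costs $$5 for $who')
result = template.substitute(noun='gam', who='tim')
print(result)   # gamification costs $5 for tim
```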
The string (|py2stdlib-string|) module provides a Template class that implements
these rules. The methods of Template are:
Template(template)~
The constructor takes a single argument which is the template string.
substitute(mapping[, **kws])~
Performs the template substitution, returning a new string. {mapping} is
any dictionary-like object with keys that match the placeholders in the
template. Alternatively, you can provide keyword arguments, where the
keywords are the placeholders. When both {mapping} and {kws} are given
and there are duplicates, the placeholders from {kws} take precedence.
safe_substitute(mapping[, **kws])~
Like substitute, except that if placeholders are missing from
{mapping} and {kws}, instead of raising a KeyError exception, the
original placeholder will appear in the resulting string intact. Also,
unlike with substitute, any other appearances of the ``$`` will
simply return ``$`` instead of raising ValueError.
While other exceptions may still occur, this method is called "safe"
because it always tries to return a usable string instead of
raising an exception. In another sense, safe_substitute may be
anything other than safe, since it will silently ignore malformed
templates containing dangling delimiters, unmatched braces, or
placeholders that are not valid Python identifiers.
Template instances also provide one public data attribute:
template~
This is the object passed to the constructor's {template} argument. In
general, you shouldn't change it, but read-only access is not enforced.
Here is an example of how to use a Template:
>>> from string import Template
>>> s = Template('$who likes $what')
>>> s.substitute(who='tim', what='kung pao')
'tim likes kung pao'
>>> d = dict(who='tim')
>>> Template('Give $who $100').substitute(d)
Traceback (most recent call last):
[...]
ValueError: Invalid placeholder in string: line 1, col 10
>>> Template('$who likes $what').substitute(d)
Traceback (most recent call last):
[...]
KeyError: 'what'
>>> Template('$who likes $what').safe_substitute(d)
'tim likes $what'
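Two details from the method descriptions are not covered by the session above:
keyword arguments take precedence over the mapping, and safe_substitute leaves
an invalid ``$`` in place. A minimal sketch, with made-up values:

```python
from string import Template

greeting = Template('$who likes $what')
defaults = {'who': 'tim', 'what': 'kung pao'}

# 'what' is given both in the mapping and as a keyword; the keyword wins.
overridden = greeting.substitute(defaults, what='tofu')
print(overridden)   # tim likes tofu

# '$100' is not a valid placeholder; safe_substitute leaves the '$' untouched
# where substitute would raise ValueError.
lenient = Template('Give $who $100').safe_substitute(who='tim')
print(lenient)      # Give tim $100
```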
Advanced usage: you can derive subclasses of Template to customize the
placeholder syntax, delimiter character, or the entire regular expression used
to parse template strings. To do this, you can override these class attributes:
* {delimiter} -- This is the literal string describing a placeholder-introducing
delimiter. The default value is ``$``. Note that this should {not} be a regular
expression, as the implementation will call re.escape on this string as
needed.
* {idpattern} -- This is the regular expression describing the pattern for
non-braced placeholders (the braces will be added automatically as
appropriate). The default value is the regular expression
``[_a-z][_a-z0-9]*``.
Alternatively, you can provide the entire regular expression pattern by
overriding the class attribute {pattern}. If you do this, the value must be a
regular expression object with four named capturing groups. The capturing
groups correspond to the rules given above, along with the invalid placeholder
rule:
* {escaped} -- This group matches the escape sequence, e.g. ``$$``, in the
default pattern.
* {named} -- This group matches the unbraced placeholder name; it should not
include the delimiter in the capturing group.
* {braced} -- This group matches the brace-enclosed placeholder name; it should
not include either the delimiter or braces in the capturing group.
* {invalid} -- This group matches any other delimiter pattern (usually a single
delimiter), and it should appear last in the regular expression.
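Overriding {delimiter} is the simplest of these customizations. The sketch
below defines a hypothetical subclass, ``PercentTemplate``, that uses ``%``
instead of ``$``; the class name and the template text are invented for the
example.

```python
from string import Template


class PercentTemplate(Template):
    # Hypothetical subclass: a literal string, not a regular expression.
    delimiter = '%'


message = PercentTemplate('%who owes %{amount}USD, 100%% sure')
rendered = message.substitute(who='tim', amount=5)
print(rendered)   # tim owes 5USD, 100% sure
```

The escape sequence follows the delimiter, so a literal percent sign is written
as ``%%`` in this subclass.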
String functions
----------------
The following functions are available to operate on string and Unicode objects.
They are not available as string methods.
capwords(s[, sep])~
Split the argument into words using str.split, capitalize each word
using str.capitalize, and join the capitalized words using
str.join. If the optional second argument {sep} is absent
or ``None``, runs of whitespace characters are replaced by a single space
and leading and trailing whitespace are removed, otherwise {sep} is used to
split and join the words.
maketrans(from, to)~
Return a translation table suitable for passing to translate, that will
map each character in {from} into the character at the same position in {to};
{from} and {to} must have the same length.
.. note:: >
Don't use strings derived from lowercase and uppercase as
arguments; in some locales, these don't have the same length. For case
conversions, always use str.lower and str.upper.
<
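The difference between calling capwords with and without {sep} can be sketched
as follows; the input strings are arbitrary.

```python
import string

# Without sep: runs of whitespace collapse to one space, ends are stripped.
collapsed = string.capwords('  hello   wide world ')
print(collapsed)    # Hello Wide World

# With sep: the string is split and joined on sep only, so 'beta gamma'
# counts as a single word and only its first letter is capitalized.
hyphenated = string.capwords('alpha-beta gamma', '-')
print(hyphenated)   # Alpha-Beta gamma
```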
Deprecated string functions
---------------------------
The following functions are also defined as methods of string and
Unicode objects; see section string-methods for more information on
those. You should consider these functions as deprecated, although they will
not be removed until Python 3.0. The functions defined in this module are:
atof(s)~
.. deprecated:: 2.0
Use the float built-in function.
Convert a string to a floating point number. The string must have the standard
syntax for a floating point literal in Python, optionally preceded by a sign
(``+`` or ``-``). Note that this behaves identically to the built-in function
float when passed a string.
.. note:: >
When passing in a string, values for NaN and Infinity may be returned, depending
on the underlying C library. The specific set of strings accepted which cause
these values to be returned depends entirely on the C library and is known to
vary.
<
atoi(s[, base])~
.. deprecated:: 2.0
Use the int built-in function.
Convert string {s} to an integer in the given {base}. The string must consist
of one or more digits, optionally preceded by a sign (``+`` or ``-``). The
{base} defaults to 10. If it is 0, a default base is chosen depending on the
leading characters of the string (after stripping the sign): ``0x`` or ``0X``
means 16, ``0`` means 8, anything else means 10. If {base} is 16, a leading
``0x`` or ``0X`` is always accepted, though not required. This behaves
identically to the built-in function int when passed a string. (Also
note: for a more flexible interpretation of numeric literals, use the built-in
function eval.)
atol(s[, base])~
.. deprecated:: 2.0
Use the long built-in function.
Convert string {s} to a long integer in the given {base}. The string must
consist of one or more digits, optionally preceded by a sign (``+`` or ``-``).
The {base} argument has the same meaning as for atoi. A trailing ``l``
or ``L`` is not allowed, except if the base is 0. Note that when invoked
without {base} or with {base} set to 10, this behaves identically to the built-in
function long when passed a string.
capitalize(word)~
Return a copy of {word} with only its first character capitalized.
expandtabs(s[, tabsize])~
Expand tabs in a string replacing them by one or more spaces, depending on the
current column and the given tab size. The column number is reset to zero after
each newline occurring in the string. This doesn't understand other non-printing
characters or escape sequences. The tab size defaults to 8.
find(s, sub[, start[, end]])~
Return the lowest index in {s} where the substring {sub} is found such that
{sub} is wholly contained in ``s[start:end]``. Return ``-1`` on failure.
Defaults for {start} and {end} and the interpretation of negative values are the
same as for slices.
rfind(s, sub[, start[, end]])~
Like find but find the highest index.
index(s, sub[, start[, end]])~
Like find but raise ValueError when the substring is not found.
rindex(s, sub[, start[, end]])~
Like rfind but raise ValueError when the substring is not found.
count(s, sub[, start[, end]])~
Return the number of (non-overlapping) occurrences of substring {sub} in string
``s[start:end]``. Defaults for {start} and {end} and interpretation of negative
values are the same as for slices.
lower(s)~
Return a copy of {s}, but with upper case letters converted to lower case.
split(s[, sep[, maxsplit]])~
Return a list of the words of the string {s}. If the optional second argument
{sep} is absent or ``None``, the words are separated by arbitrary strings of
whitespace characters (space, tab, newline, return, formfeed). If the second
argument {sep} is present and not ``None``, it specifies a string to be used as
the word separator. The returned list will then have one more item than the
number of non-overlapping occurrences of the separator in the string. The
optional third argument {maxsplit} defaults to 0. If it is nonzero, at most
{maxsplit} number of splits occur, and the remainder of the string is returned
as the final element of the list (thus, the list will have at most
``maxsplit+1`` elements).
The behavior of split on an empty string depends on the value of {sep}. If {sep}
is not specified, or specified as ``None``, the result will be an empty list.
If {sep} is specified as any string, the result will be a list containing one
element which is an empty string.
rsplit(s[, sep[, maxsplit]])~
Return a list of the words of the string {s}, scanning {s} from the end. To all
intents and purposes, the resulting list of words is the same as returned by
split, except when the optional third argument {maxsplit} is explicitly
specified and nonzero. When {maxsplit} is nonzero, at most {maxsplit} number of
splits -- the {rightmost} ones -- occur, and the remainder of the string is
returned as the first element of the list (thus, the list will have at most
``maxsplit+1`` elements).
.. versionadded:: 2.4
splitfields(s[, sep[, maxsplit]])~
This function behaves identically to split. (In the past, split
was only used with one argument, while splitfields was only used with
two arguments.)
join(words[, sep])~
Concatenate a list or tuple of words with intervening occurrences of {sep}.
The default value for {sep} is a single space character. It is always true that
``string.join(string.split(s, sep), sep)`` equals {s}.
joinfields(words[, sep])~
This function behaves identically to join. (In the past, join
was only used with one argument, while joinfields was only used with two
arguments.) Note that there is no joinfields method on string objects;
use the join method instead.
lstrip(s[, chars])~
Return a copy of the string with leading characters removed. If {chars} is
omitted or ``None``, whitespace characters are removed. If given and not
``None``, {chars} must be a string; the characters in the string will be
stripped from the beginning of the string this method is called on.
.. versionchanged:: 2.2.3
The {chars} parameter was added. The {chars} parameter cannot be passed in
earlier 2.2 versions.
rstrip(s[, chars])~
Return a copy of the string with trailing characters removed. If {chars} is
omitted or ``None``, whitespace characters are removed. If given and not
``None``, {chars} must be a string; the characters in the string will be
stripped from the end of the string this method is called on.
.. versionchanged:: 2.2.3
The {chars} parameter was added. The {chars} parameter cannot be passed in
earlier 2.2 versions.
strip(s[, chars])~
Return a copy of the string with leading and trailing characters removed. If
{chars} is omitted or ``None``, whitespace characters are removed. If given and
not ``None``, {chars} must be a string; the characters in the string will be
stripped from both ends of the string this method is called on.
.. versionchanged:: 2.2.3
The {chars} parameter was added. The {chars} parameter cannot be passed in
earlier 2.2 versions.
swapcase(s)~
Return a copy of {s}, but with lower case letters converted to upper case and
vice versa.
translate(s, table[, deletechars])~
Delete all characters from {s} that are in {deletechars} (if present), and then
translate the characters using {table}, which must be a 256-character string
giving the translation for each character value, indexed by its ordinal. If
{table} is ``None``, then only the character deletion step is performed.
upper(s)~
Return a copy of {s}, but with lower case letters converted to upper case.
ljust(s, width[, fillchar])~
rjust(s, width[, fillchar])~
center(s, width[, fillchar])~
These functions respectively left-justify, right-justify and center a string in
a field of given width. They return a string that is at least {width}
characters wide, created by padding the string {s} with the character {fillchar}
(default is a space) until the given width on the right, left or both sides.
The string is never truncated.
zfill(s, width)~
Pad a numeric string on the left with zero digits until the given width is
reached. Strings starting with a sign are handled correctly.
replace(str, old, new[, maxreplace])~
Return a copy of string {str} with all occurrences of substring {old} replaced
by {new}. If the optional argument {maxreplace} is given, the first
{maxreplace} occurrences are replaced.
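Because the functions above are deprecated in favour of the string methods,
the sketch below uses the methods to illustrate a few of the documented
behaviours. It assumes the methods behave like the functions for the arguments
shown (the split count is passed explicitly); the sample strings are arbitrary.

```python
# Splitting an empty string depends on whether a separator is given.
empty_default = ''.split()       # no separator: empty list
empty_sep = ''.split(',')        # separator given: one empty string
print(empty_default)   # []
print(empty_sep)       # ['']

# With a limit of 2, split keeps the remainder last and rsplit keeps it first.
left = 'a,b,c,d'.split(',', 2)
right = 'a,b,c,d'.rsplit(',', 2)
print(left)    # ['a', 'b', 'c,d']
print(right)   # ['a,b', 'c', 'd']

# Joining the pieces with the same separator restores the original string.
round_trip = ','.join('a,b,,c'.split(','))
print(round_trip)   # a,b,,c

stripped = 'xxhixx'.strip('x')   # chars argument: strip 'x' from both ends
padded = '-42'.zfill(6)          # the sign stays in front of the zeros
print(stripped)   # hi
print(padded)     # -00042
```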
==============================================================================
*py2stdlib-stringio*
StringIO~
:synopsis: Read and write strings as if they were files.
This module implements a file-like class, StringIO (|py2stdlib-stringio|), that reads and
writes a string buffer (also known as {memory files}). See the description of
file objects for operations (section bltin-file-objects). (For
standard strings, see str and unicode.)
StringIO([buffer])~
When a StringIO (|py2stdlib-stringio|) object is created, it can be initialized to an existing
string by passing the string to the constructor. If no string is given, the
StringIO (|py2stdlib-stringio|) will start empty. In both cases, the initial file position
starts at zero.
The StringIO (|py2stdlib-stringio|) object can accept either Unicode or 8-bit strings, but
mixing the two may take some care. If both are used, 8-bit strings that cannot
be interpreted as 7-bit ASCII (that use the 8th bit) will cause a
UnicodeError to be raised when getvalue is called.
The following methods of StringIO (|py2stdlib-stringio|) objects require special mention:
StringIO.getvalue()~
Retrieve the entire contents of the "file" at any time before the
StringIO (|py2stdlib-stringio|) object's close method is called. See the note above
for information about mixing Unicode and 8-bit strings; such mixing can cause
this method to raise UnicodeError.
StringIO.close()~
Free the memory buffer. Attempting to do further operations with a closed
StringIO (|py2stdlib-stringio|) object will raise a ValueError.
Example usage:: >
import StringIO
output = StringIO.StringIO()
output.write('First line.\n')
print >>output, 'Second line.'
# Retrieve file contents -- this will be
# 'First line.\nSecond line.\n'
contents = output.getvalue()
# Close object and discard memory buffer --
# .getvalue() will now raise an exception.
output.close()
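Reading works the same way as writing. The following sketch initializes the
buffer from an existing string and reads it back; the fallback import from the
io module is an assumption added only so that the snippet also runs on
Python 3, where the StringIO module does not exist.

```python
try:
    from StringIO import StringIO   # Python 2: the module described here
except ImportError:
    from io import StringIO         # assumption: fallback for Python 3

source = StringIO(u'first line\nsecond line\n')
first = source.readline()   # the initial file position is zero
rest = source.read()        # everything after the first line
source.close()

# After close(), further operations raise ValueError.
try:
    source.getvalue()
    closed_error = False
except ValueError:
    closed_error = True
print(closed_error)   # True
```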
==============================================================================
*py2stdlib-stringprep*
stringprep~
:synopsis: String preparation, as per RFC 3454
:deprecated:
.. versionadded:: 2.3
When identifying things (such as host names) on the Internet, it is often
necessary to compare such identifications for "equality". Exactly how this
comparison is executed may depend on the application domain, e.g. whether it
should be case-insensitive or not. It may also be necessary to restrict the
possible identifications, to allow only identifications consisting of
"printable" characters.
RFC 3454 defines a procedure for "preparing" Unicode strings in internet
protocols. Before passing strings onto the wire, they are processed with the
preparation procedure, after which they have a certain normalized form. The RFC
defines a set of tables, which can be combined into profiles. Each profile must
define which tables it uses, and what other optional parts of the ``stringprep``
procedure are part of the profile. One example of a ``stringprep`` profile is
``nameprep``, which is used for internationalized domain names.
The module stringprep (|py2stdlib-stringprep|) only exposes the tables from RFC 3454. Because
these tables would be very large if represented as dictionaries or lists, the
module uses the Unicode character database internally. The module source code
itself was generated using the ``mkstringprep.py`` utility.
As a result, these tables are exposed as functions, not as data structures.
There are two kinds of tables in the RFC: sets and mappings. For a set,
stringprep (|py2stdlib-stringprep|) provides the "characteristic function", i.e. a function that
returns true if the parameter is part of the set. For mappings, it provides the
mapping function: given the key, it returns the associated value. Below is a
list of all functions available in the module.
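For instance, a few of the functions listed below can be called as in the
following sketch. It shows both kinds of function: the characteristic
functions for sets and a mapping function. The sample characters were chosen
for illustration only.

```python
import stringprep

# Characteristic functions: true if the character belongs to the table.
is_ascii_space = stringprep.in_table_c11(u' ')             # C.1.1
is_mapped_to_nothing = stringprep.in_table_b1(u'\u00ad')   # B.1, SOFT HYPHEN
is_left_to_right = stringprep.in_table_d2(u'a')            # D.2, property "L"
is_right_to_left = stringprep.in_table_d1(u'\u05d0')       # D.1, HEBREW ALEF

# Mapping function: returns the value associated with the character.
folded = stringprep.map_table_b2(u'A')                     # B.2 case folding
print(folded)   # a
```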
in_table_a1(code)~
Determine whether {code} is in table A.1 (Unassigned code points in Unicode 3.2).
in_table_b1(code)~
Determine whether {code} is in table B.1 (Commonly mapped to nothing).
map_table_b2(code)~
Return the mapped value for {code} according to table B.2 (Mapping for
case-folding used with NFKC).
map_table_b3(code)~
Return the mapped value for {code} according to table B.3 (Mapping for
case-folding used with no normalization).
in_table_c11(code)~
Determine whether {code} is in table C.1.1 (ASCII space characters).
in_table_c12(code)~
Determine whether {code} is in table C.1.2 (Non-ASCII space characters).
in_table_c11_c12(code)~
Determine whether {code} is in table C.1 (Space characters, union of C.1.1 and
C.1.2).
in_table_c21(code)~
Determine whether {code} is in table C.2.1 (ASCII control characters).
in_table_c22(code)~
Determine whether {code} is in table C.2.2 (Non-ASCII control characters).
in_table_c21_c22(code)~
Determine whether {code} is in table C.2 (Control characters, union of C.2.1 and
C.2.2).
in_table_c3(code)~
Determine whether {code} is in table C.3 (Private use).
in_table_c4(code)~
Determine whether {code} is in table C.4 (Non-character code points).
in_table_c5(code)~
Determine whether {code} is in table C.5 (Surrogate codes).
in_table_c6(code)~
Determine whether {code} is in table C.6 (Inappropriate for plain text).
in_table_c7(code)~
Determine whether {code} is in table C.7 (Inappropriate for canonical
representation).
in_table_c8(code)~
Determine whether {code} is in table C.8 (Change display properties or are
deprecated).
in_table_c9(code)~
Determine whether {code} is in table C.9 (Tagging characters).
in_table_d1(code)~
Determine whether {code} is in table D.1 (Characters with bidirectional property
"R" or "AL").
in_table_d2(code)~
Determine whether {code} is in table D.2 (Characters with bidirectional property
"L").
==============================================================================
*py2stdlib-struct*
struct~
:synopsis: Interpret strings as packed binary data.
This module performs conversions between Python values and C structs represented
as Python strings. This can be used in handling binary data stored in files or
from network connections, among other sources. It uses format strings
(see Format Strings below) as compact descriptions of the layout of the C
structs and the intended conversion to/from Python values.
.. note::
By default, the result of packing a given C struct includes pad bytes in
order to maintain proper alignment for the C types involved; similarly,
alignment is taken into account when unpacking. This behavior is chosen so
that the bytes of a packed struct correspond exactly to the layout in memory
of the corresponding C struct. To handle platform-independent data formats
or omit implicit pad bytes, use `standard` size and alignment instead of
`native` size and alignment; see Byte Order, Size, and Alignment for details.
Functions and Exceptions
------------------------
The module defines the following exception and functions:
error~
Exception raised on various occasions; argument is a string describing what
is wrong.
pack(fmt, v1, v2, ...)~
Return a string containing the values ``v1, v2, ...`` packed according to the
given format. The arguments must match the values required by the format
exactly.
pack_into(fmt, buffer, offset, v1, v2, ...)~
Pack the values ``v1, v2, ...`` according to the given format, write the
packed bytes into the writable {buffer} starting at {offset}. Note that the
offset is a required argument.
.. versionadded:: 2.5
unpack(fmt, string)~
Unpack the string (presumably packed by ``pack(fmt, ...)``) according to the
given format. The result is a tuple even if it contains exactly one item.
The string must contain exactly the amount of data required by the format
(``len(string)`` must equal ``calcsize(fmt)``).
unpack_from(fmt, buffer[,offset=0])~
Unpack the {buffer} according to the given format. The result is a tuple even
if it contains exactly one item. The {buffer} must contain at least the
amount of data required by the format (``len(buffer[offset:])`` must be at
least ``calcsize(fmt)``).
.. versionadded:: 2.5
calcsize(fmt)~
Return the size of the struct (and hence of the string) corresponding to the
given format.
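As a rough illustration of how these functions fit together, the following
sketch packs two values, checks the size, and repeats the round trip through a
writable buffer. The format string ``'<HI'`` and the 10-byte bytearray are
arbitrary choices made for this example only.

```python
import struct

fmt = '<HI'  # little-endian: unsigned short followed by unsigned int

packed = struct.pack(fmt, 1, 2)        # six bytes, no pad bytes in standard mode
size = struct.calcsize(fmt)            # size of the packed data for fmt
values = struct.unpack(fmt, packed)    # the result is always a tuple

# pack_into() writes into a writable buffer at a required offset;
# unpack_from() reads back starting at the same offset.
buf = bytearray(10)
struct.pack_into(fmt, buf, 4, 1, 2)
values_from_buf = struct.unpack_from(fmt, buf, 4)
```

Note that unpack requires exactly ``calcsize(fmt)`` bytes, while unpack_from
only requires that at least that many bytes remain after the offset.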
Format Strings
--------------
Format strings are the mechanism used to specify the expected layout when
packing and unpacking data. They are built up from format-characters,
which specify the type of data being packed/unpacked. In addition, there are
special characters for controlling the struct-alignment.
Byte Order, Size, and Alignment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
By default, C types are represented in the machine's native format and byte
order, and properly aligned by skipping pad bytes if necessary (according to the
rules used by the C compiler).
Alternatively, the first character of the format string can be used to indicate
the byte order, size and alignment of the packed data, according to the
following table:
+-----------+------------------------+----------+-----------+
| Character | Byte order | Size | Alignment |
+===========+========================+==========+===========+
| ``@`` | native | native | native |
+-----------+------------------------+----------+-----------+
| ``=`` | native | standard | none |
+-----------+------------------------+----------+-----------+
| ``<`` | little-endian | standard | none |
+-----------+------------------------+----------+-----------+
| ``>`` | big-endian | standard | none |
+-----------+------------------------+----------+-----------+
| ``!`` | network (= big-endian) | standard | none |
+-----------+------------------------+----------+-----------+
If the first character is not one of these, ``'@'`` is assumed.
Native byte order is big-endian or little-endian, depending on the host
system. For example, Intel x86 and AMD64 (x86-64) are little-endian;
Motorola 68000 and PowerPC G5 are big-endian; ARM and Intel Itanium feature
switchable endianness (bi-endian). Use ``sys.byteorder`` to check the
endianness of your system.
Native size and alignment are determined using the C compiler's
``sizeof`` expression. This is always combined with native byte order.
Standard size depends only on the format character; see the table in
the format-characters section.
Note the difference between ``'@'`` and ``'='``: both use native byte order, but
the size and alignment of the latter is standardized.
The form ``'!'`` is available for those poor souls who claim they can't remember
whether network byte order is big-endian or little-endian.
There is no way to indicate non-native byte order (force byte-swapping); use the
appropriate choice of ``'<'`` or ``'>'``.
Notes:
(1) Padding is only automatically added between successive structure members.
No padding is added at the beginning or the end of the encoded struct.
(2) No padding is added when using non-native size and alignment, e.g.
with '<', '>', '=', and '!'.
(3) To align the end of a structure to the alignment requirement of a
particular type, end the format with the code for that type with a repeat
count of zero. See struct-examples.
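The effect of the first character can be seen by packing the same value with
different prefixes. This is a small sketch; the value 1 and the ``'ci'`` layout
are arbitrary, and the native size shown in the last line depends on the
platform.

```python
import struct
import sys

little = struct.pack('<I', 1)     # least significant byte first
big = struct.pack('>I', 1)        # most significant byte first
network = struct.pack('!I', 1)    # network order, same bytes as '>'

# '=' keeps the native byte order but uses standard sizes and no alignment,
# so it matches '<' or '>' depending on the host.
native_std = struct.pack('=I', 1)
expected = little if sys.byteorder == 'little' else big

# Standard modes never insert pad bytes; native mode ('@') may.
standard_size = struct.calcsize('<ci')   # 1 + 4
native_size = struct.calcsize('@ci')     # platform-dependent, often 8
```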
Format Characters
^^^^^^^^^^^^^^^^^
Format characters have the following meaning; the conversion between C and
Python values should be obvious given their types. The 'Standard size' column
refers to the size of the packed value in bytes when using standard size; that
is, when the format string starts with one of ``'<'``, ``'>'``, ``'!'`` or
``'='``. When using native size, the size of the packed value is
platform-dependent.
+--------+-------------------------+--------------------+----------------+------------+
| Format | C Type                  | Python type        | Standard size  | Notes      |
+========+=========================+====================+================+============+
| ``x``  | pad byte                | no value           |                |            |
+--------+-------------------------+--------------------+----------------+------------+
| ``c``  | char                    | string of length 1 | 1              |            |
+--------+-------------------------+--------------------+----------------+------------+
| ``b``  | signed char             | integer            | 1              | (3)        |
+--------+-------------------------+--------------------+----------------+------------+
| ``B``  | unsigned char           | integer            | 1              | (3)        |
+--------+-------------------------+--------------------+----------------+------------+
| ``?``  | _Bool                   | bool               | 1              | (1)        |
+--------+-------------------------+--------------------+----------------+------------+
| ``h``  | short                   | integer            | 2              | (3)        |
+--------+-------------------------+--------------------+----------------+------------+
| ``H``  | unsigned short          | integer            | 2              | (3)        |
+--------+-------------------------+--------------------+----------------+------------+
| ``i``  | int                     | integer            | 4              | (3)        |
+--------+-------------------------+--------------------+----------------+------------+
| ``I``  | unsigned int            | integer            | 4              | (3)        |
+--------+-------------------------+--------------------+----------------+------------+
| ``l``  | long                    | integer            | 4              | (3)        |
+--------+-------------------------+--------------------+----------------+------------+
| ``L``  | unsigned long           | integer            | 4              | (3)        |
+--------+-------------------------+--------------------+----------------+------------+
| ``q``  | long long               | integer            | 8              | (2), (3)   |
+--------+-------------------------+--------------------+----------------+------------+
| ``Q``  | unsigned long long      | integer            | 8              | (2), (3)   |
+--------+-------------------------+--------------------+----------------+------------+
| ``f``  | float                   | float              | 4              | (4)        |
+--------+-------------------------+--------------------+----------------+------------+
| ``d``  | double                  | float              | 8              | (4)        |
+--------+-------------------------+--------------------+----------------+------------+
| ``s``  | char[]                  | string             |                |            |
+--------+-------------------------+--------------------+----------------+------------+
| ``p``  | char[]                  | string             |                |            |
+--------+-------------------------+--------------------+----------------+------------+
| ``P``  | void *                  | integer            |                | (5), (3)   |
+--------+-------------------------+--------------------+----------------+------------+
Notes:
(1)
The ``'?'`` conversion code corresponds to the _Bool type defined by
C99. If this type is not available, it is simulated using a char. In
standard mode, it is always represented by one byte.
.. versionadded:: 2.6
(2)
The ``'q'`` and ``'Q'`` conversion codes are available in native mode only if
the platform C compiler supports C long long, or, on Windows,
__int64. They are always available in standard modes.
.. versionadded:: 2.2
(3)
When attempting to pack a non-integer using any of the integer conversion
codes, if the non-integer has a __index__ method then that method is
called to convert the argument to an integer before packing. If no
__index__ method exists, or the call to __index__ raises
TypeError, then the __int__ method is tried. However, the use
of __int__ is deprecated, and will raise DeprecationWarning.
.. versionchanged:: 2.7
Use of the __index__ method for non-integers is new in 2.7.
.. versionchanged:: 2.7
Prior to version 2.7, not all integer conversion codes would use the
__int__ method to convert, and DeprecationWarning was
raised only for float arguments.
(4)
For the ``'f'`` and ``'d'`` conversion codes, the packed representation uses
the IEEE 754 binary32 (for ``'f'``) or binary64 (for ``'d'``) format,
regardless of the floating-point format used by the platform.
(5)
The ``'P'`` format character is only available for the native byte ordering
(selected as the default or with the ``'@'`` byte order character). The byte
order character ``'='`` chooses to use little- or big-endian ordering based
on the host system. The struct module does not interpret this as native
ordering, so the ``'P'`` format is not available.
A format character may be preceded by an integral repeat count. For example,
the format string ``'4h'`` means exactly the same as ``'hhhh'``.
Whitespace characters between formats are ignored; a count and its format must
not contain whitespace though.
For the ``'s'`` format character, the count is interpreted as the size of the
string, not a repeat count like for the other format characters; for example,
``'10s'`` means a single 10-byte string, while ``'10c'`` means 10 characters.
For packing, the string is truncated or padded with null bytes as appropriate to
make it fit. For unpacking, the resulting string always has exactly the
specified number of bytes. As a special case, ``'0s'`` means a single, empty
string (while ``'0c'`` means 0 characters).
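The difference between a repeat count and the ``'s'`` byte count can be
sketched as follows. The values are arbitrary, and the strings are written with
the ``b''`` prefix only to make it explicit that they are byte strings.

```python
import struct

# A repeat count expands the format character: '4h' behaves like 'hhhh'.
same = struct.pack('<4h', 1, 2, 3, 4) == struct.pack('<hhhh', 1, 2, 3, 4)

# For 's' the count is the size of one string, not a repeat count.
padded = struct.pack('5s', b'ab')              # padded with null bytes
truncated = struct.pack('5s', b'abcdefgh')     # cut down to 5 bytes

one_string = struct.unpack('5s', b'hello')     # a single 5-byte string
five_chars = struct.unpack('5c', b'hello')     # five 1-byte strings

empty_string = struct.unpack('0s', b'')        # one empty string
no_chars = struct.unpack('0c', b'')            # no items at all
```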
The ``'p'`` format character encodes a "Pascal string", meaning a short
variable-length string stored in a fixed number of bytes. The count is the total
number of bytes stored. The first byte stored is the length of the string, or
255, whichever is smaller. The bytes of the string follow. If the string
passed in to pack is too long (longer than the count minus 1), only the
leading count-1 bytes of the string are stored. If the string is shorter than
count-1, it is padded with null bytes so that exactly count bytes in all are
used. Note that for unpack, the ``'p'`` format character consumes count
bytes, but that the string returned can never contain more than 255 characters.
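A short sketch of the Pascal string layout, using arbitrary sample strings:
the first byte holds the length and the remaining count-1 bytes hold the data.

```python
import struct

# count = 6: one length byte, three data bytes, two null pad bytes.
packed = struct.pack('6p', b'abc')

# count = 4: only the leading count-1 = 3 bytes of the string are stored.
too_long = struct.pack('4p', b'abcdef')

# Unpacking consumes all 6 bytes but returns only the stored string.
unpacked = struct.unpack('6p', packed)
```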
For the ``'P'`` format character, the return value is a Python integer or long
integer, depending on the size needed to hold a pointer when it has been cast to
an integer type. A {NULL} pointer will always be returned as the Python integer
``0``. When packing pointer-sized values, Python integer or long integer objects
may be used. For example, the Alpha and Merced processors use 64-bit pointer
values, meaning a Python long integer will be used to hold the pointer; other
platforms use 32-bit pointers and will use a Python integer.
For the ``'?'`` format character, the return value is either True or
False. When packing, the truth value of the argument object is used.
Either 0 or 1 in the native or standard bool representation will be packed, and
any non-zero value will be True when unpacking.
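For instance, the following sketch packs the truth value of two arbitrary
objects and unpacks a byte that is neither 0 nor 1. Standard mode (``'<'``) is
used so that the result is one byte on every platform.

```python
import struct

packed_false = struct.pack('<?', [])        # an empty list is false
packed_true = struct.pack('<?', 'spam')     # a non-empty string is true

# Any non-zero byte unpacks as True.
unpacked = struct.unpack('<?', b'\x05')
```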
Examples
^^^^^^^^
.. note::
All examples assume a native byte order, size, and alignment with a
big-endian machine.
A basic example of packing/unpacking three integers:: >
>>> from struct import *
>>> pack('hhl', 1, 2, 3)
'\x00\x01\x00\x02\x00\x00\x00\x03'
>>> unpack('hhl', '\x00\x01\x00\x02\x00\x00\x00\x03')
(1, 2, 3)
>>> calcsize('hhl')
8
<
Unpacked fields can be named by assigning them to variables or by wrapping
the result in a named tuple:: >
>>> record = 'raymond   \x32\x12\x08\x01\x08'
>>> name, serialnum, school, gradelevel = unpack('<10sHHb', record)
>>> from collections import namedtuple
>>> Student = namedtuple('Student', 'name serialnum school gradelevel')
>>> Student._make(unpack('<10sHHb', record))
Student(name='raymond   ', serialnum=4658, school=264, gradelevel=8)
<
The ordering of format characters may have an impact on size since the padding
needed to satisfy alignment requirements is different:: >
>>> pack('ci', '*', 0x12131415)
'*\x00\x00\x00\x12\x13\x14\x15'
>>> pack('ic', 0x12131415, '*')
'\x12\x13\x14\x15*'
>>> calcsize('ci')
8
>>> calcsize('ic')
5
<
The following format ``'llh0l'`` specifies two pad bytes at the end, assuming
longs are aligned on 4-byte boundaries:: >
>>> pack('llh0l', 1, 2, 3)
'\x00\x00\x00\x01\x00\x00\x00\x02\x00\x03\x00\x00'
<
This only works when native size and alignment are in effect; standard size and
alignment does not enforce any alignment.
.. seealso::
Module array (|py2stdlib-array|)
Packed binary storage of homogeneous data.
Module xdrlib (|py2stdlib-xdrlib|)
Packing and unpacking of XDR data.
Classes
-------
The struct (|py2stdlib-struct|) module also defines the following type:
Struct(format)~
Return a new Struct object which writes and reads binary data according to
the format string {format}. Creating a Struct object once and calling its
methods is more efficient than calling the struct (|py2stdlib-struct|) functions with the
same format since the format string only needs to be compiled once.
.. versionadded:: 2.5
Compiled Struct objects support the following methods and attributes:
pack(v1, v2, ...)~
Identical to the pack function, using the compiled format.
(``len(result)`` will equal self.size.)
pack_into(buffer, offset, v1, v2, ...)~
Identical to the pack_into function, using the compiled format.
unpack(string)~
Identical to the unpack function, using the compiled format.
(``len(string)`` must equal self.size).
unpack_from(buffer[, offset=0])~
Identical to the unpack_from function, using the compiled format.
(``len(buffer[offset:])`` must be at least self.size).
format~
The format string used to construct this Struct object.
size~
The calculated size of the struct (and hence of the string) corresponding
to format.
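A minimal sketch of using a compiled Struct object follows. The record layout
``'<10sHHb'`` and the field values are made up for illustration; the point is
that the format is compiled once and then reused for every record.

```python
import struct

# Compile the (hypothetical) record layout once.
record = struct.Struct('<10sHHb')

raw = record.pack(b'raymond', 4658, 264, 8)   # len(raw) equals record.size
fields = record.unpack(raw)

# Two records stored back to back in one writable buffer.
buf = bytearray(record.size * 2)
record.pack_into(buf, 0, b'raymond', 4658, 264, 8)
record.pack_into(buf, record.size, b'guido', 1, 2, 3)
second = record.unpack_from(buf, record.size)
```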
==============================================================================
*py2stdlib-subprocess*
subprocess~
:synopsis: Subprocess management.
.. versionadded:: 2.4
The subprocess (|py2stdlib-subprocess|) module allows you to spawn new processes, connect to their
input/output/error pipes, and obtain their return codes. This module intends to
replace several other, older modules and functions, such as:: >
os.system
os.spawn*
os.popen*
popen2.*
commands.*
<
Information about how the subprocess (|py2stdlib-subprocess|) module can be used to replace these
modules and functions can be found in the following sections.
.. seealso::
PEP 324 -- PEP proposing the subprocess module
Using the subprocess Module
---------------------------
This module defines one class called Popen:
Popen(args, bufsize=0, executable=None, stdin=None, stdout=None, stderr=None, preexec_fn=None, close_fds=False, shell=False, cwd=None, env=None, universal_newlines=False, startupinfo=None, creationflags=0)~
Arguments are:
{args} should be a string, or a sequence of program arguments. The program
to execute is normally the first item in the args sequence or the string if
a string is given, but can be explicitly set by using the {executable}
argument. When {executable} is given, the first item in the args sequence
is still treated by most programs as the command name, which can then be
different from the actual executable name. On Unix, it becomes the display
name for the executing program in utilities such as ps.
On Unix, with {shell=False} (default): In this case, the Popen class uses
os.execvp to execute the child program. {args} should normally be a
sequence. If a string is specified for {args}, it will be used as the name
or path of the program to execute; this will only work if the program is
being given no arguments.
.. note:: >
shlex.split can be useful when determining the correct
tokenization for {args}, especially in complex cases::
>>> import shlex, subprocess
>>> command_line = raw_input()
/bin/vikings -input eggs.txt -output "spam spam.txt" -cmd "echo '$MONEY'"
>>> args = shlex.split(command_line)
>>> print args
['/bin/vikings', '-input', 'eggs.txt', '-output', 'spam spam.txt', '-cmd', "echo '$MONEY'"]
>>> p = subprocess.Popen(args) # Success!
Note in particular that options (such as {-input}) and arguments (such
as {eggs.txt}) that are separated by whitespace in the shell go in separate
list elements, while arguments that need quoting or backslash escaping when
used in the shell (such as filenames containing spaces or the {echo} command
shown above) are single list elements.
<
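The following sketch shows the same idea in a form that can be run anywhere.
It assumes nothing about installed programs: ``sys.executable`` (the running
Python interpreter) stands in for the child program, and the option string is
an arbitrary example.

```python
import shlex
import subprocess
import sys

# Tokenize the options the way a Unix shell would: the quoted file name
# stays a single list element.
options = shlex.split('-input eggs.txt -output "spam spam.txt"')

# The child simply reports how many arguments it received.
child_code = 'import sys; print(len(sys.argv) - 1)'
args = [sys.executable, '-c', child_code] + options

p = subprocess.Popen(args, stdout=subprocess.PIPE)
out, _ = p.communicate()
argument_count = int(out)
```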
On Unix, with {shell=True}: If args is a string, it specifies the command
string to execute through the shell. This means that the string must be
formatted exactly as it would be when typed at the shell prompt. This
includes, for example, quoting or backslash escaping filenames with spaces in
them. If {args} is a sequence, the first item specifies the command string, and
any additional items will be treated as additional arguments to the shell
itself. That is to say, {Popen} does the equivalent of:: >
Popen(['/bin/sh', '-c', args[0], args[1], ...])
<
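As a small sketch of the string form, the command below relies on the shell to
interpret ``&&``. The command itself is an arbitrary example; it should behave
the same way under /bin/sh, although the exact whitespace in the output may
differ between shells.

```python
import subprocess

# The whole string is handed to the shell, which interprets '&&'.
p = subprocess.Popen('echo one && echo two', shell=True,
                     stdout=subprocess.PIPE)
out, _ = p.communicate()
words = out.split()
```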
On Windows: the Popen class uses CreateProcess() to execute the child
program, which operates on strings. If {args} is a sequence, it will be
converted to a string using the list2cmdline method. Please note that
not all MS Windows applications interpret the command line the same way:
list2cmdline is designed for applications using the same rules as the MS
C runtime.
{bufsize}, if given, has the same meaning as the corresponding argument to the
built-in open() function: 0 means unbuffered, 1 means line
buffered, any other positive value means use a buffer of (approximately) that
size. A negative {bufsize} means to use the system default, which usually means
fully buffered. The default value for {bufsize} is 0 (unbuffered).
.. note:: >
If you experience performance issues, it is recommended that you try to
enable buffering by setting {bufsize} to either -1 or a large enough
positive value (such as 4096).
<
The {executable} argument specifies the program to execute. It is very seldom
needed: Usually, the program to execute is defined by the {args} argument. If
``shell=True``, the {executable} argument specifies which shell to use. On Unix,
the default shell is /bin/sh. On Windows, the default shell is
specified by the COMSPEC environment variable. The only reason you
would need to specify ``shell=True`` on Windows is where the command you
wish to execute is actually built in to the shell, e.g. ``dir`` or ``copy``.
You don't need ``shell=True`` to run a batch file, nor to run a console-based
executable.
{stdin}, {stdout} and {stderr} specify the executed programs' standard input,
standard output and standard error file handles, respectively. Valid values
are PIPE, an existing file descriptor (a positive integer), an
existing file object, and ``None``. PIPE indicates that a new pipe
to the child should be created. With ``None``, no redirection will occur;
the child's file handles will be inherited from the parent. Additionally,
{stderr} can be STDOUT, which indicates that the stderr data from the
applications should be captured into the same file handle as for stdout.
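A brief sketch of combining PIPE and STDOUT: the child (the running Python
interpreter, used here only as a convenient stand-in program) writes to both
streams, and the parent reads them from a single pipe.

```python
import subprocess
import sys

child_code = ('import sys; sys.stdout.write("out\\n"); sys.stdout.flush(); '
              'sys.stderr.write("err\\n")')

p = subprocess.Popen([sys.executable, '-c', child_code],
                     stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
merged, err = p.communicate()

# err is None because stderr was redirected into the stdout pipe rather
# than into a pipe of its own.
```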
If {preexec_fn} is set to a callable object, this object will be called in the
child process just before the child is executed. (Unix only)
If {close_fds} is true, all file descriptors except 0, 1 and
2 will be closed before the child process is executed. (Unix only).
Or, on Windows, if {close_fds} is true then no handles will be inherited by the
child process. Note that on Windows, you cannot set {close_fds} to true and
also redirect the standard handles by setting {stdin}, {stdout} or {stderr}.
If {shell} is True, the specified command will be executed through the
shell.
If {cwd} is not ``None``, the child's current directory will be changed to {cwd}
before it is executed. Note that this directory is not considered when
searching the executable, so you can't specify the program's path relative to
{cwd}.
If {env} is not ``None``, it must be a mapping that defines the environment
variables for the new process; these are used instead of inheriting the current
process' environment, which is the default behavior.
.. note:: >
If specified, {env} must provide any variables required
for the program to execute. On Windows, in order to run a
`side-by-side assembly`_ the specified {env} {must} include a valid
SystemRoot.
<
side-by-side assembly: http://en.wikipedia.org/wiki/Side-by-Side_Assembly
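The {cwd} and {env} arguments can be sketched together as follows. The
variable name SPAM and the temporary directory are hypothetical; the
environment is built as a copy of the parent's so that everything the child
needs (PATH, SystemRoot and so on) is still present.

```python
import os
import subprocess
import sys
import tempfile

workdir = tempfile.mkdtemp()

# Start from the current environment and add one (hypothetical) variable.
child_env = dict(os.environ)
child_env['SPAM'] = 'eggs'

child_code = 'import os; print(os.environ["SPAM"]); print(os.getcwd())'

# sys.executable is an absolute path, so the lookup does not depend on cwd.
p = subprocess.Popen([sys.executable, '-c', child_code],
                     cwd=workdir, env=child_env, stdout=subprocess.PIPE)
out, _ = p.communicate()
spam_value, child_cwd = out.decode().splitlines()

os.rmdir(workdir)
```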
If {universal_newlines} is True, the file objects stdout and stderr are
opened as text files, but lines may be terminated by any of ``'\n'``, the Unix
end-of-line convention, ``'\r'``, the old Macintosh convention or ``'\r\n'``, the
Windows convention. All of these external representations are seen as ``'\n'``
by the Python program.
.. note:: >
This feature is only available if Python is built with universal newline
support (the default). Also, the newlines attribute of the file objects
stdout, stdin and stderr is not updated by the
communicate() method.
<
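A sketch of the newline translation, assuming a Unix host: the child writes
raw bytes containing all three line-ending conventions, and the parent sees
only ``'\n'``. The child program is again the running interpreter, used purely
as a stand-in.

```python
import subprocess
import sys

# os.write() sends the bytes unchanged, bypassing the child's own
# newline handling.
child_code = 'import os; os.write(1, b"a\\r\\nb\\rc\\n")'

p = subprocess.Popen([sys.executable, '-c', child_code],
                     stdout=subprocess.PIPE, universal_newlines=True)
text, _ = p.communicate()
```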
The {startupinfo} and {creationflags}, if given, will be passed to the
underlying CreateProcess() function. They can specify things such as appearance
of the main window and priority for the new process. (Windows only)
PIPE~
Special value that can be used as the {stdin}, {stdout} or {stderr} argument
to Popen and indicates that a pipe to the standard stream should be
opened.
STDOUT~
Special value that can be used as the {stderr} argument to Popen and
indicates that standard error should go into the same handle as standard
output.
Convenience Functions
^^^^^^^^^^^^^^^^^^^^^
This module also defines three shortcut functions:
call(*popenargs, **kwargs)~
Run command with arguments. Wait for command to complete, then return the
returncode attribute.
The arguments are the same as for the Popen constructor. Example:: >
>>> retcode = subprocess.call(["ls", "-l"])
<
.. warning::
Like Popen.wait, this will deadlock when using
``stdout=PIPE`` and/or ``stderr=PIPE`` and the child process
generates enough output to a pipe such that it blocks waiting
for the OS pipe buffer to accept more data.
check_call(*popenargs, **kwargs)~
Run command with arguments. Wait for command to complete. If the exit code was
zero then return, otherwise raise CalledProcessError. The
CalledProcessError object will have the return code in the
returncode attribute.
The arguments are the same as for the Popen constructor. Example:: >
>>> subprocess.check_call(["ls", "-l"])
0
<
.. versionadded:: 2.5
.. warning:: >
See the warning for call.
<
check_output(*popenargs, **kwargs)~
Run command with arguments and return its output as a byte string.
If the exit code was non-zero it raises a CalledProcessError. The
CalledProcessError object will have the return code in the returncode
attribute and the output in the output attribute.
The arguments are the same as for the Popen constructor. Example:: >
>>> subprocess.check_output(["ls", "-l", "/dev/null"])
'crw-rw-rw- 1 root root 1, 3 Oct 18 2007 /dev/null\n'
<
The stdout argument is not allowed as it is used internally.
To capture standard error in the result, use ``stderr=subprocess.STDOUT``:: >
>>> subprocess.check_output(
... ["/bin/sh", "-c", "ls non_existent_file; exit 0"],
... stderr=subprocess.STDOUT)
'ls: non_existent_file: No such file or directory\n'
<
.. versionadded:: 2.7
Exceptions
^^^^^^^^^^
Exceptions raised in the child process, before the new program has started to
execute, will be re-raised in the parent. Additionally, the exception object
will have one extra attribute called child_traceback, which is a string
containing traceback information from the child's point of view.
The most common exception raised is OSError. This occurs, for example,
when trying to execute a non-existent file. Applications should prepare for
OSError exceptions.
A ValueError will be raised if Popen is called with invalid
arguments.
check_call() and check_output() will raise CalledProcessError if the called
process returns a non-zero return code.
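The two most common failure modes can be handled along these lines. This is
only a sketch: the exit status 3 and the program path are made up, and the path
is assumed not to exist.

```python
import subprocess
import sys

failed_code = None
os_error = None

# A child that starts but exits with a non-zero status.
try:
    subprocess.check_call([sys.executable, '-c', 'import sys; sys.exit(3)'])
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode

# A program that cannot be started at all (hypothetical, non-existent path).
try:
    subprocess.Popen(['/no/such/dir/hypothetical-program'])
except OSError as exc:
    os_error = exc
```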
Security
^^^^^^^^
Unlike some other popen functions, this implementation will never call /bin/sh
implicitly. This means that all characters, including shell metacharacters, can
safely be passed to child processes.
Popen Objects
-------------
Instances of the Popen class have the following methods:
Popen.poll()~
Check if child process has terminated. Set and return returncode
attribute.
Popen.wait()~
Wait for child process to terminate. Set and return returncode
attribute.
.. warning:: >
This will deadlock when using ``stdout=PIPE`` and/or
``stderr=PIPE`` and the child process generates enough output to
a pipe such that it blocks waiting for the OS pipe buffer to
accept more data. Use communicate to avoid that.
<
Popen.communicate(input=None)~
Interact with process: Send data to stdin. Read data from stdout and stderr,
until end-of-file is reached. Wait for process to terminate. The optional
{input} argument should be a string to be sent to the child process, or
``None``, if no data should be sent to the child.
communicate returns a tuple ``(stdoutdata, stderrdata)``.
Note that if you want to send data to the process's stdin, you need to create
the Popen object with ``stdin=PIPE``. Similarly, to get anything other than
``None`` in the result tuple, you need to give ``stdout=PIPE`` and/or
``stderr=PIPE`` too.
.. note:: >
The data read is buffered in memory, so do not use this method if the data
size is large or unlimited.
<
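A minimal sketch of communicate with input: the child (the running
interpreter, used as a stand-in program) upper-cases whatever it reads from
its standard input.

```python
import subprocess
import sys

child_code = 'import sys; sys.stdout.write(sys.stdin.read().upper())'

# stdin=PIPE is needed to send data; stdout=PIPE is needed to get it back.
p = subprocess.Popen([sys.executable, '-c', child_code],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE)
out, err = p.communicate(b'spam\n')

# err is None because stderr was not redirected to a pipe.
```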
Popen.send_signal(signal)~
Sends the signal {signal} to the child.
.. note:: >
On Windows, SIGTERM is an alias for terminate. CTRL_C_EVENT and
CTRL_BREAK_EVENT can be sent to processes started with a {creationflags}
parameter which includes `CREATE_NEW_PROCESS_GROUP`.
<
.. versionadded:: 2.6
Popen.terminate()~
Stop the child. On Posix OSs the method sends SIGTERM to the
child. On Windows the Win32 API function TerminateProcess is called
to stop the child.
.. versionadded:: 2.6
Popen.kill()~
Kills the child. On Posix OSs the function sends SIGKILL to the child.
On Windows kill is an alias for terminate.
.. versionadded:: 2.6
The following attributes are also available:
.. warning::
Use communicate rather than stdin.write, stdout.read or stderr.read to avoid
deadlocks due to any of the other OS pipe buffers filling up and blocking the
child process.
Popen.stdin~
If the {stdin} argument was PIPE, this attribute is a file object
that provides input to the child process. Otherwise, it is ``None``.
Popen.stdout~
If the {stdout} argument was PIPE, this attribute is a file object
that provides output from the child process. Otherwise, it is ``None``.
Popen.stderr~
If the {stderr} argument was PIPE, this attribute is a file object
that provides error output from the child process. Otherwise, it is
``None``.
Popen.pid~
The process ID of the child process.
Note that if you set the {shell} argument to ``True``, this is the process ID
of the spawned shell.
Popen.returncode~
The child return code, set by poll and wait (and indirectly
by communicate). A ``None`` value indicates that the process
hasn't terminated yet.
A negative value ``-N`` indicates that the child was terminated by signal
``N`` (Unix only).
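The life cycle of returncode can be sketched as follows. The 30-second sleep
is an arbitrary way of keeping the child alive long enough to be observed and
then terminated.

```python
import subprocess
import sys

p = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(30)'])

still_running = p.poll()   # None: the child has not terminated yet

p.terminate()              # SIGTERM on Posix, TerminateProcess on Windows
p.wait()

# On Unix a negative value -N means the child was terminated by signal N.
code = p.returncode
```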
Replacing Older Functions with the subprocess Module
----------------------------------------------------
In this section, "a ==> b" means that b can be used as a replacement for a.
.. note::
All functions in this section fail (more or less) silently if the executed
program cannot be found; this module raises an OSError exception.
In the following examples, we assume that the subprocess module is imported with
"from subprocess import *".
Replacing /bin/sh shell backquote
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
:: >
output=`mycmd myarg`
==>
output = Popen(["mycmd", "myarg"], stdout=PIPE).communicate()[0]
<
Replacing shell pipeline
:: >
output=`dmesg | grep hda`
==>
p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
output = p2.communicate()[0]
<
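A self-contained variant of the pipeline, for experimentation: both ends are
played by the running interpreter so that the sketch does not depend on dmesg
or grep, and the device names are invented. Closing ``p1.stdout`` in the parent
is an addition not shown in the replacement above; it lets the first process
receive SIGPIPE if the second one exits early.

```python
import subprocess
import sys

producer = [sys.executable, '-c',
            'print("hda1"); print("sdb1"); print("hda2")']
consumer = [sys.executable, '-c',
            'import sys; sys.stdout.writelines('
            'line for line in sys.stdin if "hda" in line)']

p1 = subprocess.Popen(producer, stdout=subprocess.PIPE)
p2 = subprocess.Popen(consumer, stdin=p1.stdout, stdout=subprocess.PIPE)
p1.stdout.close()   # the parent no longer needs its copy of the pipe
output = p2.communicate()[0]
p1.wait()
```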
Replacing os.system
^^^^^^^^^^^^^^^^^^^
:: >
sts = os.system("mycmd" + " myarg")
==>
p = Popen("mycmd" + " myarg", shell=True)
sts = os.waitpid(p.pid, 0)[1]
<
Notes:
* Calling the program through the shell is usually not required.
* It's easier to look at the returncode attribute than the exit status.
A more realistic example would look like this:: >
try:
    retcode = call("mycmd" + " myarg", shell=True)
    if retcode < 0:
        print >>sys.stderr, "Child was terminated by signal", -retcode
    else:
        print >>sys.stderr, "Child returned", retcode
except OSError, e:
    print >>sys.stderr, "Execution failed:", e
<
Replacing the os.spawn <os.spawnl> family
P_NOWAIT example:: >
pid = os.spawnlp(os.P_NOWAIT, "/bin/mycmd", "mycmd", "myarg")
==>
pid = Popen(["/bin/mycmd", "myarg"]).pid
<
P_WAIT example:: >
retcode = os.spawnlp(os.P_WAIT, "/bin/mycmd", "mycmd", "myarg")
==>
retcode = call(["/bin/mycmd", "myarg"])
<
Vector example:: >
os.spawnvp(os.P_NOWAIT, path, args)
==>
Popen([path] + args[1:])
<
Environment example:: >
os.spawnlpe(os.P_NOWAIT, "/bin/mycmd", "mycmd", "myarg", env)
==>
Popen(["/bin/mycmd", "myarg"], env={"PATH": "/usr/bin"})
<
Replacing os.popen, os.popen2, os.popen3
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
:: >
pipe = os.popen("cmd", 'r', bufsize)
==>
pipe = Popen("cmd", shell=True, bufsize=bufsize, stdout=PIPE).stdout
<
:: >
pipe = os.popen("cmd", 'w', bufsize)
==>
pipe = Popen("cmd", shell=True, bufsize=bufsize, stdin=PIPE).stdin
<
:: >
(child_stdin, child_stdout) = os.popen2("cmd", mode, bufsize)
==>
p = Popen("cmd", shell=True, bufsize=bufsize,
stdin=PIPE, stdout=PIPE, close_fds=True)
(child_stdin, child_stdout) = (p.stdin, p.stdout)
<
:: >
(child_stdin,
 child_stdout,
 child_stderr) = os.popen3("cmd", mode, bufsize)
==>
p = Popen("cmd", shell=True, bufsize=bufsize,
          stdin=PIPE, stdout=PIPE, stderr=PIPE, close_fds=True)
(child_stdin,
 child_stdout,
 child_stderr) = (p.stdin, p.stdout, p.stderr)
<
:: >
(child_stdin, child_stdout_and_stderr) = os.popen4("cmd", mode,
bufsize)
==>
p = Popen("cmd", shell=True, bufsize=bufsize,
stdin=PIPE, stdout=PIPE, stderr=STDOUT, close_fds=True)
(child_stdin, child_stdout_and_stderr) = (p.stdin, p.stdout)
<
On Unix, os.popen2, os.popen3 and os.popen4 also accept a sequence as
the command to execute, in which case arguments will be passed
directly to the program without shell intervention. This usage can be
replaced as follows:: >
(child_stdin, child_stdout) = os.popen2(["/bin/ls", "-l"], mode,
bufsize)
==>
p = Popen(["/bin/ls", "-l"], bufsize=bufsize, stdin=PIPE, stdout=PIPE)
(child_stdin, child_stdout) = (p.stdin, p.stdout)
<
Return code handling translates as follows:: >
pipe = os.popen("cmd", 'w')
...
rc = pipe.close()
if rc is not None and rc % 256:
    print "There were some errors"
==>
process = Popen("cmd", shell=True, stdin=PIPE)
...
process.stdin.close()
if process.wait() != 0:
    print "There were some errors"
<
Replacing functions from the popen2 (|py2stdlib-popen2|) module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
:: >
(child_stdout, child_stdin) = popen2.popen2("somestring", bufsize, mode)
==>
p = Popen("somestring", shell=True, bufsize=bufsize,
stdin=PIPE, stdout=PIPE, close_fds=True)
(child_stdout, child_stdin) = (p.stdout, p.stdin)
<
On Unix, popen2 also accepts a sequence as the command to execute, in
which case arguments will be passed directly to the program without
shell intervention. This usage can be replaced as follows:: >
(child_stdout, child_stdin) = popen2.popen2(["mycmd", "myarg"], bufsize,
mode)
==>
p = Popen(["mycmd", "myarg"], bufsize=bufsize,
stdin=PIPE, stdout=PIPE, close_fds=True)
(child_stdout, child_stdin) = (p.stdout, p.stdin)
<
popen2.Popen3 and popen2.Popen4 basically work as
subprocess.Popen, except that:
* Popen raises an exception if the execution fails.
* The {capturestderr} argument is replaced with the {stderr} argument.
* ``stdin=PIPE`` and ``stdout=PIPE`` must be specified.
* popen2 closes all file descriptors by default, but you have to specify
``close_fds=True`` with Popen.
==============================================================================
*py2stdlib-sunau*
sunau~
:synopsis: Provide an interface to the Sun AU sound format.
The sunau (|py2stdlib-sunau|) module provides a convenient interface to the Sun AU sound
format. Note that this module is interface-compatible with the modules
aifc (|py2stdlib-aifc|) and wave (|py2stdlib-wave|).
An audio file consists of a header followed by the data. The fields of the
header are:
+---------------+-----------------------------------------------+
| Field         | Contents                                      |
+===============+===============================================+
| magic word    | The four bytes ``.snd``.                      |
+---------------+-----------------------------------------------+
| header size   | Size of the header, including info, in bytes. |
+---------------+-----------------------------------------------+
| data size     | Physical size of the data, in bytes.          |
+---------------+-----------------------------------------------+
| encoding      | Indicates how the audio samples are encoded.  |
+---------------+-----------------------------------------------+
| sample rate   | The sampling rate.                            |
+---------------+-----------------------------------------------+
| # of channels | The number of channels in the samples.        |
+---------------+-----------------------------------------------+
| info          | ASCII string giving a description of the      |
|               | audio file (padded with null bytes).          |
+---------------+-----------------------------------------------+
Apart from the info field, all header fields are 4 bytes in size. They are all
32-bit unsigned integers encoded in big-endian byte order.
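The header layout described above can be decoded with the struct module. This
is a minimal sketch that does not use sunau itself; the names AU_MAGIC,
FIXED_FIELDS and parse_au_header, and the sample header bytes, are made up for
the illustration.

```python
import struct

AU_MAGIC = 0x2E736E64  # the bytes ".snd" read as a big-endian 32-bit integer
FIXED_FIELDS = struct.Struct(">6L")  # six unsigned 32-bit big-endian integers


def parse_au_header(blob):
    """Decode the six fixed header fields and the info string from `blob`."""
    (magic, header_size, data_size,
     encoding, sample_rate, channels) = FIXED_FIELDS.unpack_from(blob)
    if magic != AU_MAGIC:
        raise ValueError("not a Sun AU header")
    # The info field fills the rest of the header and is padded with NULs.
    info = blob[FIXED_FIELDS.size:header_size].rstrip(b"\0")
    return {"header_size": header_size, "data_size": data_size,
            "encoding": encoding, "sample_rate": sample_rate,
            "channels": channels, "info": info}


# Hand-built sample header: 24 fixed bytes plus an 8-byte info field.
sample = FIXED_FIELDS.pack(AU_MAGIC, 32, 0, 1, 8000, 1) + b"demo\0\0\0\0"
fields = parse_au_header(sample)
```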
The sunau (|py2stdlib-sunau|) module defines the following functions:
open(file, mode)~
If {file} is a string, open the file by that name, otherwise treat it as a
seekable file-like object. {mode} can be any of
``'r'``
Read only mode.
``'w'``
Write only mode.
Note that it does not allow read/write files.
A {mode} of ``'r'`` returns an AU_read object, while a {mode} of ``'w'``
or ``'wb'`` returns an AU_write object.
openfp(file, mode)~
A synonym for open, maintained for backwards compatibility.
The sunau (|py2stdlib-sunau|) module defines the following exception:
Error~
An error raised when something is impossible because of Sun AU specs or
implementation deficiency.
The sunau (|py2stdlib-sunau|) module defines the following data items:
AUDIO_FILE_MAGIC~
An integer every valid Sun AU file begins with, stored in big-endian form. This
is the string ``.snd`` interpreted as an integer.
AUDIO_FILE_ENCODING_MULAW_8~
AUDIO_FILE_ENCODING_LINEAR_8
AUDIO_FILE_ENCODING_LINEAR_16
AUDIO_FILE_ENCODING_LINEAR_24
AUDIO_FILE_ENCODING_LINEAR_32
AUDIO_FILE_ENCODING_ALAW_8
Values of the encoding field from the AU header which are supported by this
module.
AUDIO_FILE_ENCODING_FLOAT~
AUDIO_FILE_ENCODING_DOUBLE
AUDIO_FILE_ENCODING_ADPCM_G721
AUDIO_FILE_ENCODING_ADPCM_G722
AUDIO_FILE_ENCODING_ADPCM_G723_3
AUDIO_FILE_ENCODING_ADPCM_G723_5
Additional known values of the encoding field from the AU header, but which are
not supported by this module.
AU_read Objects
---------------
AU_read objects, as returned by open above, have the following methods:
AU_read.close()~
Close the stream, and make the instance unusable. (This is called automatically
on deletion.)
AU_read.getnchannels()~
Returns number of audio channels (1 for mono, 2 for stereo).
AU_read.getsampwidth()~
Returns sample width in bytes.
AU_read.getframerate()~
Returns sampling frequency.
AU_read.getnframes()~
Returns number of audio frames.
AU_read.getcomptype()~
Returns compression type. Supported compression types are ``'ULAW'``, ``'ALAW'``
and ``'NONE'``.
AU_read.getcompname()~
Human-readable version of getcomptype. The supported types have the
respective names ``'CCITT G.711 u-law'``, ``'CCITT G.711 A-law'`` and ``'not
compressed'``.
AU_read.getparams()~
Returns a tuple ``(nchannels, sampwidth, framerate, nframes, comptype,
compname)``, equivalent to the output of the get* methods.
AU_read.readframes(n)~
Reads and returns at most {n} frames of audio, as a string of bytes. The data
will be returned in linear format. If the original data is in u-LAW format, it
will be converted.
AU_read.rewind()~
Rewind the file pointer to the beginning of the audio stream.
The following two methods define a term "position" which is compatible between
them, and is otherwise implementation dependent.
AU_read.setpos(pos)~
Set the file pointer to the specified position. Only values returned from
tell should be used for {pos}.
AU_read.tell()~
Return current file pointer position. Note that the returned value has nothing
to do with the actual position in the file.
The following two methods are defined for compatibility with the aifc (|py2stdlib-aifc|)
module, and don't do anything interesting.
AU_read.getmarkers()~
Returns ``None``.
AU_read.getmark(id)~
Raise an error.
AU_write Objects
----------------
AU_write objects, as returned by open above, have the following methods:
AU_write.setnchannels(n)~
Set the number of channels.
AU_write.setsampwidth(n)~
Set the sample width (in bytes).
AU_write.setframerate(n)~
Set the frame rate.
AU_write.setnframes(n)~
Set the number of frames. This can be later changed, when and if more frames
are written.
AU_write.setcomptype(type, name)~
Set the compression type and description. Only ``'NONE'`` and ``'ULAW'`` are
supported on output.
AU_write.setparams(tuple)~
The {tuple} should be ``(nchannels, sampwidth, framerate, nframes, comptype,
compname)``, with values valid for the set* methods. Set all
parameters.
AU_write.tell()~
Return current position in the file, with the same disclaimer as for the
AU_read.tell and AU_read.setpos methods.
AU_write.writeframesraw(data)~
Write audio frames, without correcting {nframes}.
AU_write.writeframes(data)~
Write audio frames and make sure {nframes} is correct.
AU_write.close()~
Make sure {nframes} is correct, and close the file.
This method is called upon deletion.
Note that it is invalid to set any parameters after calling writeframes
or writeframesraw.
==============================================================================
*py2stdlib-sunaudiodev*
sunaudiodev~
:platform: SunOS
:synopsis: Access to Sun audio hardware.
:deprecated:
2.6~
The sunaudiodev (|py2stdlib-sunaudiodev|) module has been deprecated for removal in Python 3.0.
This module allows you to access the Sun audio interface. The Sun audio hardware
is capable of recording and playing back audio data in u-LAW format with a
sample rate of 8K per second. A full description can be found in the
audio(7I) manual page.
The module SUNAUDIODEV (|py2stdlib-sunaudiodev^|) defines constants which may be used with this
module.
This module defines the following variables and functions:
error~
This exception is raised on all errors. The argument is a string describing what
went wrong.
open(mode)~
This function opens the audio device and returns a Sun audio device object. This
object can then be used for I/O. The {mode} parameter is one of ``'r'`` for
record-only access, ``'w'`` for play-only access, ``'rw'`` for both and
``'control'`` for access to the control device. Since only one process is
allowed to have the recorder or player open at the same time, it is a good idea
to open the device only for the activity needed. See audio(7I) for
details.
As per the manpage, this module first looks in the environment variable
``AUDIODEV`` for the base audio device filename. If not found, it falls back to
/dev/audio. The control device is calculated by appending "ctl" to the
base audio device.
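The lookup rule just described can be sketched in a few lines. This only
mirrors the documented behaviour; the function name audio_device_paths is
hypothetical and is not part of sunaudiodev, whose actual lookup happens inside
the extension module.

```python
import os


def audio_device_paths(environ=None):
    """Return (audio device, control device) following the documented rule."""
    if environ is None:
        environ = os.environ
    # AUDIODEV overrides the default base device when it is set.
    base = environ.get("AUDIODEV", "/dev/audio")
    # The control device is the base name with "ctl" appended.
    return base, base + "ctl"


default_paths = audio_device_paths({})
custom_paths = audio_device_paths({"AUDIODEV": "/dev/sound/0"})
```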
Audio Device Objects
--------------------
The audio device objects returned by open define the following
methods (except ``control`` objects, which only provide getinfo,
setinfo, fileno, and drain):
audio device.close()~
This method explicitly closes the device. It is useful in situations where
deleting the object does not immediately close it since there are other
references to it. A closed device should not be used again.
audio device.fileno()~
Returns the file descriptor associated with the device. This can be used to set
up ``SIGPOLL`` notification, as described below.
audio device.drain()~
This method waits until all pending output is processed and then returns.
Calling this method is often not necessary: destroying the object will
automatically close the audio device and this will do an implicit drain.
audio device.flush()~
This method discards all pending output. It can be used to avoid the slow response
to a user's stop request (due to buffering of up to one second of sound).
audio device.getinfo()~
This method retrieves status information like input and output volume, etc. and
returns it in the form of an audio status object. This object has no methods but
it contains a number of attributes describing the current device status. The
names and meanings of the attributes are described in ``<sun/audioio.h>`` and in
the audio(7I) manual page. Member names are slightly different from
their C counterparts: a status object is only a single structure. Members of the
play substructure have ``o_`` prepended to their name and members of
the record structure have ``i_``. So, the C member
play.sample_rate is accessed as o_sample_rate,
record.gain as i_gain and monitor_gain plainly as
monitor_gain.
audio device.ibufcount()~
This method returns the number of samples that are buffered on the recording
side, i.e. the program will not block on a read call of that many samples.
audio device.obufcount()~
This method returns the number of samples buffered on the playback side.
Unfortunately, this number cannot be used to determine a number of samples that
can be written without blocking since the kernel output queue length seems to be
variable.
audio device.read(size)~
This method reads {size} samples from the audio input and returns them as a
Python string. The function blocks until enough data is available.
audio device.setinfo(status)~
This method sets the audio device status parameters. The {status} parameter is
a device status object as returned by getinfo and possibly modified by
the program.
audio device.write(samples)~
Write is passed a Python string containing audio samples to be played. If there
is enough buffer space free it will immediately return, otherwise it will block.
The audio device supports asynchronous notification of various events, through
the SIGPOLL signal. Here's an example of how you might enable this in Python:: >
import fcntl, signal, STROPTS

def handle_sigpoll(signum, frame):
    print 'I got a SIGPOLL update'

# audio_obj is a device object previously returned by sunaudiodev.open()
signal.signal(signal.SIGPOLL, handle_sigpoll)
fcntl.ioctl(audio_obj.fileno(), STROPTS.I_SETSIG, STROPTS.S_MSG)
==============================================================================
*py2stdlib-sunaudiodev^*
SUNAUDIODEV~
:platform: SunOS
:synopsis: Constants for use with sunaudiodev.
:deprecated:
2.6~
The SUNAUDIODEV (|py2stdlib-sunaudiodev^|) module has been deprecated for removal in Python 3.0.
This is a companion module to sunaudiodev (|py2stdlib-sunaudiodev|) which defines useful symbolic
constants like MIN_GAIN, MAX_GAIN, SPEAKER, etc. The
names of the constants are the same names as used in the C include file
``<sun/audioio.h>``, with the leading string ``AUDIO_`` stripped.
==============================================================================
*py2stdlib-symbol*
symbol~
:synopsis: Constants representing internal nodes of the parse tree.
This module provides constants which represent the numeric values of internal
nodes of the parse tree. Unlike most Python constants, these use lower-case
names. Refer to the file Grammar/Grammar in the Python distribution for
the definitions of the names in the context of the language grammar. The
specific numeric values which the names map to may change between Python
versions.
This module also provides one additional data object:
sym_name~
Dictionary mapping the numeric values of the constants defined in this module
back to name strings, allowing more human-readable representation of parse trees
to be generated.
.. seealso::
Module parser (|py2stdlib-parser|)
The second example for the parser (|py2stdlib-parser|) module shows how to use the
symbol (|py2stdlib-symbol|) module.
==============================================================================
*py2stdlib-symtable*
symtable~
:synopsis: Interface to the compiler's internal symbol tables.
Symbol tables are generated by the compiler from the AST just before bytecode is
generated. The symbol table is responsible for calculating the scope of every
identifier in the code. symtable (|py2stdlib-symtable|) provides an interface to examine these
tables.
Generating Symbol Tables
------------------------
symtable(code, filename, compile_type)~
Return the toplevel SymbolTable for the Python source {code}.
{filename} is the name of the file containing the code. {compile_type} is
like the {mode} argument to compile.
Examining Symbol Tables
-----------------------
SymbolTable~
A namespace table for a block. The constructor is not public.
get_type()~
Return the type of the symbol table. Possible values are ``'class'``,
``'module'``, and ``'function'``.
get_id()~
Return the table's identifier.
get_name()~
Return the table's name. This is the name of the class if the table is
for a class, the name of the function if the table is for a function, or
``'top'`` if the table is global (get_type returns ``'module'``).
get_lineno()~
Return the number of the first line in the block this table represents.
is_optimized()~
Return ``True`` if the locals in this table can be optimized.
is_nested()~
Return ``True`` if the block is a nested class or function.
has_children()~
Return ``True`` if the block has nested namespaces within it. These can
be obtained with get_children.
has_exec()~
Return ``True`` if the block uses ``exec``.
has_import_star()~
Return ``True`` if the block uses a starred from-import.
get_identifiers()~
Return a list of names of symbols in this table.
lookup(name)~
Look up {name} in the table and return a Symbol instance.
get_symbols()~
Return a list of Symbol instances for names in the table.
get_children()~
Return a list of the nested symbol tables.
Function~
A namespace for a function or method. This class inherits
SymbolTable.
get_parameters()~
Return a tuple containing names of parameters to this function.
get_locals()~
Return a tuple containing names of locals in this function.
get_globals()~
Return a tuple containing names of globals in this function.
get_frees()~
Return a tuple containing names of free variables in this function.
Class~
A namespace of a class. This class inherits SymbolTable.
get_methods()~
Return a tuple containing the names of methods declared in the class.
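The table classes described so far can be explored as in the following sketch.
The source snippet, the file name ``"<example>"`` and the variable names are
assumptions chosen for the illustration.

```python
import symtable

SOURCE = (
    "counter = 0\n"
    "\n"
    "def bump(step):\n"
    "    total = counter + step\n"
    "    return total\n"
)

# Build the top-level (module) table, as compile(..., "exec") would see it.
top = symtable.symtable(SOURCE, "<example>", "exec")
module_names = sorted(top.get_identifiers())

# The only nested namespace is the table for the function bump.
func = top.get_children()[0]
summary = {
    "type": func.get_type(),
    "name": func.get_name(),
    "parameters": tuple(func.get_parameters()),
    "locals": sorted(func.get_locals()),
    "globals": sorted(func.get_globals()),
}
```

Here ``counter`` shows up among the globals of ``bump`` because it is read but
never assigned inside the function.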
Symbol~
An entry in a SymbolTable corresponding to an identifier in the
source. The constructor is not public.
get_name()~
Return the symbol's name.
is_referenced()~
Return ``True`` if the symbol is used in its block.
is_imported()~
Return ``True`` if the symbol is created from an import statement.
is_parameter()~
Return ``True`` if the symbol is a parameter.
is_global()~
Return ``True`` if the symbol is global.
is_declared_global()~
Return ``True`` if the symbol is declared global with a global statement.
is_local()~
Return ``True`` if the symbol is local to its block.
is_free()~
Return ``True`` if the symbol is referenced in its block, but not assigned
to.
is_assigned()~
Return ``True`` if the symbol is assigned to in its block.
is_namespace()~
Return ``True`` if the name binding introduces a new namespace.
If the name is used as the target of a function or class statement, this
will be true.
For example:: >
>>> table = symtable.symtable("def some_func(): pass", "string", "exec")
>>> table.lookup("some_func").is_namespace()
True
<
Note that a single name can be bound to multiple objects. If the result
is ``True``, the name may also be bound to other objects, like an int or
list, that do not introduce a new namespace.
get_namespaces()~
Return a list of namespaces bound to this name.
get_namespace()~
Return the namespace bound to this name. If more than one namespace is
bound, a ValueError is raised.
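A short sketch of the Symbol queries above, assuming a small made-up module
source; the variable names are illustrative only.

```python
import symtable

SOURCE = (
    "import os\n"
    "\n"
    "def join(name):\n"
    "    return os.sep + name\n"
)

top = symtable.symtable(SOURCE, "<example>", "exec")

os_symbol = top.lookup("os")        # bound by the import statement
join_symbol = top.lookup("join")    # bound by the def statement

# A def binds exactly one namespace here, so get_namespace() is safe to call.
body = join_symbol.get_namespace()
name_symbol = body.lookup("name")   # the function's parameter
os_inside = body.lookup("os")       # read inside join, defined at module level
```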
==============================================================================
*py2stdlib-sys*
sys~
:synopsis: Access system-specific parameters and functions.
This module provides access to some variables used or maintained by the
interpreter and to functions that interact strongly with the interpreter. It is
always available.
argv~
The list of command line arguments passed to a Python script. ``argv[0]`` is the
script name (it is operating system dependent whether this is a full pathname or
not). If the command was executed using the -c command line option to
the interpreter, ``argv[0]`` is set to the string ``'-c'``. If no script name
was passed to the Python interpreter, ``argv[0]`` is the empty string.
To loop over the standard input, or the list of files given on the
command line, see the fileinput (|py2stdlib-fileinput|) module.
byteorder~
An indicator of the native byte order. This will have the value ``'big'`` on
big-endian (most-significant byte first) platforms, and ``'little'`` on
little-endian (least-significant byte first) platforms.
.. versionadded:: 2.0
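One possible use of byteorder is choosing an explicit struct format that
matches the native layout. A minimal sketch, with illustrative variable names:

```python
import struct
import sys

value = 1
# "=" means native byte order with standard sizes.
native = struct.pack("=H", value)
# Pick the explicit prefix that corresponds to sys.byteorder.
prefix = "<" if sys.byteorder == "little" else ">"
explicit = struct.pack(prefix + "H", value)
```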
subversion~
A triple (repo, branch, version) representing the Subversion information of the
Python interpreter. {repo} is the name of the repository, ``'CPython'``.
{branch} is a string of one of the forms ``'trunk'``, ``'branches/name'`` or
``'tags/name'``. {version} is the output of ``svnversion``, if the interpreter
was built from a Subversion checkout; it contains the revision number (range)
and possibly a trailing 'M' if there were local modifications. If the tree was
exported (or svnversion was not available), it is the revision of
``Include/patchlevel.h`` if the branch is a tag. Otherwise, it is ``None``.
.. versionadded:: 2.5
builtin_module_names~
A tuple of strings giving the names of all modules that are compiled into this
Python interpreter. (This information is not available in any other way ---
``modules.keys()`` only lists the imported modules.)
copyright~
A string containing the copyright pertaining to the Python interpreter.
_clear_type_cache()~
Clear the internal type cache. The type cache is used to speed up attribute
and method lookups. Use the function {only} to drop unnecessary references
during reference leak debugging.
This function should be used for internal and specialized purposes only.
.. versionadded:: 2.6
_current_frames()~
Return a dictionary mapping each thread's identifier to the topmost stack frame
currently active in that thread at the time the function is called. Note that
functions in the traceback (|py2stdlib-traceback|) module can build the call stack given such a
frame.
This is most useful for debugging deadlock: this function does not require the
deadlocked threads' cooperation, and such threads' call stacks are frozen for as
long as they remain deadlocked. The frame returned for a non-deadlocked thread
may bear no relationship to that thread's current activity by the time calling
code examines the frame.
This function should be used for internal and specialized purposes only.
.. versionadded:: 2.5
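As a sketch of the debugging use described above, the frames can be handed to
the traceback module to get a readable stack per thread. The helper name
snapshot_stacks is an assumption for this example.

```python
import sys
import threading
import traceback


def snapshot_stacks():
    """Map each thread identifier to its formatted call stack."""
    stacks = {}
    for thread_id, frame in sys._current_frames().items():
        stacks[thread_id] = "".join(traceback.format_stack(frame))
    return stacks


stacks = snapshot_stacks()
# The calling thread is always present; its stack ends inside the helper.
own_stack = stacks[threading.current_thread().ident]
```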
dllhandle~
Integer specifying the handle of the Python DLL. Availability: Windows.
displayhook(value)~
If {value} is not ``None``, this function prints it to ``sys.stdout``, and saves
it in ``__builtin__._``.
``sys.displayhook`` is called on the result of evaluating an expression
entered in an interactive Python session. The display of these values can be
customized by assigning another one-argument function to ``sys.displayhook``.
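A minimal sketch of such a customization: the replacement hook below records
the repr of each value instead of printing it. The names shown and
recording_displayhook are hypothetical, and the hook is called by hand here to
stand in for the interactive interpreter.

```python
import sys

shown = []


def recording_displayhook(value):
    """Collect reprs instead of printing them; ignore None like the default."""
    if value is not None:
        shown.append(repr(value))


previous = sys.displayhook
sys.displayhook = recording_displayhook
try:
    # An interactive session would make these calls after each expression.
    sys.displayhook(6 * 7)
    sys.displayhook(None)
finally:
    sys.displayhook = previous  # put the earlier hook back
```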
excepthook(type, value, traceback)~
This function prints out a given traceback and exception to ``sys.stderr``.
When an exception is raised and uncaught, the interpreter calls
``sys.excepthook`` with three arguments, the exception class, exception
instance, and a traceback object. In an interactive session this happens just
before control is returned to the prompt; in a Python program this happens just
before the program exits. The handling of such top-level exceptions can be
customized by assigning another three-argument function to ``sys.excepthook``.
__displayhook__~
__excepthook__
These objects contain the original values of ``displayhook`` and ``excepthook``
at the start of the program. They are saved so that ``displayhook`` and
``excepthook`` can be restored in case they happen to get replaced with broken
objects.
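The following sketch installs a three-argument hook and then restores the
saved original. The names captured and recording_excepthook are assumptions,
and the hook is invoked directly so the example does not have to terminate the
program with an uncaught exception.

```python
import sys

captured = []


def recording_excepthook(exc_type, exc_value, tb):
    """Record a one-line summary instead of printing the traceback."""
    captured.append("%s: %s" % (exc_type.__name__, exc_value))


sys.excepthook = recording_excepthook
try:
    try:
        raise ValueError("bad input")
    except ValueError:
        # The interpreter would pass these three values for an uncaught error.
        sys.excepthook(*sys.exc_info())
finally:
    sys.excepthook = sys.__excepthook__  # restore the original hook
```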
exc_info()~
This function returns a tuple of three values that give information about the
exception that is currently being handled. The information returned is specific
both to the current thread and to the current stack frame. If the current stack
frame is not handling an exception, the information is taken from the calling
stack frame, or its caller, and so on until a stack frame is found that is
handling an exception. Here, "handling an exception" is defined as "executing
or having executed an except clause." For any stack frame, only information
about the most recently handled exception is accessible.
If no exception is being handled anywhere on the stack, a tuple containing three
``None`` values is returned. Otherwise, the values returned are ``(type, value,
traceback)``. Their meaning is: {type} gets the exception type of the exception
being handled (a class object); {value} gets the exception parameter (its
associated value or the second argument to raise, which is
always a class instance if the exception type is a class object); {traceback}
gets a traceback object (see the Reference Manual) which encapsulates the call
stack at the point where the exception originally occurred.
If exc_clear is called, this function will return three ``None`` values
until either another exception is raised in the current thread or the execution
stack returns to a frame where another exception is being handled.
.. warning:: >
Assigning the {traceback} return value to a local variable in a function that is
handling an exception will cause a circular reference. This will prevent
anything referenced by a local variable in the same function or by the traceback
from being garbage collected. Since most functions don't need access to the
traceback, the best solution is to use something like ``exctype, value =
sys.exc_info()[:2]`` to extract only the exception type and value. If you do
need the traceback, make sure to delete it after use (best done with a
try ... finally statement) or to call exc_info in
a function that does not itself handle an exception.
<
.. note::
Beginning with Python 2.2, such cycles are automatically reclaimed when garbage
collection is enabled and they become unreachable, but it remains more efficient
to avoid creating cycles.
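The advice above can be sketched as two small helpers: one that never touches
the traceback, and one that deletes its reference in a finally clause. The
helper names are made up for this example.

```python
import sys
import traceback


def summarize_current_exception():
    """Return (type name, message) without binding the traceback at all."""
    exc_type, exc_value = sys.exc_info()[:2]
    return exc_type.__name__, str(exc_value)


def format_current_traceback():
    """Format the active traceback, then drop the local reference to it."""
    tb = sys.exc_info()[2]
    try:
        return traceback.format_tb(tb)
    finally:
        del tb  # follows the try ... finally advice given above


try:
    {}["missing"]
except KeyError:
    # exc_info() looks up the calling frames until it finds this handler.
    summary = summarize_current_exception()
    trace_lines = format_current_traceback()
```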
exc_clear()~
This function clears all information relating to the current or last exception
that occurred in the current thread. After calling this function,
exc_info will return three ``None`` values until another exception is
raised in the current thread or the execution stack returns to a frame where
another exception is being handled.
This function is needed in only a few obscure situations. These include
logging and error handling systems that report information on the last or
current exception. This function can also be used to try to free resources and
trigger object finalization, though no guarantee is made as to what objects will
be freed, if any.
.. versionadded:: 2.3
exc_type~
exc_value
exc_traceback
.. deprecated:: 1.5
Use exc_info instead.
Since they are global variables, they are not specific to the current thread, so
their use is not safe in a multi-threaded program. When no exception is being
handled, ``exc_type`` is set to ``None`` and the other two are undefined.
exec_prefix~
A string giving the site-specific directory prefix where the platform-dependent
Python files are installed; by default, this is also ``'/usr/local'``. This can
be set at build time with the --exec-prefix argument to the
configure script. Specifically, all configuration files (e.g. the
pyconfig.h header file) are installed in the directory ``exec_prefix +
'/lib/python{version}/config'``, and shared library modules are installed in
``exec_prefix + '/lib/python{version}/lib-dynload'``, where {version} is equal to
``version[:3]``.
executable~
A string giving the name of the executable binary for the Python interpreter, on
systems where this makes sense.
exit([arg])~
Exit from Python. This is implemented by raising the SystemExit
exception, so cleanup actions specified by finally clauses of try
statements are honored, and it is possible to intercept the exit attempt at an
outer level. The optional argument {arg} can be an integer giving the exit
status (defaulting to zero), or another type of object. If it is an integer,
zero is considered "successful termination" and any nonzero value is considered
"abnormal termination" by shells and the like. Most systems require it to be in
the range 0-127, and produce undefined results otherwise. Some systems have a
convention for assigning specific meanings to specific exit codes, but these are
generally underdeveloped; Unix programs generally use 2 for command line syntax
errors and 1 for all other kinds of errors. If another type of object is passed,
``None`` is equivalent to passing zero, and any other object is printed to
``sys.stderr`` and results in an exit code of 1. In particular,
``sys.exit("some error message")`` is a quick way to exit a program when an
error occurs.
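Because exit works by raising SystemExit, finally clauses still run and an
outer level can intercept the exit. A minimal sketch, with illustrative names:

```python
import sys

cleanup_ran = []


def main():
    try:
        sys.exit(2)  # raises SystemExit(2) rather than stopping immediately
    finally:
        cleanup_ran.append(True)  # cleanup is honored on the way out


try:
    main()
except SystemExit as exc:
    # Intercept the exit attempt; the requested status is in exc.code.
    status = exc.code
```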
exitfunc~
This value is not actually defined by the module, but can be set by the user (or
by a program) to specify a clean-up action at program exit. When set, it should
be a parameterless function. This function will be called when the interpreter
exits. Only one function may be installed in this way; to allow multiple
functions which will be called at termination, use the atexit (|py2stdlib-atexit|) module.
.. note:: >
The exit function is not called when the program is killed by a signal, when a
Python fatal internal error is detected, or when ``os._exit()`` is called.
<
.. deprecated:: 2.4
Use atexit (|py2stdlib-atexit|) instead.
flags~
The struct sequence {flags} exposes the status of command line flags. The
attributes are read only.
+------------------------------+------------------------------------------+
| attribute                    | flag                                     |
+==============================+==========================================+
| debug                        | -d                                       |
+------------------------------+------------------------------------------+
| py3k_warning                 | -3                                       |
+------------------------------+------------------------------------------+
| division_warning             | -Q                                       |
+------------------------------+------------------------------------------+
| division_new                 | -Qnew                                    |
+------------------------------+------------------------------------------+
| inspect                      | -i                                       |
+------------------------------+------------------------------------------+
| interactive                  | -i                                       |
+------------------------------+------------------------------------------+
| optimize                     | -O or -OO                                |
+------------------------------+------------------------------------------+
| dont_write_bytecode          | -B                                       |
+------------------------------+------------------------------------------+
| no_user_site                 | -s                                       |
+------------------------------+------------------------------------------+
| no_site                      | -S                                       |
+------------------------------+------------------------------------------+
| ignore_environment           | -E                                       |
+------------------------------+------------------------------------------+
| tabcheck                     | -t or -tt                                |
+------------------------------+------------------------------------------+
| verbose                      | -v                                       |
+------------------------------+------------------------------------------+
| unicode                      | -U                                       |
+------------------------------+------------------------------------------+
| bytes_warning                | -b                                       |
+------------------------------+------------------------------------------+
.. versionadded:: 2.6
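A small sketch of reading the flags. Only a handful of attributes from the
table are used, and the variable names are assumptions; the assignment attempt
simply confirms that the attributes are read only.

```python
import sys

names = ("debug", "inspect", "interactive", "optimize", "verbose")
settings = dict((name, getattr(sys.flags, name)) for name in names)

try:
    sys.flags.optimize = 2  # the attributes cannot be assigned
except (AttributeError, TypeError):
    read_only = True
else:
    read_only = False
```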
float_info~
A structseq holding information about the float type. It contains low level
information about the precision and internal representation. The values
correspond to the various floating-point constants defined in the standard
header file float.h for the 'C' programming language; see section
5.2.4.2.2 of the 1999 ISO/IEC C standard [C99]_, 'Characteristics of
floating types', for details.
+---------------------+----------------+--------------------------------------------------+
| attribute           | float.h macro  | explanation                                      |
+=====================+================+==================================================+
| epsilon             | DBL_EPSILON    | difference between 1 and the least value greater |
|                     |                | than 1 that is representable as a float          |
+---------------------+----------------+--------------------------------------------------+
| dig                 | DBL_DIG        | maximum number of decimal digits that can be     |
|                     |                | faithfully represented in a float; see below     |
+---------------------+----------------+--------------------------------------------------+
| mant_dig            | DBL_MANT_DIG   | float precision: the number of base-``radix``    |
|                     |                | digits in the significand of a float             |
+---------------------+----------------+--------------------------------------------------+
| max                 | DBL_MAX        | maximum representable finite float               |
+---------------------+----------------+--------------------------------------------------+
| max_exp             | DBL_MAX_EXP    | maximum integer e such that ``radix**(e-1)`` is  |
|                     |                | a representable finite float                     |
+---------------------+----------------+--------------------------------------------------+
| max_10_exp          | DBL_MAX_10_EXP | maximum integer e such that ``10**e`` is in the  |
|                     |                | range of representable finite floats             |
+---------------------+----------------+--------------------------------------------------+
| min                 | DBL_MIN        | minimum positive normalized float                |
+---------------------+----------------+--------------------------------------------------+
| min_exp             | DBL_MIN_EXP    | minimum integer e such that ``radix**(e-1)`` is  |
|                     |                | a normalized float                               |
+---------------------+----------------+--------------------------------------------------+
| min_10_exp          | DBL_MIN_10_EXP | minimum integer e such that ``10**e`` is a       |
|                     |                | normalized float                                 |
+---------------------+----------------+--------------------------------------------------+
| radix               | FLT_RADIX      | radix of exponent representation                 |
+---------------------+----------------+--------------------------------------------------+
| rounds              | FLT_ROUNDS     | constant representing rounding mode              |
|                     |                | used for arithmetic operations                   |
+---------------------+----------------+--------------------------------------------------+
The attribute sys.float_info.dig needs further explanation. If
``s`` is any string representing a decimal number with at most
sys.float_info.dig significant digits, then converting ``s`` to a
float and back again will recover a string representing the same decimal
value:: >
>>> import sys
>>> sys.float_info.dig
15
>>> s = '3.14159265358979' # decimal string with 15 significant digits
>>> format(float(s), '.15g') # convert to float and back -> same value
'3.14159265358979'
<
But for strings with more than sys.float_info.dig significant digits,
this isn't always true:: >
>>> s = '9876543211234567' # 16 significant digits is too many!
>>> format(float(s), '.16g') # conversion changes value
'9876543211234568'
<
.. versionadded:: 2.6
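Some of the other attributes can be checked against their definitions in the
same spirit. This is a sketch with illustrative variable names; it relies only
on the relationships stated in the table.

```python
import sys

info = sys.float_info

# epsilon: adding it to 1.0 gives a float that is strictly greater than 1.0.
one_plus_eps = 1.0 + info.epsilon

# max_exp / min_exp: radix**(e-1) is the largest finite power and the
# smallest normalized float, respectively.
largest_power = float(info.radix) ** (info.max_exp - 1)
smallest_normal = float(info.radix) ** (info.min_exp - 1)

try:
    float(info.radix) ** info.max_exp  # one power further is out of range
except OverflowError:
    next_power_overflows = True
else:
    next_power_overflows = False
```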
float_repr_style~
A string indicating how the repr (|py2stdlib-repr|) function behaves for
floats. If the string has value ``'short'`` then for a finite
float ``x``, ``repr(x)`` aims to produce a short string with the
property that ``float(repr(x)) == x``. This is the usual behaviour
in Python 2.7 and later. Otherwise, ``float_repr_style`` has value
``'legacy'`` and ``repr(x)`` behaves in the same way as it did in
versions of Python prior to 2.7.
.. versionadded:: 2.7
getcheckinterval()~
Return the interpreter's "check interval"; see setcheckinterval.
.. versionadded:: 2.3
getdefaultencoding()~
Return the name of the current default string encoding used by the Unicode
implementation.
.. versionadded:: 2.0
getdlopenflags()~
Return the current value of the flags that are used for dlopen calls.
The flag constants are defined in the dl (|py2stdlib-dl|) and DLFCN modules.
Availability: Unix.
.. versionadded:: 2.2
getfilesystemencoding()~
Return the name of the encoding used to convert Unicode filenames into system
file names, or ``None`` if the system default encoding is used. The result value
depends on the operating system:
* On Mac OS X, the encoding is ``'utf-8'``.
* On Unix, the encoding is the user's preference according to the result of
nl_langinfo(CODESET), or ``None`` if the ``nl_langinfo(CODESET)`` call
failed.
* On Windows NT+, file names are Unicode natively, so no conversion is
performed. getfilesystemencoding still returns ``'mbcs'``, as
this is the encoding that applications should use when they explicitly
want to convert Unicode strings to byte strings that are equivalent when
used as file names.
* On Windows 9x, the encoding is ``'mbcs'``.
.. versionadded:: 2.3
getrefcount(object)~
Return the reference count of the {object}. The count returned is generally one
higher than you might expect, because it includes the (temporary) reference as
an argument to getrefcount.
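The following is a minimal sketch of how the count reacts to additional references. It assumes CPython's reference counting; the names `obj` and `holders` are only illustrative, and differences are compared rather than absolute values, since the absolute count already includes the temporary argument reference.

```python
import sys

obj = object()
baseline = sys.getrefcount(obj)

# Storing the object in a list twice adds two references to it.
holders = [obj, obj]
added = sys.getrefcount(obj) - baseline

# Deleting the list releases those references again.
del holders
after_release = sys.getrefcount(obj)
```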
getrecursionlimit()~
Return the current value of the recursion limit, the maximum depth of the Python
interpreter stack. This limit prevents infinite recursion from causing an
overflow of the C stack and crashing Python. It can be set by
setrecursionlimit.
getsizeof(object[, default])~
Return the size of an object in bytes. The object can be any type of
object. All built-in objects will return correct results, but this
does not have to hold true for third-party extensions as it is implementation
specific.
If given, {default} will be returned if the object does not provide means to
retrieve the size. Otherwise a TypeError will be raised.
getsizeof calls the object's ``__sizeof__`` method and adds an
additional garbage collector overhead if the object is managed by the garbage
collector.
.. versionadded:: 2.6
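As a rough illustration (assuming CPython), the reported size covers only the object itself, not the objects it refers to. The helper `shallow_plus_items` below is hypothetical and not part of the sys module; it merely adds one level of contents.

```python
import sys

payload = "x" * 10000
container = [payload]

# getsizeof reports the list object itself, not the string it refers to.
shallow = sys.getsizeof(container)
referenced = sys.getsizeof(payload)

def shallow_plus_items(seq):
    """Hypothetical helper: size of a sequence plus one level of items."""
    return sys.getsizeof(seq) + sum(sys.getsizeof(item) for item in seq)

total = shallow_plus_items(container)

# The optional default is returned only when no size can be determined.
with_default = sys.getsizeof(container, -1)
```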
_getframe([depth])~
Return a frame object from the call stack. If optional integer {depth} is
given, return the frame object that many calls below the top of the stack. If
that is deeper than the call stack, ValueError is raised. The default
for {depth} is zero, returning the frame at the top of the call stack.
.. impl-detail:: >
This function should be used for internal and specialized purposes only.
It is not guaranteed to exist in all implementations of Python.
<
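A small sketch of the {depth} argument, assuming an implementation that provides _getframe (CPython does). The function names `caller_name` and `outer` are made up for the example.

```python
import sys

def caller_name():
    # Depth 0 is this function's own frame; depth 1 is whoever called it.
    return sys._getframe(1).f_code.co_name

def outer():
    return caller_name()

name = outer()

# Asking for a frame deeper than the call stack raises ValueError.
too_deep = False
try:
    sys._getframe(100000)
except ValueError:
    too_deep = True
```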
getprofile()~
Get the profiler function as set by setprofile.
.. versionadded:: 2.6
gettrace()~
Get the trace function as set by settrace.
.. impl-detail:: >
The gettrace function is intended only for implementing debuggers,
profilers, coverage tools and the like. Its behavior is part of the
implementation platform, rather than part of the language definition, and
thus may not be available in all Python implementations.
<
.. versionadded:: 2.6
getwindowsversion()~
Return a named tuple describing the Windows version
currently running. The named elements are {major}, {minor},
{build}, {platform}, {service_pack}, {service_pack_minor},
{service_pack_major}, {suite_mask}, and {product_type}.
{service_pack} contains a string while all other values are
integers. The components can also be accessed by name, so
``sys.getwindowsversion()[0]`` is equivalent to
``sys.getwindowsversion().major``. For compatibility with prior
versions, only the first 5 elements are retrievable by indexing.
{platform} may be one of the following values:
+-----------------------------------------+-------------------------+
| Constant                                | Platform                |
+=========================================+=========================+
| 0 (VER_PLATFORM_WIN32s)                 | Win32s on Windows 3.1   |
+-----------------------------------------+-------------------------+
| 1 (VER_PLATFORM_WIN32_WINDOWS)          | Windows 95/98/ME        |
+-----------------------------------------+-------------------------+
| 2 (VER_PLATFORM_WIN32_NT)               | Windows NT/2000/XP/x64  |
+-----------------------------------------+-------------------------+
| 3 (VER_PLATFORM_WIN32_CE)               | Windows CE              |
+-----------------------------------------+-------------------------+
{product_type} may be one of the following values:
+---------------------------------------+---------------------------------+
| Constant                              | Meaning                         |
+=======================================+=================================+
| 1 (VER_NT_WORKSTATION)                | The system is a workstation.    |
+---------------------------------------+---------------------------------+
| 2 (VER_NT_DOMAIN_CONTROLLER)          | The system is a domain          |
|                                       | controller.                     |
+---------------------------------------+---------------------------------+
| 3 (VER_NT_SERVER)                     | The system is a server, but not |
|                                       | a domain controller.            |
+---------------------------------------+---------------------------------+
This function wraps the Win32 GetVersionEx function; see the
Microsoft documentation on OSVERSIONINFOEX for more information
about these fields.
Availability: Windows.
.. versionadded:: 2.3
.. versionchanged:: 2.7
Changed to a named tuple and added {service_pack_minor},
{service_pack_major}, {suite_mask}, and {product_type}.
hexversion~
The version number encoded as a single integer. This is guaranteed to increase
with each version, including proper support for non-production releases. For
example, to test that the Python interpreter is at least version 1.5.2, use:: >
import sys

if sys.hexversion >= 0x010502F0:
    # use some advanced feature
    ...
else:
    # use an alternative implementation or warn the user
    ...
<
This is called ``hexversion`` since it only really looks meaningful when viewed
as the result of passing it to the built-in hex function. The
``version_info`` value may be used for a more human-friendly encoding of the
same information.
.. versionadded:: 1.5.2
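To make the encoding concrete, here is a sketch that splits such an integer into its fields. It assumes the bit layout CPython documents for ``PY_VERSION_HEX`` (one byte each for major, minor and micro, then four bits each for release level and serial); `decode_hexversion` is a hypothetical helper, not a sys function.

```python
import sys

def decode_hexversion(value):
    """Split a hexversion-style integer into its five fields."""
    major = (value >> 24) & 0xFF
    minor = (value >> 16) & 0xFF
    micro = (value >> 8) & 0xFF
    level = (value >> 4) & 0xF   # 0xA alpha, 0xB beta, 0xC candidate, 0xF final
    serial = value & 0xF
    return major, minor, micro, level, serial

running = decode_hexversion(sys.hexversion)
example = decode_hexversion(0x010502F0)   # the value tested in the example above
```

Comparing ``sys.version_info`` against a tuple such as ``(2, 7)`` is usually the more readable way to express the same test.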
long_info~
A struct sequence that holds information about Python's
internal representation of integers. The attributes are read only.
+-------------------------+----------------------------------------------+
| attribute               | explanation                                  |
+=========================+==============================================+
| bits_per_digit          | number of bits held in each digit. Python    |
|                         | integers are stored internally in base       |
|                         | ``2**long_info.bits_per_digit``              |
+-------------------------+----------------------------------------------+
| sizeof_digit            | size in bytes of the C type used to          |
|                         | represent a digit                            |
+-------------------------+----------------------------------------------+
.. versionadded:: 2.7
last_type~
last_value
last_traceback
These three variables are not always defined; they are set when an exception is
not handled and the interpreter prints an error message and a stack traceback.
Their intended use is to allow an interactive user to import a debugger module
and engage in post-mortem debugging without having to re-execute the command
that caused the error. (Typical use is ``import pdb; pdb.pm()`` to enter the
post-mortem debugger; see the chapter on the Python debugger, pdb, for
more information.)
The meaning of the variables is the same as that of the return values from
exc_info above. (Since there is only one interactive thread,
thread-safety is not a concern for these variables, unlike for ``exc_type``
etc.)
maxint~
The largest positive integer supported by Python's regular integer type. This
is at least ``2**31-1``. The largest negative integer is ``-maxint-1`` --- the
asymmetry results from the use of 2's complement binary arithmetic.
maxsize~
The largest positive integer supported by the platform's Py_ssize_t type,
and thus the maximum size lists, strings, dicts, and many other containers
can have.
maxunicode~
An integer giving the largest supported code point for a Unicode character. The
value of this depends on the configuration option that specifies whether Unicode
characters are stored as UCS-2 or UCS-4.
meta_path~
A list of finder objects that have their find_module
methods called to see if one of the objects can find the module to be
imported. The find_module method is called at least with the
absolute name of the module being imported. If the module to be imported is
contained in a package, then the parent package's __path__ attribute
is passed in as a second argument. The method returns None if
the module cannot be found, else returns a loader.
sys.meta_path is searched before any implicit default finders or
sys.path.
See PEP 302 for the original specification.
modules~
This is a dictionary that maps module names to modules which have already been
loaded. This can be manipulated to force reloading of modules and other tricks.
Note that removing a module from this dictionary is {not} the same as calling
reload on the corresponding module object.
path~
A list of strings that specifies the search path for modules. Initialized from
the environment variable PYTHONPATH, plus an installation-dependent
default.
As initialized upon program startup, the first item of this list, ``path[0]``,
is the directory containing the script that was used to invoke the Python
interpreter. If the script directory is not available (e.g. if the interpreter
is invoked interactively or if the script is read from standard input),
``path[0]`` is the empty string, which directs Python to search modules in the
current directory first. Notice that the script directory is inserted {before}
the entries inserted as a result of PYTHONPATH.
A program is free to modify this list for its own purposes.
.. versionchanged:: 2.3
Unicode strings are no longer ignored.
.. seealso::
Module site (|py2stdlib-site|) This describes how to use .pth files to extend
sys.path.
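A minimal sketch of a program modifying the list for its own purposes. The module name `demo_plugin_mod` and the temporary directory are invented for the example; the entry is removed again afterwards so the change does not leak into the rest of the program.

```python
import os
import shutil
import sys
import tempfile

# Create a throwaway directory holding a one-line module (hypothetical name).
plugin_dir = tempfile.mkdtemp()
with open(os.path.join(plugin_dir, "demo_plugin_mod.py"), "w") as handle:
    handle.write("ANSWER = 42\n")

sys.path.insert(0, plugin_dir)      # searched before every other entry
try:
    import demo_plugin_mod
    answer = demo_plugin_mod.ANSWER
finally:
    sys.path.remove(plugin_dir)     # undo the modification
    shutil.rmtree(plugin_dir)
```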
path_hooks~
A list of callables that take a path argument to try to create a
finder for the path. If a finder can be created, it is to be
returned by the callable, else raise ImportError.
Originally specified in PEP 302.
path_importer_cache~
A dictionary acting as a cache for finder objects. The keys are
paths that have been passed to sys.path_hooks and the values are
the finders that are found. If a path is a valid file system path but no
explicit finder is found on sys.path_hooks then None is
stored to indicate that the implicit default finder should be used. If the path
is not an existing path then imp.NullImporter is set.
Originally specified in PEP 302.
platform~
This string contains a platform identifier that can be used to append
platform-specific components to sys.path, for instance.
For Unix systems, this is the lowercased OS name as returned by ``uname -s``
with the first part of the version as returned by ``uname -r`` appended,
e.g. ``'sunos5'`` or ``'linux2'``, {at the time when Python was built}.
For other systems, the values are:
================ ===========================
System           platform value
================ ===========================
Windows          ``'win32'``
Windows/Cygwin   ``'cygwin'``
Mac OS X         ``'darwin'``
OS/2             ``'os2'``
OS/2 EMX         ``'os2emx'``
RiscOS           ``'riscos'``
AtheOS           ``'atheos'``
================ ===========================
prefix~
A string giving the site-specific directory prefix where the platform
independent Python files are installed; by default, this is the string
``'/usr/local'``. This can be set at build time with the --prefix
argument to the configure script. The main collection of Python
library modules is installed in the directory ``prefix + '/lib/pythonversion'``
while the platform independent header files (all except pyconfig.h) are
stored in ``prefix + '/include/pythonversion'``, where {version} is equal to
``version[:3]``.
ps1~
ps2
Strings specifying the primary and secondary prompt of the interpreter. These
are only defined if the interpreter is in interactive mode. Their initial
values in this case are ``'>>> '`` and ``'... '``. If a non-string object is
assigned to either variable, its ``str()`` is re-evaluated each time the
interpreter prepares to read a new interactive command; this can be used to
implement a dynamic prompt.
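A dynamic prompt of this kind might look like the sketch below. The class `CountingPrompt` is hypothetical; the example only calls ``str()`` itself to show what the interpreter would display, since the prompts have no effect outside interactive mode.

```python
class CountingPrompt(object):
    """Hypothetical prompt object; str() is evaluated anew for each command."""

    def __init__(self):
        self.count = 0

    def __str__(self):
        self.count += 1
        return "[%d]>>> " % self.count

prompt = CountingPrompt()
first = str(prompt)     # what would be displayed for the first command
second = str(prompt)    # and for the one after it

# In an interactive session one would install it with:
#     import sys; sys.ps1 = CountingPrompt()
```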
py3kwarning~
Bool containing the status of the Python 3.0 warning flag. It's ``True``
when Python is started with the -3 option. (This should be considered
read-only; setting it to a different value doesn't have an effect on
Python 3.0 warnings.)
.. versionadded:: 2.6
dont_write_bytecode~
If this is true, Python won't try to write ``.pyc`` or ``.pyo`` files on the
import of source modules. This value is initially set to ``True`` or ``False``
depending on the ``-B`` command line option and the ``PYTHONDONTWRITEBYTECODE``
environment variable, but you can set it yourself to control bytecode file
generation.
.. versionadded:: 2.6
setcheckinterval(interval)~
Set the interpreter's "check interval". This integer value determines how often
the interpreter checks for periodic things such as thread switches and signal
handlers. The default is ``100``, meaning the check is performed every 100
Python virtual instructions. Setting it to a larger value may increase
performance for programs using threads. Setting it to a value ``<=`` 0 checks
every virtual instruction, maximizing responsiveness as well as overhead.
setdefaultencoding(name)~
Set the current default string encoding used by the Unicode implementation. If
{name} does not match any available encoding, LookupError is raised.
This function is only intended to be used by the site (|py2stdlib-site|) module
implementation and, where needed, by sitecustomize. Once used by the
site (|py2stdlib-site|) module, it is removed from the sys (|py2stdlib-sys|) module's namespace.
.. Note that site (|py2stdlib-site|) is not imported if the -S option is passed
to the interpreter, in which case this function will remain available.
.. versionadded:: 2.0
setdlopenflags(n)~
Set the flags used by the interpreter for dlopen calls, such as when
the interpreter loads extension modules. Among other things, this will enable a
lazy resolving of symbols when importing a module, if called as
``sys.setdlopenflags(0)``. To share symbols across extension modules, call as
``sys.setdlopenflags(dl.RTLD_NOW | dl.RTLD_GLOBAL)``. Symbolic names for the
flag values can be found either in the dl (|py2stdlib-dl|) module or in the DLFCN
module. If DLFCN is not available, it can be generated from
/usr/include/dlfcn.h using the h2py script. Availability:
Unix.
.. versionadded:: 2.2
setprofile(profilefunc)~
Set the system's profile function, which allows you to implement a Python source
code profiler in Python. See chapter profile (|py2stdlib-profile|) for more information on the
Python profiler. The system's profile function is called similarly to the
system's trace function (see settrace), but it isn't called for each
executed line of code (only on call and return, but the return event is reported
even when an exception has been set). The function is thread-specific, but
there is no way for the profiler to know about context switches between threads,
so it does not make sense to use this in the presence of multiple threads. Also,
its return value is not used, so it can simply return ``None``.
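As an illustration only, a profile function that counts calls to Python functions could be sketched like this. The names `count_calls`, `helper` and `work` are invented; the example is single-threaded, in line with the caveat above.

```python
import sys

call_counts = {}

def count_calls(frame, event, arg):
    # 'call' is reported for Python functions; C functions report 'c_call'.
    if event == "call":
        name = frame.f_code.co_name
        call_counts[name] = call_counts.get(name, 0) + 1
    # The return value of a profile function is not used.

def helper():
    return 1

def work():
    return helper() + helper()

sys.setprofile(count_calls)
try:
    work()
finally:
    sys.setprofile(None)    # always uninstall the profile function
```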
setrecursionlimit(limit)~
Set the maximum depth of the Python interpreter stack to {limit}. This limit
prevents infinite recursion from causing an overflow of the C stack and crashing
Python.
The highest possible limit is platform-dependent. A user may need to set the
limit higher when they have a program that requires deep recursion and a platform
that supports a higher limit. This should be done with care, because a too-high
limit can lead to a crash.
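One cautious pattern, sketched here under the assumption that the platform tolerates a modest increase, is to raise the limit only around the code that needs it and to restore the old value afterwards. The function `depth` and the figure of 1500 levels are arbitrary choices for the example.

```python
import sys

def depth(n):
    # Recurses n levels deep; each level occupies one interpreter frame.
    return 0 if n == 0 else 1 + depth(n - 1)

target_depth = 1500                  # arbitrary figure for the example
old_limit = sys.getrecursionlimit()
sys.setrecursionlimit(max(old_limit, target_depth + 200))
try:
    reached = depth(target_depth)
finally:
    sys.setrecursionlimit(old_limit)     # restore the previous limit
```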
settrace(tracefunc)~
Set the system's trace function, which allows you to implement a Python
source code debugger in Python. The function is thread-specific; for a
debugger to support multiple threads, it must be registered using
settrace for each thread being debugged.
Trace functions should have three arguments: {frame}, {event}, and
{arg}. {frame} is the current stack frame. {event} is a string: ``'call'``,
``'line'``, ``'return'``, ``'exception'``, ``'c_call'``, ``'c_return'``, or
``'c_exception'``. {arg} depends on the event type.
The trace function is invoked (with {event} set to ``'call'``) whenever a new
local scope is entered; it should return a reference to a local trace
function to be used for that scope, or ``None`` if the scope shouldn't be traced.
The local trace function should return a reference to itself (or to another
function for further tracing in that scope), or ``None`` to turn off tracing
in that scope.
The events have the following meaning:
``'call'``
A function is called (or some other code block entered). The
global trace function is called; {arg} is ``None``; the return value
specifies the local trace function.
``'line'``
The interpreter is about to execute a new line of code or re-execute the
condition of a loop. The local trace function is called; {arg} is
``None``; the return value specifies the new local trace function. See
Objects/lnotab_notes.txt for a detailed explanation of how this
works.
``'return'``
A function (or other code block) is about to return. The local trace
function is called; {arg} is the value that will be returned. The trace
function's return value is ignored.
``'exception'``
An exception has occurred. The local trace function is called; {arg} is a
tuple ``(exception, value, traceback)``; the return value specifies the
new local trace function.
``'c_call'``
A C function is about to be called. This may be an extension function or
a built-in. {arg} is the C function object.
``'c_return'``
A C function has returned. {arg} is ``None``.
``'c_exception'``
A C function has raised an exception. {arg} is ``None``.
Note that as an exception is propagated down the chain of callers, an
``'exception'`` event is generated at each level.
For more information on code and frame objects, refer to types (|py2stdlib-types|).
.. impl-detail:: >
The settrace function is intended only for implementing debuggers,
profilers, coverage tools and the like. Its behavior is part of the
implementation platform, rather than part of the language definition, and
thus may not be available in all Python implementations.
<
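The interplay between the global and the local trace function can be sketched as follows, assuming CPython. The names `global_trace`, `local_trace` and `sample` are made up; line numbers are recorded relative to the first line of the traced function.

```python
import sys

executed_lines = []

def local_trace(frame, event, arg):
    if event == "line":
        executed_lines.append(frame.f_lineno - frame.f_code.co_firstlineno)
    return local_trace      # keep tracing this scope

def global_trace(frame, event, arg):
    # Invoked with event 'call' for each new scope; trace only sample().
    if frame.f_code.co_name == "sample":
        return local_trace
    return None             # leave every other scope untraced

def sample(flag):
    total = 0
    if flag:
        total += 1
    return total

sys.settrace(global_trace)
try:
    sample(True)
finally:
    sys.settrace(None)      # switch tracing off again
```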
settscdump(on_flag)~
Activate dumping of VM measurements using the Pentium timestamp counter, if
{on_flag} is true. Deactivate these dumps if {on_flag} is off. The function is
available only if Python was compiled with --with-tsc. To understand
the output of this dump, read Python/ceval.c in the Python sources.
.. versionadded:: 2.4
.. impl-detail:: >
This function is intimately bound to CPython implementation details and
thus not likely to be implemented elsewhere.
<
stdin~
stdout
stderr
File objects corresponding to the interpreter's standard input, output and error
streams. ``stdin`` is used for all interpreter input except for scripts but
including calls to input and raw_input. ``stdout`` is used for
the output of print and expression statements and for the
prompts of input and raw_input. The interpreter's own prompts
and (almost all of) its error messages go to ``stderr``. ``stdout`` and
``stderr`` needn't be built-in file objects: any object is acceptable as long
as it has a write method that takes a string argument. (Changing these
objects doesn't affect the standard I/O streams of processes executed by
os.popen, os.system or the exec\* family of functions in
the os (|py2stdlib-os|) module.)
__stdin__~
__stdout__
__stderr__
These objects contain the original values of ``stdin``, ``stderr`` and
``stdout`` at the start of the program. They are used during finalization,
and could be useful to print to the actual standard stream no matter if the
``sys.std*`` object has been redirected.
They can also be used to restore the actual files to known working file objects
in case they have been overwritten with a broken object. However, the
preferred way to do this is to explicitly save the previous stream before
replacing it, and restore the saved object.
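The preferred save-and-restore approach might look like this sketch. `Collector` is a hypothetical stand-in that offers nothing but a write method, which is all that is required of a replacement for ``stdout``.

```python
import sys

class Collector(object):
    """Hypothetical stand-in for a file object; only write() is provided."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)

collector = Collector()
saved_stdout = sys.stdout       # save the current stream explicitly
sys.stdout = collector
try:
    print("captured line")
finally:
    sys.stdout = saved_stdout   # restore the saved object

captured = "".join(collector.chunks)
```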
tracebacklimit~
When this variable is set to an integer value, it determines the maximum number
of levels of traceback information printed when an unhandled exception occurs.
The default is ``1000``. When set to ``0`` or less, all traceback information
is suppressed and only the exception type and value are printed.
version~
A string containing the version number of the Python interpreter plus additional
information on the build number and compiler used. It has a value of the form
``'version (#build_number, build_date, build_time) [compiler]'``. The first
three characters are used to identify the version in the installation
directories (where appropriate on each platform). An example:: >
>>> import sys
>>> sys.version
'1.5.2 (#0 Apr 13 1999, 10:51:12) [MSC 32 bit (Intel)]'
<
api_version~
The C API version for this interpreter. Programmers may find this useful when
debugging version conflicts between Python and extension modules.
.. versionadded:: 2.3
version_info~
A tuple containing the five components of the version number: {major}, {minor},
{micro}, {releaselevel}, and {serial}. All values except {releaselevel} are
integers; the release level is ``'alpha'``, ``'beta'``, ``'candidate'``, or
``'final'``. The ``version_info`` value corresponding to the Python version 2.0
is ``(2, 0, 0, 'final', 0)``. The components can also be accessed by name,
so ``sys.version_info[0]`` is equivalent to ``sys.version_info.major``
and so on.
.. versionadded:: 2.0
.. versionchanged:: 2.7
Added named component attributes
warnoptions~
This is an implementation detail of the warnings framework; do not modify this
value. Refer to the warnings (|py2stdlib-warnings|) module for more information on the warnings
framework.
winver~
The version number used to form registry keys on Windows platforms. This is
stored as string resource 1000 in the Python DLL. The value is normally the
first three characters of version. It is provided in the sys (|py2stdlib-sys|)
module for informational purposes; modifying this value has no effect on the
registry keys used by Python. Availability: Windows.
.. rubric:: Citations
.. [C99] ISO/IEC 9899:1999. "Programming languages -- C." A public draft of this standard is available at http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf .
==============================================================================
*py2stdlib-sysconfig*
sysconfig~
:synopsis: Python's configuration information
.. versionadded:: 2.7
The sysconfig (|py2stdlib-sysconfig|) module provides access to Python's configuration
information like the list of installation paths and the configuration variables
relevant for the current platform.
Configuration variables
-----------------------
A Python distribution contains a Makefile and a pyconfig.h
header file that are necessary to build both the Python binary itself and
third-party C extensions compiled using distutils (|py2stdlib-distutils|).
sysconfig (|py2stdlib-sysconfig|) puts all variables found in these files in a dictionary that
can be accessed using get_config_vars or get_config_var.
Notice that on Windows, it's a much smaller set.
get_config_vars(\*args)~
With no arguments, return a dictionary of all configuration variables
relevant for the current platform.
With arguments, return a list of values that result from looking up each
argument in the configuration variable dictionary.
For each argument, if the value is not found, return ``None``.
get_config_var(name)~
Return the value of a single variable {name}. Equivalent to
``get_config_vars().get(name)``.
If {name} is not found, return ``None``.
Example of usage:: >
>>> import sysconfig
>>> sysconfig.get_config_var('Py_ENABLE_SHARED')
0
>>> sysconfig.get_config_var('LIBDIR')
'/usr/local/lib'
>>> sysconfig.get_config_vars('AR', 'CXX')
['ar', 'g++']
<
Installation paths
------------------
Python uses an installation scheme that differs depending on the platform and on
the installation options. These schemes are stored in sysconfig (|py2stdlib-sysconfig|) under
unique identifiers based on the value returned by os.name.
Every new component that is installed using distutils (|py2stdlib-distutils|) or a
Distutils-based system will follow the same scheme to copy its files to the right
places.
Python currently supports seven schemes:
- {posix_prefix}: scheme for Posix platforms like Linux or Mac OS X. This is
the default scheme used when Python or a component is installed.
- {posix_home}: scheme for Posix platforms used when a {home} option is used
upon installation. This scheme is used when a component is installed through
Distutils with a specific home prefix.
- {posix_user}: scheme for Posix platforms used when a component is installed
through Distutils and the {user} option is used. This scheme defines paths
located under the user home directory.
- {nt}: scheme for NT platforms like Windows.
- {nt_user}: scheme for NT platforms, when the {user} option is used.
- {os2}: scheme for OS/2 platforms.
- {os2_home}: scheme for OS/2 platforms, when the {user} option is used.
Each scheme is itself composed of a series of paths and each path has a unique
identifier. Python currently uses eight paths:
- {stdlib}: directory containing the standard Python library files that are not
platform-specific.
- {platstdlib}: directory containing the standard Python library files that are
platform-specific.
- {platlib}: directory for site-specific, platform-specific files.
- {purelib}: directory for site-specific, non-platform-specific files.
- {include}: directory for non-platform-specific header files.
- {platinclude}: directory for platform-specific header files.
- {scripts}: directory for script files.
- {data}: directory for data files.
sysconfig (|py2stdlib-sysconfig|) provides some functions to determine these paths.
get_scheme_names()~
Return a tuple containing all schemes currently supported in
sysconfig (|py2stdlib-sysconfig|).
get_path_names()~
Return a tuple containing all path names currently supported in
sysconfig (|py2stdlib-sysconfig|).
get_path(name, [scheme, [vars, [expand]]])~
Return an installation path corresponding to the path {name}, from the
install scheme named {scheme}.
{name} has to be a value from the list returned by get_path_names.
sysconfig (|py2stdlib-sysconfig|) stores installation paths corresponding to each path name,
for each platform, with variables to be expanded. For instance the {stdlib}
path for the {nt} scheme is: ``{base}/Lib``.
get_path will use the variables returned by get_config_vars
to expand the path. All variables have default values for each platform so
one may call this function and get the default value.
If {scheme} is provided, it must be a value from the list returned by
get_scheme_names. Otherwise, the default scheme for the current
platform is used.
If {vars} is provided, it must be a dictionary of variables that will update
the dictionary returned by get_config_vars.
If {expand} is set to ``False``, the path will not be expanded using the
variables.
If {name} is not found, a KeyError is raised.
get_paths([scheme, [vars, [expand]]])~
Return a dictionary containing all installation paths corresponding to an
installation scheme. See get_path for more information.
If {scheme} is not provided, will use the default scheme for the current
platform.
If {vars} is provided, it must be a dictionary of variables that will
update the dictionary used to expand the paths.
If {expand} is set to False, the paths will not be expanded.
If {scheme} is not an existing scheme, get_paths will raise a
KeyError.
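A short sketch of the path functions working together. The concrete directories depend on the installation, so none are shown; the scheme name ``'no_such_scheme'`` is deliberately invalid.

```python
import sysconfig

path_names = sysconfig.get_path_names()

# Expanded path for this installation, and the unexpanded template.
stdlib_dir = sysconfig.get_path("stdlib")
stdlib_template = sysconfig.get_path("stdlib", expand=False)

# get_paths() returns the same information for every path name at once.
all_paths = sysconfig.get_paths()

# An unknown scheme is rejected with KeyError.
unknown_scheme_rejected = False
try:
    sysconfig.get_paths("no_such_scheme")
except KeyError:
    unknown_scheme_rejected = True
```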
Other functions
---------------
get_python_version()~
Return the ``MAJOR.MINOR`` Python version number as a string. Similar to
``sys.version[:3]``.
get_platform()~
Return a string that identifies the current platform.
This is used mainly to distinguish platform-specific build directories and
platform-specific built distributions. Typically includes the OS name and
version and the architecture (as supplied by os.uname), although the
exact information included depends on the OS; e.g. for IRIX the architecture
isn't particularly important (IRIX only runs on SGI hardware), but for Linux
the kernel version isn't particularly important.
Examples of returned values:
- linux-i586
- linux-alpha (?)
- solaris-2.6-sun4u
- irix-5.3
- irix64-6.2
Windows will return one of:
- win-amd64 (64bit Windows on AMD64 (aka x86_64, Intel64, EM64T, etc))
- win-ia64 (64bit Windows on Itanium)
- win32 (all others - specifically, sys.platform is returned)
Mac OS X can return:
- macosx-10.6-ppc
- macosx-10.4-ppc64
- macosx-10.3-i386
- macosx-10.4-fat
For other non-POSIX platforms, currently just returns sys.platform.
is_python_build()~
Return ``True`` if the current Python installation was built from source.
parse_config_h(fp[, vars])~
Parse a config.h\-style file.
{fp} is a file-like object pointing to the config.h\-like file.
A dictionary containing name/value pairs is returned. If an optional
dictionary is passed in as the second argument, it is used instead of a new
dictionary, and updated with the values read in the file.
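For illustration, the function can be fed any file-like object. The fragment below is invented (the macro names do not come from a real pyconfig.h) and shows an existing dictionary being updated in place.

```python
import io
import sysconfig

# A made-up config.h fragment; the macro names are purely illustrative.
sample = io.StringIO(
    u"#define HAVE_EXAMPLE_FEATURE 1\n"
    u"#define EXAMPLE_NAME \"demo\"\n"
    u"/* #undef HAVE_MISSING_FEATURE */\n"
)

existing = {"ALREADY_KNOWN": 7}
result = sysconfig.parse_config_h(sample, existing)
```

Numeric values are converted to integers where possible, and commented-out ``#undef`` entries are recorded as ``0``.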
get_config_h_filename()~
Return the path of pyconfig.h.
==============================================================================
*py2stdlib-syslog*
syslog~
:platform: Unix
:synopsis: An interface to the Unix syslog library routines.
This module provides an interface to the Unix ``syslog`` library routines.
Refer to the Unix manual pages for a detailed description of the ``syslog``
facility.
This module wraps the system ``syslog`` family of routines. A pure Python
library that can speak to a syslog server is available in the
logging.handlers module as SysLogHandler.
The module defines the following functions:
syslog([priority,] message)~
Send the string {message} to the system logger. A trailing newline is added
if necessary. Each message is tagged with a priority composed of a
{facility} and a {level}. The optional {priority} argument, which defaults
to LOG_INFO, determines the message priority. If the facility is
not encoded in {priority} using logical-or (``LOG_INFO | LOG_USER``), the
value given in the openlog call is used.
If openlog has not been called prior to the call to syslog (|py2stdlib-syslog|),
``openlog()`` will be called with no arguments.
openlog([ident[, logoption[, facility]]])~
Logging options of subsequent syslog (|py2stdlib-syslog|) calls can be set by calling
openlog. syslog (|py2stdlib-syslog|) will call openlog with no arguments
if the log is not currently open.
The optional {ident} keyword argument is a string which is prepended to every
message, and defaults to ``sys.argv[0]`` with leading path components
stripped. The optional {logoption} keyword argument (default is 0) is a bit
field -- see below for possible values to combine. The optional {facility}
keyword argument (default is LOG_USER) sets the default facility for
messages which do not have a facility explicitly encoded.
.. versionchanged:: 3.2
In previous versions, keyword arguments were not allowed, and {ident} was
required. The default for {ident} was dependent on the system libraries,
and often was ``python`` instead of the name of the python program file.
closelog()~
Reset the syslog module values and call the system library ``closelog()``.
This causes the module to behave as it does when initially imported. For
example, openlog will be called on the first syslog (|py2stdlib-syslog|) call (if
openlog hasn't already been called), and {ident} and other
openlog parameters are reset to defaults.
setlogmask(maskpri)~
Set the priority mask to {maskpri} and return the previous mask value. Calls
to syslog (|py2stdlib-syslog|) with a priority level not set in {maskpri} are ignored.
The default is to log all priorities. The function ``LOG_MASK(pri)``
calculates the mask for the individual priority {pri}. The function
``LOG_UPTO(pri)`` calculates the mask for all priorities up to and including
{pri}.
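A sketch of temporarily restricting logging to warnings and more severe messages, assuming a Unix system. The message texts are arbitrary, and the previous mask is restored at the end.

```python
import syslog

# Mask that lets LOG_EMERG through LOG_WARNING pass and drops the rest.
warning_and_up = syslog.LOG_UPTO(syslog.LOG_WARNING)

previous = syslog.setlogmask(warning_and_up)
try:
    syslog.syslog(syslog.LOG_DEBUG, "dropped by the mask")
    syslog.syslog(syslog.LOG_ERR, "passed on to the system logger")
finally:
    # Restoring returns the mask that was in effect, i.e. the one set above.
    replaced = syslog.setlogmask(previous)
```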
The module defines the following constants:
Priority levels (high to low):
LOG_EMERG, LOG_ALERT, LOG_CRIT, LOG_ERR,
LOG_WARNING, LOG_NOTICE, LOG_INFO,
LOG_DEBUG.
Facilities:
LOG_KERN, LOG_USER, LOG_MAIL, LOG_DAEMON,
LOG_AUTH, LOG_LPR, LOG_NEWS, LOG_UUCP,
LOG_CRON and LOG_LOCAL0 to LOG_LOCAL7.
Log options:
LOG_PID, LOG_CONS, LOG_NDELAY, LOG_NOWAIT
and LOG_PERROR if defined in ``<syslog.h>``.
Examples
--------
Simple example
~~~~~~~~~~~~~~
A simple set of examples:: >
import syslog

syslog.syslog('Processing started')
error = False  # placeholder: set to True when processing fails
if error:
    syslog.syslog(syslog.LOG_ERR, 'Processing failed')
<
An example of setting some log options. These include the process ID in
logged messages and write the messages to the destination facility used for
mail logging:: >
syslog.openlog(logoption=syslog.LOG_PID, facility=syslog.LOG_MAIL)
syslog.syslog('E-mail processing initiated...')
<
==============================================================================
*py2stdlib-tabnanny*
tabnanny~
:synopsis: Tool for detecting white space related problems in Python source files in a
directory tree.
For the time being this module is intended to be called as a script. However it
is possible to import it into an IDE and use the function check
described below.
.. note::
The API provided by this module is likely to change in future releases; such
changes may not be backward compatible.
check(file_or_dir)~
If {file_or_dir} is a directory and not a symbolic link, then recursively
descend the directory tree named by {file_or_dir}, checking all .py
files along the way. If {file_or_dir} is an ordinary Python source file, it is
checked for whitespace related problems. The diagnostic messages are written to
standard output using the print statement.
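A minimal sketch of calling check from another program, assuming a throwaway file created only for this example. The file is consistently indented, so with the default verbosity nothing should be printed; a file with ambiguous indentation would instead produce a diagnostic on standard output.

```python
import os
import tabnanny
import tempfile

# Write a small, consistently indented module to a temporary file.
handle, path = tempfile.mkstemp(suffix=".py")
os.write(handle, b"def answer():\n    return 42\n")
os.close(handle)

try:
    # check() reports problems by printing; it does not return a status.
    result = tabnanny.check(path)
finally:
    os.remove(path)
```

Because check only prints, a caller that needs a machine-readable result would have to capture standard output itself.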
verbose~
Flag indicating whether to print verbose messages. This is incremented by the
``-v`` option if called as a script.
filename_only~
Flag indicating whether to print only the filenames of files containing
whitespace related problems. This is set to true by the ``-q`` option if called
as a script.
NannyNag~
Raised by tokeneater if detecting an ambiguous indent. Captured and
handled in check.
tokeneater(type, token, start, end, line)~
This function is used by check as a callback parameter to the function
tokenize.tokenize.
.. seealso::
Module tokenize (|py2stdlib-tokenize|)
Lexical scanner for Python source code.
==============================================================================
*py2stdlib-tarfile*
tarfile~
:synopsis: Read and write tar-format archive files.
.. versionadded:: 2.3
The tarfile (|py2stdlib-tarfile|) module makes it possible to read and write tar
archives, including those using gzip or bz2 compression.
(.zip files can be read and written using the zipfile (|py2stdlib-zipfile|) module.)
Some facts and figures:
* reads and writes gzip (|py2stdlib-gzip|) and bz2 (|py2stdlib-bz2|) compressed archives.
* read/write support for the POSIX.1-1988 (ustar) format.
* read/write support for the GNU tar format including {longname} and {longlink}
extensions, read-only support for the {sparse} extension.
* read/write support for the POSIX.1-2001 (pax) format.
.. versionadded:: 2.6
* handles directories, regular files, hardlinks, symbolic links, fifos,
character devices and block devices and is able to acquire and restore file
information like timestamp, access permissions and owner.
open(name=None, mode='r', fileobj=None, bufsize=10240, **kwargs)~
Return a TarFile object for the pathname {name}. For detailed
information on TarFile objects and the keyword arguments that are
allowed, see tarfile-objects.
{mode} has to be a string of the form ``'filemode[:compression]'``; it defaults
to ``'r'``. Here is a full list of mode combinations:
+------------------+---------------------------------------------+
| mode             | action                                      |
+==================+=============================================+
| ``'r' or 'r:*'`` | Open for reading with transparent           |
|                  | compression (recommended).                  |
+------------------+---------------------------------------------+
| ``'r:'``         | Open for reading exclusively without        |
|                  | compression.                                |
+------------------+---------------------------------------------+
| ``'r:gz'``       | Open for reading with gzip compression.     |
+------------------+---------------------------------------------+
| ``'r:bz2'``      | Open for reading with bzip2 compression.    |
+------------------+---------------------------------------------+
| ``'a' or 'a:'``  | Open for appending with no compression. The |
|                  | file is created if it does not exist.       |
+------------------+---------------------------------------------+
| ``'w' or 'w:'``  | Open for uncompressed writing.              |
+------------------+---------------------------------------------+
| ``'w:gz'``       | Open for gzip compressed writing.           |
+------------------+---------------------------------------------+
| ``'w:bz2'``      | Open for bzip2 compressed writing.          |
+------------------+---------------------------------------------+
Note that ``'a:gz'`` or ``'a:bz2'`` is not possible. If {mode} is not suitable
to open a certain (compressed) file for reading, ReadError is raised. Use
{mode} ``'r'`` to avoid this. If a compression method is not supported,
CompressionError is raised.
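The mode rules above can be sketched as follows. The archive name, member name and payload are placeholders for this example, and the archive is written to a temporary directory that is removed afterwards.

```python
import io
import os
import shutil
import tarfile
import tempfile

workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "sample.tar.gz")
payload = b"hello tar\n"

try:
    # 'w:gz' creates a gzip compressed archive.
    tar = tarfile.open(archive, "w:gz")
    info = tarfile.TarInfo(name="hello.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
    tar.close()

    # The default mode 'r' detects the compression transparently.
    tar = tarfile.open(archive)
    names = tar.getnames()
    tar.close()

    # 'r:' insists on an uncompressed archive, so it rejects this file.
    try:
        tarfile.open(archive, "r:").close()
        rejected = False
    except tarfile.ReadError:
        rejected = True
finally:
    shutil.rmtree(workdir)
```

Unless the compression is known and must be enforced, plain ``'r'`` is the more forgiving choice for reading.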
If {fileobj} is specified, it is used as an alternative to a file object opened
for {name}. It is supposed to be at position 0.
For special purposes, there is a second format for {mode}:
``'filemode|[compression]'``. tarfile.open will return a TarFile
object that processes its data as a stream of blocks. No random seeking will
be done on the file. If given, {fileobj} may be any object that has a
read or write method (depending on the {mode}). {bufsize}
specifies the blocksize and defaults to ``20 * 512`` bytes. Use this variant
in combination with e.g. ``sys.stdin``, a socket file object or a tape
device. However, such a TarFile object is limited in that it does
not allow random access, see tar-examples. The currently possible modes
are:
+-------------+--------------------------------------------+
| Mode        | Action                                     |
+=============+============================================+
| ``'r|*'``   | Open a {stream} of tar blocks for reading  |
|             | with transparent compression.              |
+-------------+--------------------------------------------+
| ``'r|'``    | Open a {stream} of uncompressed tar blocks |
|             | for reading.                               |
+-------------+--------------------------------------------+
| ``'r|gz'``  | Open a gzip compressed {stream} for        |
|             | reading.                                   |
+-------------+--------------------------------------------+
| ``'r|bz2'`` | Open a bzip2 compressed {stream} for       |
|             | reading.                                   |
+-------------+--------------------------------------------+
| ``'w|'``    | Open an uncompressed {stream} for writing. |
+-------------+--------------------------------------------+
| ``'w|gz'``  | Open a gzip compressed {stream} for        |
|             | writing.                                   |
+-------------+--------------------------------------------+
| ``'w|bz2'`` | Open a bzip2 compressed {stream} for       |
|             | writing.                                   |
+-------------+--------------------------------------------+
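As a rough sketch of the stream variant, the example below writes an uncompressed stream into an in-memory buffer and reads it back front to back. The buffer stands in for a pipe, socket or tape device; the member name and payload are invented for the example.

```python
import io
import tarfile

payload = b"stream payload"

# Write an uncompressed stream of tar blocks into an in-memory buffer.
buf = io.BytesIO()
tar = tarfile.open(fileobj=buf, mode="w|")
info = tarfile.TarInfo(name="data.bin")
info.size = len(payload)
tar.addfile(info, io.BytesIO(payload))
tar.close()

# Read the stream back strictly in order; each member is consumed
# before moving on because the stream cannot seek backwards.
buf.seek(0)
seen = []
tar = tarfile.open(fileobj=buf, mode="r|")
for member in tar:
    seen.append((member.name, tar.extractfile(member).read()))
tar.close()
```

Methods that need to revisit earlier members are the ones likely to fail with StreamError on such an object, which is why the data is read inside the loop.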
TarFile~
Class for reading and writing tar archives. Do not use this class directly;
use tarfile.open instead. See tarfile-objects.
is_tarfile(name)~
Return True if {name} is a tar archive file that the tarfile (|py2stdlib-tarfile|)
module can read.
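A small sketch of is_tarfile, assuming two throwaway files created for the purpose: one genuine (if nearly empty) tar archive and one plain text file.

```python
import os
import shutil
import tarfile
import tempfile

workdir = tempfile.mkdtemp()
tar_path = os.path.join(workdir, "real.tar")
text_path = os.path.join(workdir, "notes.txt")

try:
    # A valid archive containing a single empty member.
    tar = tarfile.open(tar_path, "w")
    tar.addfile(tarfile.TarInfo(name="empty.txt"))
    tar.close()

    # A file that is not a tar archive at all.
    with open(text_path, "w") as handle:
        handle.write("just some text\n")

    checks = (tarfile.is_tarfile(tar_path), tarfile.is_tarfile(text_path))
finally:
    shutil.rmtree(workdir)
```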
TarFileCompat(filename, mode='r', compression=TAR_PLAIN)~
Class for limited access to tar archives with a zipfile (|py2stdlib-zipfile|)-like interface.
Please consult the documentation of the zipfile (|py2stdlib-zipfile|) module for more details.
{compression} must be one of the following constants:
TAR_PLAIN~
Constant for an uncompressed tar archive.
TAR_GZIPPED~
Constant for a gzip (|py2stdlib-gzip|) compressed tar archive.
.. deprecated:: 2.6
The TarFileCompat class has been deprecated for removal in Python 3.0.
TarError~
Base class for all tarfile (|py2stdlib-tarfile|) exceptions.
ReadError~
Is raised when a tar archive is opened that either cannot be handled by the
tarfile (|py2stdlib-tarfile|) module or is somehow invalid.
CompressionError~
Is raised when a compression method is not supported or when the data cannot be
decoded properly.
StreamError~
Is raised for the limitations that are typical for stream-like TarFile
objects.
ExtractError~
Is raised for {non-fatal} errors when using TarFile.extract, but only if
TarFile.errorlevel ``== 2``.
HeaderError~
Is raised by TarInfo.frombuf if the buffer it gets is invalid.
.. versionadded:: 2.6
Each of the following constants defines a tar archive format that the
tarfile (|py2stdlib-tarfile|) module is able to create. See section tar-formats for
details.
USTAR_FORMAT~
POSIX.1-1988 (ustar) format.
GNU_FORMAT~
GNU tar format.
PAX_FORMAT~
POSIX.1-2001 (pax) format.
DEFAULT_FORMAT~
The default format for creating archives. This is currently GNU_FORMAT.
The following variables are available on module level:
ENCODING~
The default character encoding i.e. the value from either
sys.getfilesystemencoding or sys.getdefaultencoding.
.. seealso::
Module zipfile (|py2stdlib-zipfile|)
Documentation of the zipfile (|py2stdlib-zipfile|) standard module.
`GNU tar manual, Basic Tar Format <http://www.gnu.org/software/tar/manual/html_node/Standard.html>`_
Documentation for tar archive files, including GNU tar extensions.
TarFile Objects
---------------
The TarFile object provides an interface to a tar archive. A tar
archive is a sequence of blocks. An archive member (a stored file) is made up of
a header block followed by data blocks. It is possible to store a file in a tar
archive several times. Each archive member is represented by a TarInfo
object, see tarinfo-objects for details.
A TarFile object can be used as a context manager in a with
statement. It will automatically be closed when the block is completed. Please
note that in the event of an exception an archive opened for writing will not
be finalized; only the internally used file object will be closed. See the
tar-examples section for a use case.
.. versionadded:: 2.7
Added support for the context manager protocol.
TarFile(name=None, mode='r', fileobj=None, format=DEFAULT_FORMAT, tarinfo=TarInfo, dereference=False, ignore_zeros=False, encoding=ENCODING, errors=None, pax_headers=None, debug=0, errorlevel=0)~
All following arguments are optional and can be accessed as instance attributes
as well.
{name} is the pathname of the archive. It can be omitted if {fileobj} is given.
In this case, the file object's name attribute is used if it exists.
{mode} is either ``'r'`` to read from an existing archive, ``'a'`` to append
data to an existing file or ``'w'`` to create a new file overwriting an existing
one.
If {fileobj} is given, it is used for reading or writing data. If it can be
determined, {mode} is overridden by {fileobj}'s mode. {fileobj} will be used
from position 0.
.. note:: >
{fileobj} is not closed when TarFile is closed.
<
{format} controls the archive format. It must be one of the constants
USTAR_FORMAT, GNU_FORMAT or PAX_FORMAT that are
defined at module level.
.. versionadded:: 2.6
The {tarinfo} argument can be used to replace the default TarInfo class
with a different one.
.. versionadded:: 2.6
If {dereference} is False, add symbolic and hard links to the archive. If it
is True, add the content of the target files to the archive. This has no
effect on systems that do not support symbolic links.
If {ignore_zeros} is False, treat an empty block as the end of the archive.
If it is True, skip empty (and invalid) blocks and try to get as many members
as possible. This is only useful for reading concatenated or damaged archives.
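The effect of {ignore_zeros} can be sketched with two small archives joined end to end. The helper single_member_archive is hypothetical and exists only for this example.

```python
import io
import tarfile


def single_member_archive(name, data):
    """Return the bytes of a tar archive holding one regular file."""
    buf = io.BytesIO()
    tar = tarfile.open(fileobj=buf, mode="w")
    info = tarfile.TarInfo(name=name)
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))
    tar.close()
    return buf.getvalue()


# Two complete archives, one after the other.
combined = (single_member_archive("a.txt", b"first") +
            single_member_archive("b.txt", b"second"))

# By default reading stops at the empty blocks ending the first archive.
tar = tarfile.open(fileobj=io.BytesIO(combined), mode="r:")
names_default = tar.getnames()
tar.close()

# With ignore_zeros=True the empty blocks are skipped.
tar = tarfile.open(fileobj=io.BytesIO(combined), mode="r:",
                   ignore_zeros=True)
names_lenient = tar.getnames()
tar.close()
```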
{debug} can be set from ``0`` (no debug messages) up to ``3`` (all debug
messages). The messages are written to ``sys.stderr``.
If {errorlevel} is ``0``, all errors are ignored when using TarFile.extract.
Nevertheless, they appear as error messages in the debug output, when debugging
is enabled. If ``1``, all {fatal} errors are raised as OSError or
IOError exceptions. If ``2``, all {non-fatal} errors are raised as
TarError exceptions as well.
The {encoding} and {errors} arguments control the way strings are converted to
unicode objects and vice versa. The default settings will work for most users.
See section tar-unicode for in-depth information.
.. versionadded:: 2.6
The {pax_headers} argument is an optional dictionary of unicode strings which
will be added as a pax global header if {format} is PAX_FORMAT.
.. versionadded:: 2.6
TarFile.open(...)~
Alternative constructor. The tarfile.open function is actually a
shortcut to this classmethod.
TarFile.getmember(name)~
Return a TarInfo object for member {name}. If {name} cannot be found
in the archive, KeyError is raised.
.. note:: >
If a member occurs more than once in the archive, its last occurrence is assumed
to be the most up-to-date version.
<
TarFile.getmembers()~
Return the members of the archive as a list of TarInfo objects. The
list has the same order as the members in the archive.
TarFile.getnames()~
Return the members as a list of their names. It has the same order as the list
returned by getmembers.
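A hedged sketch of the lookup methods above, using an in-memory archive in which the same name is deliberately stored twice. The member name and contents are placeholders.

```python
import io
import tarfile

# Store the same name twice to show which occurrence getmember() picks.
buf = io.BytesIO()
tar = tarfile.open(fileobj=buf, mode="w")
for data in (b"first version", b"second version"):
    info = tarfile.TarInfo(name="note.txt")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))
tar.close()

buf.seek(0)
tar = tarfile.open(fileobj=buf, mode="r")
names = tar.getnames()                      # archive order, duplicates kept
latest = tar.extractfile(tar.getmember("note.txt")).read()
try:
    tar.getmember("missing.txt")
    missing_raises = False
except KeyError:
    missing_raises = True
tar.close()
```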
TarFile.list(verbose=True)~
Print a table of contents to ``sys.stdout``. If {verbose} is False,
only the names of the members are printed. If it is True, output
similar to that of ls -l is produced.
TarFile.next()~
Return the next member of the archive as a TarInfo object, when
TarFile is opened for reading. Return None if there is no more
available.
TarFile.extractall(path=".", members=None)~
Extract all members from the archive to the current working directory or
directory {path}. If optional {members} is given, it must be a subset of the
list returned by getmembers. Directory information like owner,
modification time and permissions are set after all members have been extracted.
This is done to work around two problems: A directory's modification time is
reset each time a file is created in it. And, if a directory's permissions do
not allow writing, extracting files to it will fail.
.. warning:: >
Never extract archives from untrusted sources without prior inspection.
It is possible that files are created outside of {path}, e.g. members
that have absolute filenames starting with ``"/"`` or filenames with two
dots ``".."``.
<
.. versionadded:: 2.5
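One possible way to inspect an archive before extraction is a generator that drops suspicious member names. The helper safe_members below is hypothetical, not part of tarfile, and deliberately simple: it only looks at names, so it does not check link targets, device files or Windows drive prefixes.

```python
import tarfile


def safe_members(members):
    """Yield only members whose names stay below the extraction directory.

    Illustrative helper: absolute names and names containing a ".."
    component are skipped; everything else is passed through unchanged.
    """
    for info in members:
        parts = info.name.replace("\\", "/").split("/")
        if parts[0] == "" or ".." in parts:
            continue  # absolute path or parent-directory reference
        yield info


# Demonstrate the filter on hand-made TarInfo objects.
candidates = [tarfile.TarInfo(name=n)
              for n in ("docs/readme.txt", "/etc/passwd", "../escape.txt")]
accepted = [info.name for info in safe_members(candidates)]
```

Such a generator could be passed to extractall as ``members=safe_members(tar)``, in the same way as the py_files generator in the examples section.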
TarFile.extract(member, path="")~
Extract a member from the archive to the current working directory, using its
full name. Its file information is extracted as accurately as possible. {member}
may be a filename or a TarInfo object. You can specify a different
directory using {path}.
.. note:: >
The extract method does not take care of several extraction issues.
In most cases you should consider using the extractall method.
<
.. warning::
See the warning for extractall.
TarFile.extractfile(member)~
Extract a member from the archive as a file object. {member} may be a filename
or a TarInfo object. If {member} is a regular file, a file-like object
is returned. If {member} is a link, a file-like object is constructed from the
link's target. If {member} is none of the above, None is returned.
.. note:: >
The file-like object is read-only. It provides the methods
read, readline, readlines, seek, tell,
and close, and also supports iteration over its lines.
<
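The following sketch reads a member without writing it to disk and shows the None result for a directory. The archive lives in memory and its member names are invented for the example.

```python
import io
import tarfile

# Build an archive with one directory and one regular file.
buf = io.BytesIO()
tar = tarfile.open(fileobj=buf, mode="w")
folder = tarfile.TarInfo(name="logs")
folder.type = tarfile.DIRTYPE
tar.addfile(folder)
text = b"line one\nline two\n"
entry = tarfile.TarInfo(name="logs/app.log")
entry.size = len(text)
tar.addfile(entry, io.BytesIO(text))
tar.close()

buf.seek(0)
tar = tarfile.open(fileobj=buf, mode="r")
lines = tar.extractfile("logs/app.log").readlines()  # regular file
directory_result = tar.extractfile("logs")           # not a file or link
tar.close()
```

Callers that iterate over all members should therefore test the return value before reading from it.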
TarFile.add(name, arcname=None, recursive=True, exclude=None, filter=None)~
Add the file {name} to the archive. {name} may be any type of file (directory,
fifo, symbolic link, etc.). If given, {arcname} specifies an alternative name
for the file in the archive. Directories are added recursively by default. This
can be avoided by setting {recursive} to False. If {exclude} is given
it must be a function that takes one filename argument and returns a boolean
value. Depending on this value the respective file is either excluded
(True) or added (False). If {filter} is specified it must
be a function that takes a TarInfo object argument and returns the
changed TarInfo object. If it instead returns None the TarInfo
object will be excluded from the archive. See tar-examples for an
example.
.. versionchanged:: 2.6
Added the {exclude} parameter.
.. versionchanged:: 2.7
Added the {filter} parameter.
.. deprecated:: 2.7
The {exclude} parameter is deprecated, please use the {filter} parameter
instead.
TarFile.addfile(tarinfo, fileobj=None)~
Add the TarInfo object {tarinfo} to the archive. If {fileobj} is given,
``tarinfo.size`` bytes are read from it and added to the archive. You can
create TarInfo objects using gettarinfo.
.. note:: >
On Windows platforms, {fileobj} should always be opened with mode ``'rb'`` to
avoid irritation about the file size.
<
TarFile.gettarinfo(name=None, arcname=None, fileobj=None)~
Create a TarInfo object for either the file {name} or the file object
{fileobj} (using os.fstat on its file descriptor). You can modify some
of the TarInfo's attributes before you add it using addfile.
If given, {arcname} specifies an alternative name for the file in the archive.
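A minimal sketch combining gettarinfo and addfile: the TarInfo is created from a file on disk, adjusted, and only then written. File names, the archive name and the owner values are assumptions made for this example.

```python
import os
import shutil
import tarfile
import tempfile

workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "report.txt")
archive = os.path.join(workdir, "reports.tar")

try:
    with open(source, "wb") as handle:
        handle.write(b"quarterly numbers\n")

    tar = tarfile.open(archive, "w")
    info = tar.gettarinfo(source, arcname="2010/report.txt")
    # Adjust attributes before the header is written.
    info.uid = info.gid = 0
    info.uname = info.gname = "root"
    # Binary mode keeps the byte count in line with info.size.
    with open(source, "rb") as handle:
        tar.addfile(info, handle)
    tar.close()

    tar = tarfile.open(archive)
    stored = tar.getmember("2010/report.txt")
    summary = (stored.name, stored.size, stored.uname)
    tar.close()
finally:
    shutil.rmtree(workdir)
```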
TarFile.close()~
Close the TarFile. In write mode, two finishing zero blocks are
appended to the archive.
TarFile.posix~
Setting this to True is equivalent to setting the format
attribute to USTAR_FORMAT, False is equivalent to
GNU_FORMAT.
.. versionchanged:: 2.4
{posix} defaults to False.
.. deprecated:: 2.6
Use the format attribute instead.
TarFile.pax_headers~
A dictionary containing key-value pairs of pax global headers.
.. versionadded:: 2.6
TarInfo Objects
---------------
A TarInfo object represents one member in a TarFile. Aside
from storing all required attributes of a file (like file type, size, time,
permissions, owner etc.), it provides some useful methods to determine its type.
It does {not} contain the file's data itself.
TarInfo objects are returned by TarFile's methods
getmember, getmembers and gettarinfo.
TarInfo(name="")~
Create a TarInfo object.
TarInfo.frombuf(buf)~
Create and return a TarInfo object from string buffer {buf}.
.. versionadded:: 2.6
Raises HeaderError if the buffer is invalid.
TarInfo.fromtarfile(tarfile)~
Read the next member from the TarFile object {tarfile} and return it as
a TarInfo object.
.. versionadded:: 2.6
TarInfo.tobuf(format=DEFAULT_FORMAT, encoding=ENCODING, errors='strict')~
Create a string buffer from a TarInfo object. For information on the
arguments see the constructor of the TarFile class.
.. versionchanged:: 2.6
The arguments were added.
A ``TarInfo`` object has the following public data attributes:
TarInfo.name~
Name of the archive member.
TarInfo.size~
Size in bytes.
TarInfo.mtime~
Time of last modification.
TarInfo.mode~
Permission bits.
TarInfo.type~
File type. {type} is usually one of these constants: REGTYPE,
AREGTYPE, LNKTYPE, SYMTYPE, DIRTYPE,
FIFOTYPE, CONTTYPE, CHRTYPE, BLKTYPE,
GNUTYPE_SPARSE. To determine the type of a TarInfo object
more conveniently, use the ``is_*()`` methods below.
TarInfo.linkname~
Name of the target file, which is only present in TarInfo objects
of type LNKTYPE and SYMTYPE.
TarInfo.uid~
User ID of the user who originally stored this member.
TarInfo.gid~
Group ID of the user who originally stored this member.
TarInfo.uname~
User name.
TarInfo.gname~
Group name.
TarInfo.pax_headers~
A dictionary containing key-value pairs of an associated pax extended header.
.. versionadded:: 2.6
A TarInfo object also provides some convenient query methods:
TarInfo.isfile()~
Return True if the TarInfo object is a regular file.
TarInfo.isreg()~
Same as isfile.
TarInfo.isdir()~
Return True if it is a directory.
TarInfo.issym()~
Return True if it is a symbolic link.
TarInfo.islnk()~
Return True if it is a hard link.
TarInfo.ischr()~
Return True if it is a character device.
TarInfo.isblk()~
Return True if it is a block device.
TarInfo.isfifo()~
Return True if it is a FIFO.
TarInfo.isdev()~
Return True if it is one of character device, block device or FIFO.
Examples
--------
How to extract an entire tar archive to the current working directory:: >
import tarfile
tar = tarfile.open("sample.tar.gz")
tar.extractall()
tar.close()
<
How to extract a subset of a tar archive with TarFile.extractall using
a generator function instead of a list:: >
import os
import tarfile
def py_files(members):
    for tarinfo in members:
        if os.path.splitext(tarinfo.name)[1] == ".py":
            yield tarinfo
tar = tarfile.open("sample.tar.gz")
tar.extractall(members=py_files(tar))
tar.close()
<
How to create an uncompressed tar archive from a list of filenames:: >
import tarfile
tar = tarfile.open("sample.tar", "w")
for name in ["foo", "bar", "quux"]:
    tar.add(name)
tar.close()
<
The same example using the with statement:: >
import tarfile
with tarfile.open("sample.tar", "w") as tar:
    for name in ["foo", "bar", "quux"]:
        tar.add(name)
<
How to read a gzip compressed tar archive and display some member information:: >
import tarfile
tar = tarfile.open("sample.tar.gz", "r:gz")
for tarinfo in tar:
    print tarinfo.name, "is", tarinfo.size, "bytes in size and is",
    if tarinfo.isreg():
        print "a regular file."
    elif tarinfo.isdir():
        print "a directory."
    else:
        print "something else."
tar.close()
<
How to create an archive and reset the user information using the {filter}
parameter in TarFile.add:: >
import tarfile
def reset(tarinfo):
    tarinfo.uid = tarinfo.gid = 0
    tarinfo.uname = tarinfo.gname = "root"
    return tarinfo
tar = tarfile.open("sample.tar.gz", "w:gz")
tar.add("foo", filter=reset)
tar.close()
<
Supported tar formats
---------------------
There are three tar formats that can be created with the tarfile (|py2stdlib-tarfile|) module:
* The POSIX.1-1988 ustar format (USTAR_FORMAT). It supports filenames
up to a length of at best 256 characters and linknames up to 100 characters. The
maximum file size is 8 gigabytes. This is an old and limited but widely
supported format.
* The GNU tar format (GNU_FORMAT). It supports long filenames and
linknames, files bigger than 8 gigabytes and sparse files. It is the de facto
standard on GNU/Linux systems. tarfile (|py2stdlib-tarfile|) fully supports the GNU tar
extensions for long names, sparse file support is read-only.
* The POSIX.1-2001 pax format (PAX_FORMAT). It is the most flexible
format with virtually no limits. It supports long filenames and linknames, large
files and stores pathnames in a portable way. However, not all tar
implementations today are able to handle pax archives properly.
The {pax} format is an extension to the existing {ustar} format. It uses extra
headers for information that cannot be stored otherwise. There are two flavours
of pax headers: Extended headers only affect the subsequent file header, global
headers are valid for the complete archive and affect all following files. All
the data in a pax header is encoded in {UTF-8} for portability reasons.
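As a hedged sketch of the pax format, the example below stores a member whose name is far longer than the ustar limits together with one global header, then reads both back. The header key ``comment``, its value and the generated path are placeholders.

```python
import io
import tarfile

# A path of several hundred characters, beyond what ustar can store.
long_name = "/".join(["directory%02d" % i for i in range(30)]) + "/file.txt"
data = b"pax payload"

buf = io.BytesIO()
tar = tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT,
                   pax_headers={u"comment": u"nightly backup"})
info = tarfile.TarInfo(name=long_name)
info.size = len(data)
tar.addfile(info, io.BytesIO(data))   # the long name goes into an extended header
tar.close()

buf.seek(0)
tar = tarfile.open(fileobj=buf, mode="r")
stored_name = tar.getnames()[0]
global_comment = tar.pax_headers.get(u"comment")  # from the global header
tar.close()
```

The global header applies to the archive as a whole, whereas the extended header written for the long name only affects the member that follows it.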
There are some more variants of the tar format which can be read, but not
created:
* The ancient V7 format. This is the first tar format from Unix Seventh Edition,
storing only regular files and directories. Names must not be longer than 100
characters, there is no user/group name information. Some archives have
miscalculated header checksums in case of fields with non-ASCII characters.
* The SunOS tar extended format. This format is a variant of the POSIX.1-2001
pax format, but is not compatible.
Unicode issues
--------------
The tar format was originally conceived to make backups on tape drives with the
main focus on preserving file system information. Nowadays tar archives are
commonly used for file distribution and exchanging archives over networks. One
problem of the original format (that all other formats are merely variants of)
is that there is no concept of supporting different character encodings. For
example, an ordinary tar archive created on a {UTF-8} system cannot be read
correctly on a {Latin-1} system if it contains non-ASCII characters. Names (i.e.
filenames, linknames, user/group names) containing these characters will appear
damaged. Unfortunately, there is no way to autodetect the encoding of an
archive.
The pax format was designed to solve this problem. It stores non-ASCII names
using the universal character encoding {UTF-8}. When a pax archive is read,
these {UTF-8} names are converted to the encoding of the local file system.
The details of unicode conversion are controlled by the {encoding} and {errors}
keyword arguments of the TarFile class.
The default value for {encoding} is the local character encoding. It is deduced
from sys.getfilesystemencoding and sys.getdefaultencoding. In
read mode, {encoding} is used exclusively to convert unicode names from a pax
archive to strings in the local character encoding. In write mode, the use of
{encoding} depends on the chosen archive format. In case of PAX_FORMAT,
input names that contain non-ASCII characters need to be decoded before being
stored as {UTF-8} strings. The other formats do not make use of {encoding}
unless unicode objects are used as input names. These are converted to 8-bit
character strings before they are added to the archive.
The {errors} argument defines how characters are treated that cannot be
converted to or from {encoding}. Possible values are listed in section
codec-base-classes. In read mode, there is an additional scheme
``'utf-8'`` which means that bad characters are replaced by their {UTF-8}
representation. This is the default scheme. In write mode the default value for
{errors} is ``'strict'`` to ensure that name information is not altered
unnoticed.
==============================================================================
*py2stdlib-telnetlib*
telnetlib~
:synopsis: Telnet client class.
The telnetlib (|py2stdlib-telnetlib|) module provides a Telnet class that implements the
Telnet protocol. See RFC 854 for details about the protocol. In addition, it
provides symbolic constants for the protocol characters (see below), and for the
telnet options. The symbolic names of the telnet options follow the definitions
in ``arpa/telnet.h``, with the leading ``TELOPT_`` removed. For symbolic names
of options which are traditionally not included in ``arpa/telnet.h``, see the
module source itself.
The symbolic constants for the telnet commands are: IAC, DONT, DO, WONT, WILL,
SE (Subnegotiation End), NOP (No Operation), DM (Data Mark), BRK (Break), IP
(Interrupt process), AO (Abort output), AYT (Are You There), EC (Erase
Character), EL (Erase Line), GA (Go Ahead), SB (Subnegotiation Begin).
Telnet([host[, port[, timeout]]])~
Telnet represents a connection to a Telnet server. The instance is
initially not connected by default; the open method must be used to
establish a connection. Alternatively, the host name and optional port
number can be passed to the constructor, in which case the connection to
the server will be established before the constructor returns. The optional
{timeout} parameter specifies a timeout in seconds for blocking operations
like the connection attempt (if not specified, the global default timeout
setting will be used).
Do not reopen an already connected instance.
This class has many read_* methods. Note that some of them raise
EOFError when the end of the connection is read, because they can return
an empty string for other reasons. See the individual descriptions below.
.. versionchanged:: 2.6
{timeout} was added.
.. seealso::
RFC 854 - Telnet Protocol Specification
Definition of the Telnet protocol.
Telnet Objects
--------------
Telnet instances have the following methods:
Telnet.read_until(expected[, timeout])~
Read until a given string, {expected}, is encountered or until {timeout} seconds
have passed.
When no match is found, return whatever is available instead, possibly the empty
string. Raise EOFError if the connection is closed and no cooked data is
available.
Telnet.read_all()~
Read all data until EOF; block until connection closed.
Telnet.read_some()~
Read at least one byte of cooked data unless EOF is hit. Return ``''`` if EOF is
hit. Block if no data is immediately available.
Telnet.read_very_eager()~
Read everything that can be without blocking in I/O (eager).
Raise EOFError if connection closed and no cooked data available. Return
``''`` if no cooked data available otherwise. Do not block unless in the midst
of an IAC sequence.
Telnet.read_eager()~
Read readily available data.
Raise EOFError if connection closed and no cooked data available. Return
``''`` if no cooked data available otherwise. Do not block unless in the midst
of an IAC sequence.
Telnet.read_lazy()~
Process and return data already in the queues (lazy).
Raise EOFError if connection closed and no data available. Return ``''``
if no cooked data available otherwise. Do not block unless in the midst of an
IAC sequence.
Telnet.read_very_lazy()~
Return any data available in the cooked queue (very lazy).
Raise EOFError if connection closed and no data available. Return ``''``
if no cooked data available otherwise. This method never blocks.
Telnet.read_sb_data()~
Return the data collected between a SB/SE pair (suboption begin/end). The
callback should access these data when it was invoked with a ``SE`` command.
This method never blocks.
.. versionadded:: 2.3
Telnet.open(host[, port[, timeout]])~
Connect to a host. The optional second argument is the port number, which
defaults to the standard Telnet port (23). The optional {timeout} parameter
specifies a timeout in seconds for blocking operations like the connection
attempt (if not specified, the global default timeout setting will be used).
Do not try to reopen an already connected instance.
.. versionchanged:: 2.6
{timeout} was added.
Telnet.msg(msg[, *args])~
Print a debug message when the debug level is ``>`` 0. If extra arguments are
present, they are substituted in the message using the standard string
formatting operator.
Telnet.set_debuglevel(debuglevel)~
Set the debug level. The higher the value of {debuglevel}, the more debug
output you get (on ``sys.stdout``).
Telnet.close()~
Close the connection.
Telnet.get_socket()~
Return the socket object used internally.
Telnet.fileno()~
Return the file descriptor of the socket object used internally.
Telnet.write(buffer)~
Write a string to the socket, doubling any IAC characters. This can block if the
connection is blocked. May raise socket.error if the connection is
closed.
Telnet.interact()~
Interaction function, emulates a very dumb Telnet client.
Telnet.mt_interact()~
Multithreaded version of interact.
Telnet.expect(list[, timeout])~
Read until one of a list of regular expressions matches.
The first argument is a list of regular expressions, either compiled
(re.RegexObject instances) or uncompiled (strings). The optional second
argument is a timeout, in seconds; the default is to block indefinitely.
Return a tuple of three items: the index in the list of the first regular
expression that matches; the match object returned; and the text read up till
and including the match.
If end of file is found and no text was read, raise EOFError. Otherwise,
when nothing matches, return ``(-1, None, text)`` where {text} is the text
received so far (may be the empty string if a timeout happened).
If a regular expression ends with a greedy match (such as ``.*``) or if more
than one expression can match the same input, the results are non-deterministic
and may depend on the I/O timing.
Telnet.set_option_negotiation_callback(callback)~
Each time a telnet option is read on the input flow, this {callback} (if set) is
called with the following parameters: callback(telnet socket, command
(DO/DONT/WILL/WONT), option). No other action is taken afterwards by telnetlib.
Telnet Example
--------------
A simple example illustrating typical use:: >
import getpass
import sys
import telnetlib
HOST = "localhost"
user = raw_input("Enter your remote account: ")
password = getpass.getpass()
tn = telnetlib.Telnet(HOST)
tn.read_until("login: ")
tn.write(user + "\n")
if password:
    tn.read_until("Password: ")
    tn.write(password + "\n")
tn.write("ls\n")
tn.write("exit\n")
print tn.read_all()
==============================================================================
*py2stdlib-tempfile*
tempfile~
:synopsis: Generate temporary files and directories.
This module generates temporary files and directories. It works on all
supported platforms.
In version 2.3 of Python, this module was overhauled for enhanced security. It
now provides three new functions, NamedTemporaryFile, mkstemp,
and mkdtemp, which should eliminate all remaining need to use the
insecure mktemp function. Temporary file names created by this module
no longer contain the process ID; instead a string of six random characters is
used.
Also, all the user-callable functions now take additional arguments which
allow direct control over the location and name of temporary files. It is
no longer necessary to use the global {tempdir} and {template} variables.
To maintain backward compatibility, the argument order is somewhat odd; it
is recommended to use keyword arguments for clarity.
The module defines the following user-callable functions:
TemporaryFile([mode='w+b'[, bufsize=-1[, suffix=''[, prefix='tmp'[, dir=None]]]]])~
Return a file-like object that can be used as a temporary storage area.
The file is created using mkstemp. It will be destroyed as soon
as it is closed (including an implicit close when the object is garbage
collected). Under Unix, the directory entry for the file is removed
immediately after the file is created. Other platforms do not support
this; your code should not rely on a temporary file created using this
function having or not having a visible name in the file system.
The {mode} parameter defaults to ``'w+b'`` so that the file created can
be read and written without being closed. Binary mode is used so that it
behaves consistently on all platforms without regard for the data that is
stored. {bufsize} defaults to ``-1``, meaning that the operating system
default is used.
The {dir}, {prefix} and {suffix} parameters are passed to mkstemp.
The returned object is a true file object on POSIX platforms. On other
platforms, it is a file-like object whose file attribute is the
underlying true file object. This file-like object can be used in a
with statement, just like a normal file.
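As an illustration of the behaviour described above, the following minimal sketch writes to a temporary file and reads the data back inside a with statement. The variable names are chosen for the example only.

```python
import tempfile

# The default mode 'w+b' allows writing and reading without reopening.
with tempfile.TemporaryFile() as tmp:
    tmp.write(b"spam and eggs\n")
    tmp.seek(0)              # rewind before reading back
    data = tmp.read()

# Leaving the with block closes the file, which also destroys it.
```

Because the file may have no visible name, everything that needs its contents has to happen before the block ends.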
NamedTemporaryFile([mode='w+b'[, bufsize=-1[, suffix=''[, prefix='tmp'[, dir=None[, delete=True]]]]]])~
This function operates exactly as TemporaryFile does, except that
the file is guaranteed to have a visible name in the file system (on
Unix, the directory entry is not unlinked). That name can be retrieved
from the name member of the file object. Whether the name can be
used to open the file a second time, while the named temporary file is
still open, varies across platforms (it can be so used on Unix; it cannot
on Windows NT or later). If {delete} is true (the default), the file is
deleted as soon as it is closed.
The returned object is always a file-like object whose file
attribute is the underlying true file object. This file-like object can
be used in a with statement, just like a normal file.
.. versionadded:: 2.3
.. versionadded:: 2.6
The {delete} parameter.
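A short sketch of the default behaviour (delete left as true) may help. The suffix and the variable names are assumptions made for the example.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile(suffix=".txt") as tmp:
    path = tmp.name                            # visible name in the file system
    existed_while_open = os.path.exists(path)
    tmp.write(b"scratch data")
    tmp.flush()

# With delete=True (the default) the file is removed when it is closed.
exists_after_close = os.path.exists(path)
```

Passing ``delete=False`` instead would leave the file in place after the block, and the caller would then have to remove it.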
SpooledTemporaryFile([max_size=0[, mode='w+b'[, bufsize=-1[, suffix=''[, prefix='tmp'[, dir=None]]]]]])~
This function operates exactly as TemporaryFile does, except that
data is spooled in memory until the file size exceeds {max_size}, or
until the file's fileno method is called, at which point the
contents are written to disk and operation proceeds as with
TemporaryFile.
The resulting file has one additional method, rollover, which
causes the file to roll over to an on-disk file regardless of its size.
The returned object is a file-like object whose _file attribute
is either a StringIO (|py2stdlib-stringio|) object or a true file object, depending on
whether rollover has been called. This file-like object can be
used in a with statement, just like a normal file.
.. versionadded:: 2.6
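The following sketch shows a forced rollover. The 1024-byte limit and the payload are arbitrary values picked for the example.

```python
import tempfile

spool = tempfile.SpooledTemporaryFile(max_size=1024)
try:
    spool.write(b"x" * 100)   # below max_size, so still held in memory
    spool.rollover()          # switch to an on-disk file regardless of size
    spool.seek(0)
    contents = spool.read()   # data written before the rollover is kept
finally:
    spool.close()
```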
mkstemp([suffix=''[, prefix='tmp'[, dir=None[, text=False]]]])~
Creates a temporary file in the most secure manner possible. There are
no race conditions in the file's creation, assuming that the platform
properly implements the os.O_EXCL flag for os.open. The
file is readable and writable only by the creating user ID. If the
platform uses permission bits to indicate whether a file is executable,
the file is executable by no one. The file descriptor is not inherited
by child processes.
Unlike TemporaryFile, the user of mkstemp is responsible
for deleting the temporary file when done with it.
If {suffix} is specified, the file name will end with that suffix,
otherwise there will be no suffix. mkstemp does not put a dot
between the file name and the suffix; if you need one, put it at the
beginning of {suffix}.
If {prefix} is specified, the file name will begin with that prefix;
otherwise, a default prefix is used.
If {dir} is specified, the file will be created in that directory;
otherwise, a default directory is used. The default directory is chosen
from a platform-dependent list, but the user of the application can
control the directory location by setting the {TMPDIR}, {TEMP} or {TMP}
environment variables. There is thus no guarantee that the generated
filename will have any nice properties, such as not requiring quoting
when passed to external commands via ``os.popen()``.
If {text} is specified, it indicates whether to open the file in binary
mode (the default) or text mode. On some platforms, this makes no
difference.
mkstemp returns a tuple containing an OS-level handle to an open
file (as would be returned by os.open) and the absolute pathname
of that file, in that order.
.. versionadded:: 2.3
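A typical usage pattern, sketched under the assumption that the caller wants an ordinary file object, wraps the returned handle with os.fdopen and removes the file explicitly. The prefix and suffix values are illustrative.

```python
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".log", prefix="demo_")
try:
    # Wrap the OS-level handle; closing the file object closes fd as well.
    with os.fdopen(fd, "w") as handle:
        handle.write("first line\n")
    with open(path) as handle:
        content = handle.read()
finally:
    os.remove(path)           # mkstemp never deletes the file for you
```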
mkdtemp([suffix=''[, prefix='tmp'[, dir=None]]])~
Creates a temporary directory in the most secure manner possible. There
are no race conditions in the directory's creation. The directory is
readable, writable, and searchable only by the creating user ID.
The user of mkdtemp is responsible for deleting the temporary
directory and its contents when done with it.
The {prefix}, {suffix}, and {dir} arguments are the same as for
mkstemp.
mkdtemp returns the absolute pathname of the new directory.
.. versionadded:: 2.3
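Since cleanup is left to the caller, one common approach (an illustration, not the only option) is to pair mkdtemp with shutil.rmtree in a try ... finally statement. The prefix and file name are made up for the example.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp(prefix="build_")
try:
    target = os.path.join(workdir, "output.txt")
    with open(target, "w") as handle:
        handle.write("done\n")
    created = os.listdir(workdir)
finally:
    shutil.rmtree(workdir)    # removes the directory and its contents
```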
mktemp([suffix=''[, prefix='tmp'[, dir=None]]])~
.. deprecated:: 2.3
Use mkstemp instead.
Return an absolute pathname of a file that did not exist at the time the
call is made. The {prefix}, {suffix}, and {dir} arguments are the same
as for mkstemp.
.. warning:: >
Use of this function may introduce a security hole in your program. By
the time you get around to doing anything with the file name it returns,
someone else may have beaten you to the punch. mktemp usage can
be replaced easily with NamedTemporaryFile, passing it the
``delete=False`` parameter::
>>> f = NamedTemporaryFile(delete=False)
>>> f
<open file '<fdopen>', mode 'w+b' at 0x384698>
>>> f.name
'/var/folders/5q/5qTPn6xq2RaWqk+1Ytw3-U+++TI/-Tmp-/tmpG7V1Y0'
>>> f.write("Hello World!\n")
>>> f.close()
>>> os.unlink(f.name)
>>> os.path.exists(f.name)
False
<
The module uses two global variables that tell it how to construct a
temporary name. They are initialized at the first call to any of the
functions above. The caller may change them, but this is discouraged; use
the appropriate function arguments, instead.
tempdir~
When set to a value other than ``None``, this variable defines the
default value for the {dir} argument to all the functions defined in this
module.
If ``tempdir`` is unset or ``None`` at any call to any of the above
functions, Python searches a standard list of directories and sets
{tempdir} to the first one which the calling user can create files in.
The list is:
1. The directory named by the TMPDIR environment variable.
2. The directory named by the TEMP environment variable.
3. The directory named by the TMP environment variable.
4. A platform-specific location:
   * On RiscOS, the directory named by the Wimp$ScrapDir environment
     variable.
   * On Windows, the directories C:\TEMP, C:\TMP,
     \TEMP, and \TMP, in that order.
   * On all other platforms, the directories /tmp, /var/tmp, and
     /usr/tmp, in that order.
5. As a last resort, the current working directory.
gettempdir()~
Return the directory currently selected to create temporary files in. If
tempdir is not ``None``, this simply returns its contents; otherwise,
the search described above is performed, and the result returned.
.. versionadded:: 2.3
template~
.. deprecated:: 2.0
Use gettempprefix instead.
When set to a value other than ``None``, this variable defines the prefix of the
final component of the filenames returned by mktemp. A string of six
random letters and digits is appended to the prefix to make the filename unique.
The default prefix is tmp.
Older versions of this module used to require that ``template`` be set to
``None`` after a call to os.fork; this has not been necessary since
version 1.5.2.
gettempprefix()~
Return the filename prefix used to create temporary files. This does not
contain the directory component. Using this function is preferred over reading
the {template} variable directly.
.. versionadded:: 1.5.2
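The relationship between these two functions and the names produced by mkstemp can be checked with a small sketch. It assumes that tempdir has not been changed by the caller.

```python
import os
import tempfile

directory = tempfile.gettempdir()      # default for the dir argument
prefix = tempfile.gettempprefix()      # default file name prefix

fd, path = tempfile.mkstemp()          # no arguments: defaults are used
os.close(fd)
try:
    in_default_dir = os.path.dirname(path) == os.path.abspath(directory)
    has_default_prefix = os.path.basename(path).startswith(prefix)
finally:
    os.remove(path)
```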
==============================================================================
*py2stdlib-termios*
termios~
:platform: Unix
:synopsis: POSIX style tty control.
This module provides an interface to the POSIX calls for tty I/O control. For a
complete description of these calls, see the POSIX or Unix manual pages. It is
only available for those Unix versions that support POSIX {termios} style tty
I/O control (and then only if configured at installation time).
All functions in this module take a file descriptor {fd} as their first
argument. This can be an integer file descriptor, such as returned by
``sys.stdin.fileno()``, or a file object, such as ``sys.stdin`` itself.
This module also defines all the constants needed to work with the functions
provided here; these have the same name as their counterparts in C. Please
refer to your system documentation for more information on using these terminal
control interfaces.
The module defines the following functions:
tcgetattr(fd)~
Return a list containing the tty attributes for file descriptor {fd}, as
follows: ``[iflag, oflag, cflag, lflag, ispeed, ospeed, cc]`` where {cc} is a
list of the tty special characters (each a string of length 1, except the
items with indices VMIN and VTIME, which are integers when
these fields are defined). The interpretation of the flags and the speeds as
well as the indexing in the {cc} array must be done using the symbolic
constants defined in the termios (|py2stdlib-termios|) module.
tcsetattr(fd, when, attributes)~
Set the tty attributes for file descriptor {fd} from the {attributes}, which is
a list like the one returned by tcgetattr. The {when} argument
determines when the attributes are changed: TCSANOW to change
immediately, TCSADRAIN to change after transmitting all queued output,
or TCSAFLUSH to change after transmitting all queued output and
discarding all queued input.
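The list layout and the {when} argument can be illustrated with a small context manager. This is a sketch only: the name cbreak_mode is hypothetical and not part of the module, and the choice of flags is just one possible configuration.

```python
import contextlib
import termios


@contextlib.contextmanager
def cbreak_mode(fd):
    """Hypothetical helper: unbuffered, non-echoing input on fd."""
    saved = termios.tcgetattr(fd)
    mode = termios.tcgetattr(fd)       # a second call gives a separate list
    # Index 3 is lflag: turn off line buffering (ICANON) and echo (ECHO).
    mode[3] &= ~(termios.ICANON | termios.ECHO)
    # Index 6 is cc: VMIN and VTIME take integers.
    mode[6][termios.VMIN] = 1          # a read returns after one byte
    mode[6][termios.VTIME] = 0         # no read timeout
    try:
        termios.tcsetattr(fd, termios.TCSADRAIN, mode)
        yield
    finally:
        # TCSAFLUSH also discards input that was queued but never read.
        termios.tcsetattr(fd, termios.TCSAFLUSH, saved)
```

It would be used as ``with cbreak_mode(sys.stdin.fileno()): ...`` on a descriptor that refers to a terminal.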
tcsendbreak(fd, duration)~
Send a break on file descriptor {fd}. A zero {duration} sends a break for 0.25
to 0.5 seconds; a nonzero {duration} has a system-dependent meaning.
tcdrain(fd)~
Wait until all output written to file descriptor {fd} has been transmitted.
tcflush(fd, queue)~
Discard queued data on file descriptor {fd}. The {queue} selector specifies
which queue: TCIFLUSH for the input queue, TCOFLUSH for the
output queue, or TCIOFLUSH for both queues.
tcflow(fd, action)~
Suspend or resume input or output on file descriptor {fd}. The {action}
argument can be TCOOFF to suspend output, TCOON to restart
output, TCIOFF to suspend input, or TCION to restart input.
.. seealso::
Module tty (|py2stdlib-tty|)
Convenience functions for common terminal control operations.
Example
-------
Here's a function that prompts for a password with echoing turned off. Note the
technique using a separate tcgetattr call and a try ...
finally statement to ensure that the old tty attributes are restored
exactly no matter what happens:: >
def getpass(prompt="Password: "):
    import termios, sys
    fd = sys.stdin.fileno()
    old = termios.tcgetattr(fd)
    new = termios.tcgetattr(fd)
    new[3] = new[3] & ~termios.ECHO          # lflags
    try:
        termios.tcsetattr(fd, termios.TCSADRAIN, new)
        passwd = raw_input(prompt)
    finally:
        termios.tcsetattr(fd, termios.TCSADRAIN, old)
    return passwd
==============================================================================
*py2stdlib-test*
test~
:synopsis: Regression tests package containing the testing suite for Python.
The test (|py2stdlib-test|) package contains all regression tests for Python as well as the
modules test.test_support (|py2stdlib-test.test_support|) and test.regrtest.
test.test_support (|py2stdlib-test.test_support|) is used to enhance your tests while
test.regrtest drives the testing suite.
Each module in the test (|py2stdlib-test|) package whose name starts with ``test_`` is a
testing suite for a specific module or feature. All new tests should be written
using the unittest (|py2stdlib-unittest|) or doctest (|py2stdlib-doctest|) module. Some older tests are
written using a "traditional" testing style that compares output printed to
``sys.stdout``; this style of test is considered deprecated.
.. seealso::
Module unittest (|py2stdlib-unittest|)
Writing PyUnit regression tests.
Module doctest (|py2stdlib-doctest|)
Tests embedded in documentation strings.
Writing Unit Tests for the test (|py2stdlib-test|) package
----------------------------------------------
It is preferred that tests that use the unittest (|py2stdlib-unittest|) module follow a few
guidelines. One is to name the test module by starting it with ``test_`` and ending
it with the name of the module being tested. The test methods in the test module
should start with ``test_`` and end with a description of what the method is
testing. This is needed so that the methods are recognized by the test driver as
test methods. Also, no documentation string for the method should be included. A
comment (such as ``# Tests function returns only True or False``) should be used
to provide documentation for test methods. This is done because documentation
strings, when present, are printed in place of the test name, so the output does
not state which test is being run.
A basic boilerplate is often used:: >
import unittest
from test import test_support
class MyTestCase1(unittest.TestCase):
    # Only use setUp() and tearDown() if necessary
    def setUp(self):
        ... code to execute in preparation for tests ...
    def tearDown(self):
        ... code to execute to clean up after tests ...
    def test_feature_one(self):
        # Test feature one.
        ... testing code ...
    def test_feature_two(self):
        # Test feature two.
        ... testing code ...
    ... more test methods ...
class MyTestCase2(unittest.TestCase):
    ... same structure as MyTestCase1 ...
... more test classes ...
def test_main():
    test_support.run_unittest(MyTestCase1,
                              MyTestCase2,
                              ... list other tests ...
                             )
if __name__ == '__main__':
    test_main()
<
This boilerplate code allows the testing suite to be run by test.regrtest
as well as on its own as a script.
The goal for regression testing is to try to break code. This leads to a few
guidelines to be followed:
* The testing suite should exercise all classes, functions, and constants. This
includes not just the external API that is to be presented to the outside
world but also "private" code.
* Whitebox testing (examining the code being tested when the tests are being
written) is preferred. Blackbox testing (testing only the published user
interface) is not complete enough to make sure all boundary and edge cases
are tested.
* Make sure all possible values are tested including invalid ones. This makes
sure that not only all valid values are acceptable but also that improper
values are handled correctly.
* Exhaust as many code paths as possible. Test where branching occurs and thus
tailor input to make sure as many different paths through the code are taken.
* Add an explicit test for any bugs discovered for the tested code. This will
make sure that the error does not crop up again if the code is changed in the
future.
* Make sure to clean up after your tests (such as close and remove all temporary
files).
* If a test is dependent on a specific condition of the operating system then
verify the condition already exists before attempting the test.
* Import as few modules as possible and do it as soon as possible. This
minimizes external dependencies of tests and also minimizes possible anomalous
behavior from side-effects of importing a module.
* Try to maximize code reuse. On occasion, tests will vary by something as small
as what type of input is used. Minimize code duplication by subclassing a
basic test class with a class that specifies the input:: >
class TestFuncAcceptsSequences(unittest.TestCase):
    func = mySuperWhammyFunction
    def test_func(self):
        self.func(self.arg)
class AcceptLists(TestFuncAcceptsSequences):
    arg = [1, 2, 3]
class AcceptStrings(TestFuncAcceptsSequences):
    arg = 'abc'
class AcceptTuples(TestFuncAcceptsSequences):
    arg = (1, 2, 3)
<
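A self-contained variant of the code-reuse guideline is sketched below. It departs from the fragment above in two ways that are assumptions of this example: the function under test, count_items, is hypothetical, and the shared test lives in a plain mixin class so that it is only run through the concrete subclasses.

```python
import unittest


def count_items(seq):
    """Hypothetical function under test."""
    return len(seq)


class SequenceChecks(object):
    """Mixin with the shared test; not a TestCase, so never run by itself."""
    arg = None

    def test_count(self):
        self.assertEqual(count_items(self.arg), 3)


class AcceptLists(SequenceChecks, unittest.TestCase):
    arg = [1, 2, 3]


class AcceptStrings(SequenceChecks, unittest.TestCase):
    arg = 'abc'


class AcceptTuples(SequenceChecks, unittest.TestCase):
    arg = (1, 2, 3)


def run_sequence_tests():
    """Load the three concrete classes and run them with unittest."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite(
        loader.loadTestsFromTestCase(case)
        for case in (AcceptLists, AcceptStrings, AcceptTuples))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Each subclass only supplies its input, so adding another sequence type is a two-line change.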
.. seealso::
Test Driven Development
A book by Kent Beck on writing tests before code.
Running tests using test.regrtest
----------------------------------------
test.regrtest can be used as a script to drive Python's regression test
suite. Running the script by itself automatically starts running all regression
tests in the test (|py2stdlib-test|) package. It does this by finding all modules in the
package whose name starts with ``test_``, importing them, and executing the
function test_main if present. The names of tests to execute may also
be passed to the script. Specifying a single regression test (python
regrtest.py test_spam.py) will minimize output and only print
whether the test passed or failed.
Running test.regrtest directly allows you to set which resources are
available for tests to use. You do this with the -u command-line
option. Running python regrtest.py -uall enables all
possible resources. If all but one resource is desired (a more common case), a
comma-separated list of resources that are not desired may be listed after
all. The command python regrtest.py
-uall,-audio,-largefile will run test.regrtest with all
resources except the audio and largefile resources. For a
list of all resources and more command-line options, run python
regrtest.py -h.
Some other ways to execute the regression tests depend on what platform the
tests are being executed on. On Unix, you can run make test at the
top-level directory where Python was built. On Windows,
executing rt.bat from your PCBuild directory will run all
regression tests.
test.test_support (|py2stdlib-test.test_support|) --- Utility functions for tests
========================================================
==============================================================================
*py2stdlib-test.test_support*
test.test_support~
:synopsis: Support for Python regression tests.
.. note::
The test.test_support (|py2stdlib-test.test_support|) module has been renamed to test.support
in Python 3.x.
The test.test_support (|py2stdlib-test.test_support|) module provides support for Python's regression
tests.
This module defines the following exceptions:
TestFailed~
Exception to be raised when a test fails. This is deprecated in favor of
unittest (|py2stdlib-unittest|)-based tests and unittest.TestCase's assertion
methods.
ResourceDenied~
Subclass of unittest.SkipTest. Raised when a resource (such as a
network connection) is not available. Raised by the requires
function.
The test.test_support (|py2stdlib-test.test_support|) module defines the following constants:
verbose~
True when verbose output is enabled. Should be checked when more
detailed information is desired about a running test. {verbose} is set by
test.regrtest.
have_unicode~
True when Unicode support is available.
is_jython~
True if the running interpreter is Jython.
TESTFN~
Set to a name that is safe to use as the name of a temporary file. Any
temporary file that is created should be closed and unlinked (removed).
The test.test_support (|py2stdlib-test.test_support|) module defines the following functions:
forget(module_name)~
Remove the module named {module_name} from ``sys.modules`` and delete any
byte-compiled files of the module.
is_resource_enabled(resource)~
Return True if {resource} is enabled and available. The list of
available resources is only set when test.regrtest is executing the
tests.
requires(resource[, msg])~
Raise ResourceDenied if {resource} is not available. {msg} is the
argument to ResourceDenied if it is raised. Always returns
True if called by a function whose ``__name__`` is ``'__main__'``.
Used when tests are executed by test.regrtest.
findfile(filename)~
Return the path to the file named {filename}. If no match is found,
{filename} is returned. This is not treated as a failure, since {filename}
itself could be the path to the file.
run_unittest(*classes)~
Execute unittest.TestCase subclasses passed to the function. The
function scans the classes for methods starting with the prefix ``test_``
and executes the tests individually.
It is also legal to pass strings as parameters; these should be keys in
``sys.modules``. Each associated module will be scanned by
``unittest.TestLoader.loadTestsFromModule()``. This is usually seen in the
following test_main function:: >
def test_main():
    test_support.run_unittest(__name__)
<
This will run all tests defined in the named module.
check_warnings(*filters, quiet=True)~
A convenience wrapper for warnings.catch_warnings() that makes it
easier to test that a warning was correctly raised. It is approximately
equivalent to calling ``warnings.catch_warnings(record=True)`` with
warnings.simplefilter set to ``always`` and with the option to
automatically validate the results that are recorded.
``check_warnings`` accepts 2-tuples of the form ``("message regexp",
WarningCategory)`` as positional arguments. If one or more {filters} are
provided, or if the optional keyword argument {quiet} is False,
it checks to make sure the warnings are as expected: each specified filter
must match at least one of the warnings raised by the enclosed code or the
test fails, and if any warnings are raised that do not match any of the
specified filters the test fails. To disable the first of these checks,
set {quiet} to True.
If no arguments are specified, it defaults to:: >
check_warnings(("", Warning), quiet=True)
<
In this case all warnings are caught and no errors are raised.
On entry to the context manager, a WarningsRecorder instance is
returned. The underlying warnings list from
warnings.catch_warnings is available via the recorder object's
warnings attribute. As a convenience, the attributes of the object
representing the most recent warning can also be accessed directly through
the recorder object (see example below). If no warning has been raised,
then any of the attributes that would otherwise be expected on an object
representing a warning will return None.
The recorder object also has a reset method, which clears the
warnings list.
The context manager is designed to be used like this:: >
with check_warnings(("assertion is always true", SyntaxWarning),
                    ("", UserWarning)):
    exec('assert(False, "Hey!")')
    warnings.warn(UserWarning("Hide me!"))
<
In this case if either warning was not raised, or some other warning was
raised, check_warnings would raise an error.
When a test needs to look more deeply into the warnings, rather than
just checking whether or not they occurred, code like this can be used:: >
with check_warnings(quiet=True) as w:
    warnings.warn("foo")
    assert str(w.args[0]) == "foo"
    warnings.warn("bar")
    assert str(w.args[0]) == "bar"
    assert str(w.warnings[0].args[0]) == "foo"
    assert str(w.warnings[1].args[0]) == "bar"
    w.reset()
    assert len(w.warnings) == 0
<
Here all warnings will be caught, and the test code tests the captured
warnings directly.
.. versionadded:: 2.6
.. versionchanged:: 2.7
New optional arguments {filters} and {quiet}.
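For code that cannot import the test package, the mechanism that check_warnings is described as wrapping can be used directly. The sketch below uses only the warnings module; deprecated_call is a hypothetical function, and none of the automatic validation that check_warnings adds is performed.

```python
import warnings


def deprecated_call():
    """Hypothetical function that emits a warning."""
    warnings.warn("old API", DeprecationWarning)
    return 42


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")    # record every warning, even repeats
    result = deprecated_call()

messages = [str(w.message) for w in caught]
categories = [w.category for w in caught]
```

Any checks on messages and categories have to be written by hand in this form.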
check_py3k_warnings(*filters, quiet=False)~
Similar to check_warnings, but for Python 3 compatibility warnings.
If ``sys.py3kwarning == 1``, it checks if the warning is effectively raised.
If ``sys.py3kwarning == 0``, it checks that no warning is raised. It
accepts 2-tuples of the form ``("message regexp", WarningCategory)`` as
positional arguments. When the optional keyword argument {quiet} is
True, it does not fail if a filter catches nothing. Without
arguments, it defaults to:: >
check_py3k_warnings(("", DeprecationWarning), quiet=False)
<
.. versionadded:: 2.7
captured_stdout()~
This is a context manager that runs the with statement body using
a StringIO.StringIO object as sys.stdout. That object can be
retrieved using the ``as`` clause of the with statement.
Example use:: >
with captured_stdout() as s:
    print "hello"
assert s.getvalue() == "hello\n"
<
.. versionadded:: 2.6
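The idea behind such a context manager can be sketched without the test package. The name capture_stdout below is hypothetical, and the code is an approximation for illustration, not the implementation used by test.test_support.

```python
import contextlib
import sys

try:
    from StringIO import StringIO      # Python 2
except ImportError:
    from io import StringIO            # Python 3


@contextlib.contextmanager
def capture_stdout():
    """Hypothetical stand-in for captured_stdout()."""
    original = sys.stdout
    sys.stdout = StringIO()
    try:
        yield sys.stdout
    finally:
        sys.stdout = original          # always restore the real stream


with capture_stdout() as s:
    sys.stdout.write("hello\n")

captured = s.getvalue()
```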
import_module(name, deprecated=False)~
This function imports and returns the named module. Unlike a normal
import, this function raises unittest.SkipTest if the module
cannot be imported.
Module and package deprecation messages are suppressed during this import
if {deprecated} is True.
.. versionadded:: 2.7
import_fresh_module(name, fresh=(), blocked=(), deprecated=False)~
This function imports and returns a fresh copy of the named Python module
by removing the named module from ``sys.modules`` before doing the import.
Note that unlike reload, the original module is not affected by
this operation.
{fresh} is an iterable of additional module names that are also removed
from the ``sys.modules`` cache before doing the import.
{blocked} is an iterable of module names that are replaced with 0
in the module cache during the import to ensure that attempts to import
them raise ImportError.
The named module and any modules named in the {fresh} and {blocked}
parameters are saved before starting the import and then reinserted into
``sys.modules`` when the fresh import is complete.
Module and package deprecation messages are suppressed during this import
if {deprecated} is True.
This function will raise unittest.SkipTest if the named module
cannot be imported.
Example use:: >
# Get copies of the warnings module for testing without
# affecting the version being used by the rest of the test suite
# One copy uses the C implementation, the other is forced to use
# the pure Python fallback implementation
py_warnings = import_fresh_module('warnings', blocked=['_warnings'])
c_warnings = import_fresh_module('warnings', fresh=['_warnings'])
<
.. versionadded:: 2.7
The test.test_support (|py2stdlib-test.test_support|) module defines the following classes:
TransientResource(exc[, **kwargs])~
Instances are a context manager that raises ResourceDenied if the
specified exception type is raised. Any keyword arguments are treated as
attribute/value pairs to be compared against any exception raised within the
with statement. Only if all pairs match properly against
attributes on the exception is ResourceDenied raised.
.. versionadded:: 2.6
EnvironmentVarGuard()~
Class used to temporarily set or unset environment variables. Instances can
be used as a context manager and have a complete dictionary interface for
querying/modifying the underlying ``os.environ``. After exit from the
context manager all changes to environment variables done through this
instance will be rolled back.
.. versionadded:: 2.6
.. versionchanged:: 2.7
Added dictionary interface.
EnvironmentVarGuard.set(envvar, value)~
Temporarily set the environment variable ``envvar`` to the value of
``value``.
EnvironmentVarGuard.unset(envvar)~
Temporarily unset the environment variable ``envvar``.
WarningsRecorder()~
Class used to record warnings for unit tests. See documentation of
check_warnings above for more details.
.. versionadded:: 2.6
==============================================================================
*py2stdlib-textwrap*
textwrap~
:synopsis: Text wrapping and filling
.. versionadded:: 2.3
The textwrap (|py2stdlib-textwrap|) module provides two convenience functions, wrap and
fill, as well as TextWrapper, the class that does all the work,
and a utility function dedent. If you're just wrapping or filling one
or two text strings, the convenience functions should be good enough;
otherwise, you should use an instance of TextWrapper for efficiency.
wrap(text[, width[, ...]])~
Wraps the single paragraph in {text} (a string) so every line is at most {width}
characters long. Returns a list of output lines, without final newlines.
Optional keyword arguments correspond to the instance attributes of
TextWrapper, documented below. {width} defaults to ``70``.
fill(text[, width[, ...]])~
Wraps the single paragraph in {text}, and returns a single string containing the
wrapped paragraph. fill is shorthand for :: >
"\n".join(wrap(text, ...))
<
In particular, fill accepts exactly the same keyword arguments as
wrap.
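A small example may make the relationship between the two functions concrete. The sample sentence and the width of 20 are arbitrary.

```python
import textwrap

text = "The quick brown fox jumps over the lazy dog."

lines = textwrap.wrap(text, width=20)       # list of lines, no newlines
paragraph = textwrap.fill(text, width=20)   # one string joined with "\n"
```

Here ``paragraph`` is the same string as ``"\n".join(lines)``.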
Both wrap and fill work by creating a TextWrapper
instance and calling a single method on it. That instance is not reused, so for
applications that wrap/fill many text strings, it will be more efficient for you
to create your own TextWrapper object.
Text is preferably wrapped on whitespace and right after the hyphens in
hyphenated words; only then will long words be broken if necessary, unless
TextWrapper.break_long_words is set to false.
An additional utility function, dedent, is provided to remove
indentation from strings that have unwanted whitespace to the left of the text.
dedent(text)~
Remove any common leading whitespace from every line in {text}.
This can be used to make triple-quoted strings line up with the left edge of the
display, while still presenting them in the source code in indented form.
Note that tabs and spaces are both treated as whitespace, but they are not
equal: the lines ``"  hello"`` and ``"\thello"`` are considered to have no
common leading whitespace. (This behaviour is new in Python 2.5; older versions
of this module incorrectly expanded tabs before searching for common leading
whitespace.)
For example:: >
def test():
    # end first line with \ to avoid the empty line!
    s = '''\
    hello
      world
    '''
    print repr(s)          # prints '    hello\n      world\n    '
    print repr(dedent(s))  # prints 'hello\n  world\n'
<
TextWrapper(...)~
The TextWrapper constructor accepts a number of optional keyword
arguments. Each argument corresponds to one instance attribute, so for example
:: >
wrapper = TextWrapper(initial_indent="* ")
<
is the same as :: >
wrapper = TextWrapper()
wrapper.initial_indent = "* "
<
You can re-use the same TextWrapper object many times, and you can
change any of its options through direct assignment to instance attributes
between uses.
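Reusing one wrapper with a changed option might look like the following sketch. The sample paragraphs and widths are arbitrary.

```python
import textwrap

paragraphs = ["alpha beta gamma delta epsilon zeta eta theta",
              "one two three four five six seven eight nine ten"]

wrapper = textwrap.TextWrapper(width=30)
narrow = [wrapper.fill(p) for p in paragraphs]

wrapper.width = 60                     # same object, new setting
wide = [wrapper.fill(p) for p in paragraphs]
```

Both paragraphs are longer than 30 characters and shorter than 60, so only the first pass introduces line breaks.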
The TextWrapper instance attributes (and keyword arguments to the
constructor) are as follows:
width~
(default: ``70``) The maximum length of wrapped lines. As long as there
are no individual words in the input text longer than width,
TextWrapper guarantees that no output line will be longer than
width characters.
expand_tabs~
(default: ``True``) If true, then all tab characters in {text} will be
expanded to spaces using the expandtabs method of {text}.
replace_whitespace~
(default: ``True``) If true, each whitespace character (as defined by
``string.whitespace``) remaining after tab expansion will be replaced by a
single space.
.. note:: >
If expand_tabs is false and replace_whitespace is true,
each tab character will be replaced by a single space, which is {not}
the same as tab expansion.
<
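The difference described in the note can be observed with a two-line experiment. The input string is arbitrary.

```python
import textwrap

# Tab expanded to the next tab stop, giving a run of spaces.
expanded = textwrap.TextWrapper(expand_tabs=True).fill("a\tb")

# Tab left alone by expansion, then replaced by exactly one space.
replaced = textwrap.TextWrapper(expand_tabs=False).fill("a\tb")
```

With the default tab stops, ``expanded`` has seven spaces between the letters while ``replaced`` has one.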
drop_whitespace~
(default: ``True``) If true, whitespace that, after wrapping, happens to
end up at the beginning or end of a line is dropped (leading whitespace in
the first line is always preserved, though).
.. versionadded:: 2.6
Whitespace was always dropped in earlier versions.
initial_indent~
(default: ``''``) String that will be prepended to the first line of
wrapped output. Counts towards the length of the first line.
subsequent_indent~
(default: ``''``) String that will be prepended to all lines of wrapped
output except the first. Counts towards the length of each line except
the first.
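The two indent options are often combined to produce hanging indents, for instance for bullet lists. The text and the width of 40 below are illustrative.

```python
import textwrap

item = ("Make sure to clean up after your tests, such as closing and "
        "removing all temporary files.")

bullet = textwrap.TextWrapper(width=40,
                              initial_indent="* ",
                              subsequent_indent="  ")
lines = bullet.wrap(item)
```

Because both indents count towards the line length, no line in ``lines`` exceeds 40 characters.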
fix_sentence_endings~
(default: ``False``) If true, TextWrapper attempts to detect
sentence endings and ensure that sentences are always separated by exactly
two spaces. This is generally desired for text in a monospaced font.
However, the sentence detection algorithm is imperfect: it assumes that a
sentence ending consists of a lowercase letter followed by one of ``'.'``,
``'!'``, or ``'?'``, possibly followed by one of ``'"'`` or ``"'"``,
followed by a space. One problem with this algorithm is that it is
unable to detect the difference between "Dr." in :: >
[...] Dr. Frankenstein's monster [...]
<
and "Spot." in :: >
[...] See Spot. See Spot run [...]
<
fix_sentence_endings is false by default.
Since the sentence detection algorithm relies on ``string.lowercase`` for
the definition of "lowercase letter," and a convention of using two spaces
after a period to separate sentences on the same line, it is specific to
English-language texts.
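The imperfection described above is easy to reproduce. In this small sketch (the variable names are illustrative), the abbreviation "Dr." is treated exactly like a real sentence ending.

```python
import textwrap

wrapper = textwrap.TextWrapper(fix_sentence_endings=True)

# A genuine sentence ending gets two spaces after the period.
real_ending = wrapper.fill("See Spot. See Spot run.")

# "Dr." also matches "lowercase letter + period + space", so it is
# (wrongly) padded to two spaces as well.
false_ending = wrapper.fill("Dr. Frankenstein's monster")
```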
break_long_words~
(default: ``True``) If true, then words longer than width will be
broken in order to ensure that no lines are longer than width. If
it is false, long words will not be broken, and some lines may be longer
than width. (Long words will be put on a line by themselves, in
order to minimize the amount by which width is exceeded.)
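A minimal sketch of both settings, using the module-level wrap function (which passes keyword arguments on to TextWrapper) and a made-up 16-character word:

```python
import textwrap

long_word = "abcdefghijklmnop"   # 16 characters, wider than width=10

# Broken so that no line is longer than width.
broken = textwrap.wrap(long_word, width=10, break_long_words=True)

# Left intact; the resulting line is longer than width.
unbroken = textwrap.wrap(long_word, width=10, break_long_words=False)
```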
break_on_hyphens~
(default: ``True``) If true, wrapping will occur preferably on whitespace
and right after hyphens in compound words, as is customary in English.
If false, only whitespace will be considered as a potentially good place
for line breaks, but you need to set break_long_words to false if
you want truly unbreakable words. The default behaviour in previous versions
was to always allow breaking hyphenated words.
.. versionadded:: 2.6
TextWrapper also provides two public methods, analogous to the
module-level convenience functions:
wrap(text)~
Wraps the single paragraph in {text} (a string) so every line is at most
width characters long. All wrapping options are taken from
instance attributes of the TextWrapper instance. Returns a list
of output lines, without final newlines.
fill(text)~
Wraps the single paragraph in {text}, and returns a single string
containing the wrapped paragraph.
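The relationship between the two methods can be sketched as follows; the paragraph text is an arbitrary example.

```python
import textwrap

wrapper = textwrap.TextWrapper(width=20)
paragraph = "The quick brown fox jumps over the lazy dog."

lines = wrapper.wrap(paragraph)    # list of lines, no trailing newlines
filled = wrapper.fill(paragraph)   # one string, lines joined by "\n"
```

In other words, fill produces the same lines as wrap, joined with newline characters.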
==============================================================================
*py2stdlib-thread*
thread~
:synopsis: Create multiple threads of control within one interpreter.
.. note::
The thread (|py2stdlib-thread|) module has been renamed to _thread in Python 3.0.
The 2to3 tool will automatically adapt imports when converting your
sources to 3.0; however, you should consider using the high-level
threading (|py2stdlib-threading|) module instead.
This module provides low-level primitives for working with multiple threads
(also called light-weight processes or tasks) --- multiple threads of
control sharing their global data space. For synchronization, simple locks
(also called mutexes or binary semaphores) are provided.
The threading (|py2stdlib-threading|) module provides an easier to use and higher-level
threading API built on top of this module.
The module is optional. It is supported on Windows, Linux, SGI IRIX, Solaris
2.x, as well as on systems that have a POSIX thread (a.k.a. "pthread")
implementation. For systems lacking the thread (|py2stdlib-thread|) module, the
dummy_thread (|py2stdlib-dummy_thread|) module is available. It duplicates this module's interface
and can be used as a drop-in replacement.
It defines the following constant and functions:
error~
Raised on thread-specific errors.
LockType~
This is the type of lock objects.
start_new_thread(function, args[, kwargs])~
Start a new thread and return its identifier. The thread executes the function
{function} with the argument list {args} (which must be a tuple). The optional
{kwargs} argument specifies a dictionary of keyword arguments. When the function
returns, the thread silently exits. When the function terminates with an
unhandled exception, a stack trace is printed and then the thread exits (but
other threads continue to run).
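A minimal sketch of start_new_thread with both positional and keyword arguments. The worker function and the use of a lock as a "finished" signal are illustrative choices; the try/except import is an assumption added so the sketch also runs on Python 3, where the module is named _thread.

```python
try:
    import thread              # Python 2 name
except ImportError:
    import _thread as thread   # Python 3 name

results = []
done = thread.allocate_lock()
done.acquire()                 # held until the worker releases it


def worker(a, b, scale=1):
    results.append((a + b) * scale)
    done.release()             # a lock may be released by another thread


# args must be a tuple; kwargs is an optional dictionary.
thread.start_new_thread(worker, (1, 2), {"scale": 10})
done.acquire()                 # blocks until worker() has stored its result
```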
interrupt_main()~
Raise a KeyboardInterrupt exception in the main thread. A subthread can
use this function to interrupt the main thread.
.. versionadded:: 2.3
exit()~
Raise the SystemExit exception. When not caught, this will cause the
thread to exit silently.
allocate_lock()~
Return a new lock object. Methods of locks are described below. The lock is
initially unlocked.
get_ident()~
Return the 'thread identifier' of the current thread. This is a nonzero
integer. Its value has no direct meaning; it is intended as a magic cookie to
be used e.g. to index a dictionary of thread-specific data. Thread identifiers
may be recycled when a thread exits and another thread is created.
stack_size([size])~
Return the thread stack size used when creating new threads. The optional
{size} argument specifies the stack size to be used for subsequently created
threads, and must be 0 (use platform or configured default) or a positive
integer value of at least 32,768 (32kB). If changing the thread stack size is
unsupported, the error exception is raised. If the specified stack size is
invalid, a ValueError is raised and the stack size is unmodified. 32kB
is currently the minimum supported stack size value to guarantee sufficient
stack space for the interpreter itself. Note that some platforms may have
particular restrictions on values for the stack size, such as requiring a
minimum stack size > 32kB or requiring allocation in multiples of the system
memory page size - platform documentation should be referred to for more
information (4kB pages are common; using multiples of 4096 for the stack size is
the suggested approach in the absence of more specific information).
Availability: Windows, systems with POSIX threads.
.. versionadded:: 2.5
Lock objects have the following methods:
lock.acquire([waitflag])~
Without the optional argument, this method acquires the lock unconditionally, if
necessary waiting until it is released by another thread (only one thread at a
time can acquire a lock --- that's their reason for existence). If the integer
{waitflag} argument is present, the action depends on its value: if it is zero,
the lock is only acquired if it can be acquired immediately without waiting,
while if it is nonzero, the lock is acquired unconditionally as before. The
return value is ``True`` if the lock is acquired successfully, ``False`` if not.
lock.release()~
Releases the lock. The lock must have been acquired earlier, but not
necessarily by the same thread.
lock.locked()~
Return the status of the lock: ``True`` if it has been acquired by some thread,
``False`` if not.
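The three lock methods can be exercised from a single thread when acquire is called with a zero {waitflag}. This is a sketch only; the import fallback is an assumption for running it on Python 3.

```python
try:
    import thread              # Python 2 name
except ImportError:
    import _thread as thread   # Python 3 name

lock = thread.allocate_lock()

first = lock.acquire(0)    # lock is free: acquired immediately
second = lock.acquire(0)   # lock is held: returns at once without waiting
held = lock.locked()

lock.release()
held_after_release = lock.locked()
```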
In addition to these methods, lock objects can also be used via the
with statement, e.g.:: >
import thread
a_lock = thread.allocate_lock()
with a_lock:
    print "a_lock is locked while this executes"
<
Caveats:
* Threads interact strangely with interrupts: the KeyboardInterrupt
exception will be received by an arbitrary thread. (When the signal (|py2stdlib-signal|)
module is available, interrupts always go to the main thread.)
* Calling sys.exit or raising the SystemExit exception is
equivalent to calling thread.exit.
* Not all built-in functions that may block waiting for I/O allow other threads
to run. (The most popular ones (time.sleep, file.read,
select.select) work as expected.)
* It is not possible to interrupt the acquire method on a lock --- the
KeyboardInterrupt exception will happen after the lock has been acquired.
* When the main thread exits, it is system defined whether the other threads
survive. On SGI IRIX using the native thread implementation, they survive. On
most other systems, they are killed without executing try ...
finally clauses or executing object destructors.
* When the main thread exits, it does not do any of its usual cleanup (except
that try ... finally clauses are honored), and the
standard I/O files are not flushed.
==============================================================================
*py2stdlib-threading*
threading~
:synopsis: Higher-level threading interface.
This module constructs higher-level threading interfaces on top of the lower
level thread (|py2stdlib-thread|) module.
See also the mutex (|py2stdlib-mutex|) and Queue (|py2stdlib-queue|) modules.
The dummy_threading (|py2stdlib-dummy_threading|) module is provided for situations where
threading (|py2stdlib-threading|) cannot be used because thread (|py2stdlib-thread|) is missing.
.. note::
Starting with Python 2.6, this module provides PEP 8 compliant aliases and
properties to replace the ``camelCase`` names that were inspired by Java's
threading API. This updated API is compatible with that of the
multiprocessing (|py2stdlib-multiprocessing|) module. However, no schedule has been set for the
deprecation of the ``camelCase`` names and they remain fully supported in
both Python 2.x and 3.x.
.. note::
Starting with Python 2.5, several Thread methods raise RuntimeError
instead of AssertionError if called erroneously.
This module defines the following functions and objects:
active_count()~
activeCount()
Return the number of Thread objects currently alive. The returned
count is equal to the length of the list returned by enumerate().
Condition()~
A factory function that returns a new condition variable object. A condition
variable allows one or more threads to wait until they are notified by another
thread.
current_thread()~
currentThread()
Return the current Thread object, corresponding to the caller's thread
of control. If the caller's thread of control was not created through the
threading (|py2stdlib-threading|) module, a dummy thread object with limited functionality is
returned.
enumerate()~
Return a list of all Thread objects currently alive. The list
includes daemonic threads, dummy thread objects created by
current_thread, and the main thread. It excludes terminated threads
and threads that have not yet been started.
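The relationship between active_count, current_thread and enumerate can be sketched like this. The helper thread simply waits on an event; the exact counts assume that no other threads are started or stopped concurrently.

```python
import threading

stop = threading.Event()
t = threading.Thread(target=stop.wait)

count_before = threading.active_count()
in_list_before_start = t in threading.enumerate()   # not started yet

t.start()
alive = threading.enumerate()
count_during = threading.active_count()

stop.set()      # let the helper thread finish
t.join()
in_list_after_join = t in threading.enumerate()     # terminated
```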
Event()~
A factory function that returns a new event object. An event manages a flag
that can be set to true with the Event.set method and reset to false
with the clear method. The wait method blocks until the flag
is true.
local~
A class that represents thread-local data. Thread-local data are data whose
values are thread specific. To manage thread-local data, just create an
instance of local (or a subclass) and store attributes on it:: >
import threading
mydata = threading.local()
mydata.x = 1
<
The instance's values will be different for separate threads.
For more details and extensive examples, see the documentation string of the
_threading_local module.
.. versionadded:: 2.4
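To make "different for separate threads" concrete, the following sketch sets an attribute in the main thread and checks from a second thread that it is not visible there. The names are illustrative.

```python
import threading

mydata = threading.local()
mydata.x = 1                      # set in the main thread
seen_in_worker = []


def worker():
    # The attribute set by the main thread does not exist in this thread.
    seen_in_worker.append(hasattr(mydata, "x"))
    mydata.x = 2                  # affects only this thread's copy


t = threading.Thread(target=worker)
t.start()
t.join()
```

After the worker has finished, `mydata.x` in the main thread is still 1.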
Lock()~
A factory function that returns a new primitive lock object. Once a thread has
acquired it, subsequent attempts to acquire it block, until it is released; any
thread may release it.
RLock()~
A factory function that returns a new reentrant lock object. A reentrant lock
must be released by the thread that acquired it. Once a thread has acquired a
reentrant lock, the same thread may acquire it again without blocking; the
thread must release it once for each time it has acquired it.
Semaphore([value])~
A factory function that returns a new semaphore object. A semaphore manages a
counter representing the number of release calls minus the number of
acquire calls, plus an initial value. The acquire method blocks
if necessary until it can return without making the counter negative. If not
given, {value} defaults to 1.
BoundedSemaphore([value])~
A factory function that returns a new bounded semaphore object. A bounded
semaphore checks to make sure its current value doesn't exceed its initial
value. If it does, ValueError is raised. In most situations semaphores
are used to guard resources with limited capacity. If the semaphore is released
too many times it's a sign of a bug. If not given, {value} defaults to 1.
Thread~
A class that represents a thread of control. This class can be safely
subclassed in a limited fashion.
Timer~
A thread that executes a function after a specified interval has passed.
settrace(func)~
Set a trace function for all threads started from the threading (|py2stdlib-threading|) module.
The {func} will be passed to sys.settrace for each thread, before its
run method is called.
.. versionadded:: 2.3
setprofile(func)~
Set a profile function for all threads started from the threading (|py2stdlib-threading|) module.
The {func} will be passed to sys.setprofile for each thread, before its
run method is called.
.. versionadded:: 2.3
stack_size([size])~
Return the thread stack size used when creating new threads. The optional
{size} argument specifies the stack size to be used for subsequently created
threads, and must be 0 (use platform or configured default) or a positive
integer value of at least 32,768 (32kB). If changing the thread stack size is
unsupported, a ThreadError is raised. If the specified stack size is
invalid, a ValueError is raised and the stack size is unmodified. 32kB
is currently the minimum supported stack size value to guarantee sufficient
stack space for the interpreter itself. Note that some platforms may have
particular restrictions on values for the stack size, such as requiring a
minimum stack size > 32kB or requiring allocation in multiples of the system
memory page size - platform documentation should be referred to for more
information (4kB pages are common; using multiples of 4096 for the stack size is
the suggested approach in the absence of more specific information).
Availability: Windows, systems with POSIX threads.
.. versionadded:: 2.5
Detailed interfaces for the objects are documented below.
The design of this module is loosely based on Java's threading model. However,
where Java makes locks and condition variables basic behavior of every object,
they are separate objects in Python. Python's Thread class supports a
subset of the behavior of Java's Thread class; currently, there are no
priorities, no thread groups, and threads cannot be destroyed, stopped,
suspended, resumed, or interrupted. The static methods of Java's Thread class,
when implemented, are mapped to module-level functions.
All of the methods described below are executed atomically.
Thread Objects
--------------
This class represents an activity that is run in a separate thread of control.
There are two ways to specify the activity: by passing a callable object to the
constructor, or by overriding the run method in a subclass. No other
methods (except for the constructor) should be overridden in a subclass. In
other words, {only} override the __init__ and run methods of
this class.
Once a thread object is created, its activity must be started by calling the
thread's start method. This invokes the run method in a
separate thread of control.
Once the thread's activity is started, the thread is considered 'alive'. It
stops being alive when its run method terminates -- either normally, or
by raising an unhandled exception. The is_alive method tests whether the
thread is alive.
Other threads can call a thread's join method. This blocks the calling
thread until the thread whose join method is called is terminated.
A thread has a name. The name can be passed to the constructor, and read or
changed through the name attribute.
A thread can be flagged as a "daemon thread". The significance of this flag is
that the entire Python program exits when only daemon threads are left. The
initial value is inherited from the creating thread. The flag can be set
through the daemon property.
There is a "main thread" object; this corresponds to the initial thread of
control in the Python program. It is not a daemon thread.
There is the possibility that "dummy thread objects" are created. These are
thread objects corresponding to "alien threads", which are threads of control
started outside the threading module, such as directly from C code. Dummy
thread objects have limited functionality; they are always considered alive and
daemonic, and cannot be joined. They are never deleted, since it is
impossible to detect the termination of alien threads.
Thread(group=None, target=None, name=None, args=(), kwargs={})~
This constructor should always be called with keyword arguments. Arguments
are:
{group} should be ``None``; reserved for future extension when a
ThreadGroup class is implemented.
{target} is the callable object to be invoked by the run method.
Defaults to ``None``, meaning nothing is called.
{name} is the thread name. By default, a unique name is constructed of the
form "Thread-{N}" where {N} is a small decimal number.
{args} is the argument tuple for the target invocation. Defaults to ``()``.
{kwargs} is a dictionary of keyword arguments for the target invocation.
Defaults to ``{}``.
If the subclass overrides the constructor, it must make sure to invoke the
base class constructor (``Thread.__init__()``) before doing anything else to
the thread.
start()~
Start the thread's activity.
It must be called at most once per thread object. It arranges for the
object's run method to be invoked in a separate thread of control.
This method will raise a RuntimeError if called more than once
on the same thread object.
run()~
Method representing the thread's activity.
You may override this method in a subclass. The standard run
method invokes the callable object passed to the object's constructor as
the {target} argument, if any, with sequential and keyword arguments taken
from the {args} and {kwargs} arguments, respectively.
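Both ways of specifying a thread's activity can be sketched side by side. The function and class names below are hypothetical and exist only for this example.

```python
import threading

results = []


def record(value, prefix=""):
    results.append(prefix + value)


# Way 1: pass a callable and its arguments to the constructor.
t1 = threading.Thread(target=record, args=("target",),
                      kwargs={"prefix": "via-"})


# Way 2: override run() in a subclass.
class Recorder(threading.Thread):
    def __init__(self, value):
        threading.Thread.__init__(self)   # base constructor comes first
        self.value = value

    def run(self):
        results.append("via-" + self.value)


t2 = Recorder("subclass")
for t in (t1, t2):
    t.start()
for t in (t1, t2):
    t.join()
```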
join([timeout])~
Wait until the thread terminates. This blocks the calling thread until the
thread whose join method is called terminates -- either normally
or through an unhandled exception -- or until the optional timeout occurs.
When the {timeout} argument is present and not ``None``, it should be a
floating point number specifying a timeout for the operation in seconds
(or fractions thereof). As join always returns ``None``, you must
call isAlive after join to decide whether a timeout
happened -- if the thread is still alive, the join call timed out.
When the {timeout} argument is not present or ``None``, the operation will
block until the thread terminates.
A thread can be joined many times.
join raises a RuntimeError if an attempt is made to join
the current thread as that would cause a deadlock. It is also an error to
join a thread before it has been started and attempts to do so
raises the same exception.
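Because join always returns ``None``, a timeout has to be detected by checking liveness afterwards. A minimal sketch, with an arbitrary 0.05 second timeout and a helper thread that waits on an event:

```python
import threading

release = threading.Event()
t = threading.Thread(target=release.wait)
t.start()

t.join(0.05)                   # returns None whether or not it timed out
timed_out = t.is_alive()       # still alive, so the join call timed out

release.set()                  # let the thread finish
t.join()                       # no timeout: blocks until termination
finished = not t.is_alive()

# Joining a thread that was never started is an error.
try:
    threading.Thread().join()
except RuntimeError:
    unstarted_join_failed = True
else:
    unstarted_join_failed = False
```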
getName()~
setName()
Old API for Thread.name.
name~
A string used for identification purposes only. It has no semantics.
Multiple threads may be given the same name. The initial name is set by
the constructor.
ident~
The 'thread identifier' of this thread or ``None`` if the thread has not
been started. This is a nonzero integer. See the
thread.get_ident() function. Thread identifiers may be recycled
when a thread exits and another thread is created. The identifier is
available even after the thread has exited.
.. versionadded:: 2.6
is_alive()~
isAlive()
Return whether the thread is alive.
Roughly, a thread is alive from the moment the start method
returns until its run method terminates. The module function
enumerate() returns a list of all alive threads.
isDaemon()~
setDaemon()
Old API for Thread.daemon.
daemon~
A boolean value indicating whether this thread is a daemon thread (True)
or not (False). This must be set before start is called,
otherwise RuntimeError is raised. Its initial value is inherited
from the creating thread; the main thread is not a daemon thread and
therefore all threads created in the main thread default to
daemon = ``False``.
The entire Python program exits when no alive non-daemon threads are left.
Lock Objects
------------
A primitive lock is a synchronization primitive that is not owned by a
particular thread when locked. In Python, it is currently the lowest level
synchronization primitive available, implemented directly by the thread (|py2stdlib-thread|)
extension module.
A primitive lock is in one of two states, "locked" or "unlocked". It is created
in the unlocked state. It has two basic methods, acquire and
release. When the state is unlocked, acquire changes the state
to locked and returns immediately. When the state is locked, acquire
blocks until a call to release in another thread changes it to unlocked,
then the acquire call resets it to locked and returns. The
release method should only be called in the locked state; it changes the
state to unlocked and returns immediately. If an attempt is made to release an
unlocked lock, a ThreadError will be raised.
When more than one thread is blocked in acquire waiting for the state to
turn to unlocked, only one thread proceeds when a release call resets
the state to unlocked; which one of the waiting threads proceeds is not defined,
and may vary across implementations.
All methods are executed atomically.
Lock.acquire([blocking=1])~
Acquire a lock, blocking or non-blocking.
When invoked without arguments, block until the lock is unlocked, then set it to
locked, and return true.
When invoked with the {blocking} argument set to true, do the same thing as when
called without arguments, and return true.
When invoked with the {blocking} argument set to false, do not block. If a call
without an argument would block, return false immediately; otherwise, do the
same thing as when called without arguments, and return true.
Lock.release()~
Release a lock.
When the lock is locked, reset it to unlocked, and return. If any other threads
are blocked waiting for the lock to become unlocked, allow exactly one of them
to proceed.
Do not call this method when the lock is unlocked.
There is no return value.
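The state transitions of a primitive lock can be sketched from a single thread by using non-blocking acquire calls. The sketch catches threading.ThreadError, the exception raised by the underlying thread module's lock (on Python 3 this name is an alias of RuntimeError).

```python
import threading

lock = threading.Lock()

got_it = lock.acquire()            # unlocked -> locked, returns true
second_try = lock.acquire(False)   # would block, so returns false at once
lock.release()                     # locked -> unlocked

try:
    lock.release()                 # releasing an unlocked lock is an error
except threading.ThreadError:
    release_failed = True
else:
    release_failed = False
```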
RLock Objects
-------------
A reentrant lock is a synchronization primitive that may be acquired multiple
times by the same thread. Internally, it uses the concepts of "owning thread"
and "recursion level" in addition to the locked/unlocked state used by primitive
locks. In the locked state, some thread owns the lock; in the unlocked state,
no thread owns it.
To lock the lock, a thread calls its acquire method; this returns once
the thread owns the lock. To unlock the lock, a thread calls its
release method. acquire/release call pairs may be
nested; only the final release (the release of the outermost
pair) resets the lock to unlocked and allows another thread blocked in
acquire to proceed.
RLock.acquire([blocking=1])~
Acquire a lock, blocking or non-blocking.
When invoked without arguments: if this thread already owns the lock, increment
the recursion level by one, and return immediately. Otherwise, if another
thread owns the lock, block until the lock is unlocked. Once the lock is
unlocked (not owned by any thread), then grab ownership, set the recursion level
to one, and return. If more than one thread is blocked waiting until the lock
is unlocked, only one at a time will be able to grab ownership of the lock.
There is no return value in this case.
When invoked with the {blocking} argument set to true, do the same thing as when
called without arguments, and return true.
When invoked with the {blocking} argument set to false, do not block. If a call
without an argument would block, return false immediately; otherwise, do the
same thing as when called without arguments, and return true.
RLock.release()~
Release a lock, decrementing the recursion level. If after the decrement it is
zero, reset the lock to unlocked (not owned by any thread), and if any other
threads are blocked waiting for the lock to become unlocked, allow exactly one
of them to proceed. If after the decrement the recursion level is still
nonzero, the lock remains locked and owned by the calling thread.
Only call this method when the calling thread owns the lock. A
RuntimeError is raised if this method is called when the lock is
unlocked.
There is no return value.
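The difference between a reentrant lock and a primitive lock shows up when the same thread acquires twice. A minimal sketch using non-blocking calls so that nothing can deadlock:

```python
import threading

rlock = threading.RLock()
plain = threading.Lock()

rlock.acquire()
nested = rlock.acquire(False)         # same thread: recursion level is now 2
rlock.release()
rlock.release()                       # outermost release: lock is unlocked

plain.acquire()
plain_nested = plain.acquire(False)   # primitive lock: this would block
plain.release()

try:
    rlock.release()                   # the lock is no longer owned
except RuntimeError:
    over_release = True
else:
    over_release = False
```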
Condition Objects
-----------------
A condition variable is always associated with some kind of lock; this can be
passed in or one will be created by default. (Passing one in is useful when
several condition variables must share the same lock.)
A condition variable has acquire and release methods that call
the corresponding methods of the associated lock. It also has a wait
method, and notify and notifyAll methods. These three must only
be called when the calling thread has acquired the lock, otherwise a
RuntimeError is raised.
The wait method releases the lock, and then blocks until it is awakened
by a notify or notifyAll call for the same condition variable in
another thread. Once awakened, it re-acquires the lock and returns. It is also
possible to specify a timeout.
The notify method wakes up one of the threads waiting for the condition
variable, if any are waiting. The notifyAll method wakes up all threads
waiting for the condition variable.
Note: the notify and notifyAll methods don't release the lock;
this means that the thread or threads awakened will not return from their
wait call immediately, but only when the thread that called
notify or notifyAll finally relinquishes ownership of the lock.
Tip: the typical programming style using condition variables uses the lock to
synchronize access to some shared state; threads that are interested in a
particular change of state call wait repeatedly until they see the
desired state, while threads that modify the state call notify or
notifyAll when they change the state in such a way that it could
possibly be a desired state for one of the waiters. For example, the following
code is a generic producer-consumer situation with unlimited buffer capacity:: >
# Consume one item
cv.acquire()
while not an_item_is_available():
    cv.wait()
get_an_available_item()
cv.release()

# Produce one item
cv.acquire()
make_an_item_available()
cv.notify()
cv.release()
<
To choose between notify and notifyAll, consider whether one
state change can be interesting for only one or several waiting threads. E.g.
in a typical producer-consumer situation, adding one item to the buffer only
needs to wake up one consumer thread.
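The producer-consumer pattern can be turned into a small runnable sketch. The shared list, the function names and the item values are assumptions made for the example; the condition variable is used through the with statement rather than explicit acquire and release calls.

```python
import threading

cv = threading.Condition()
pending = []       # shared state, protected by cv's lock
consumed = []


def consumer(count):
    for _ in range(count):
        with cv:
            while not pending:       # re-check the state after every wakeup
                cv.wait()
            consumed.append(pending.pop(0))


def producer(items):
    for item in items:
        with cv:
            pending.append(item)
            cv.notify()              # one new item: wake one consumer


c = threading.Thread(target=consumer, args=(3,))
p = threading.Thread(target=producer, args=([1, 2, 3],))
c.start()
p.start()
p.join()
c.join()
```

The while loop around wait matters: the consumer only proceeds once it has actually observed the desired state while holding the lock.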
Condition([lock])~
If the {lock} argument is given and not ``None``, it must be a Lock
or RLock object, and it is used as the underlying lock. Otherwise,
a new RLock object is created and used as the underlying lock.
acquire(*args)~
Acquire the underlying lock. This method calls the corresponding method on
the underlying lock; the return value is whatever that method returns.
release()~
Release the underlying lock. This method calls the corresponding method on
the underlying lock; there is no return value.
wait([timeout])~
Wait until notified or until a timeout occurs. If the calling thread has not
acquired the lock when this method is called, a RuntimeError is raised.
This method releases the underlying lock, and then blocks until it is
awakened by a notify or notifyAll call for the same
condition variable in another thread, or until the optional timeout
occurs. Once awakened or timed out, it re-acquires the lock and returns.
When the {timeout} argument is present and not ``None``, it should be a
floating point number specifying a timeout for the operation in seconds
(or fractions thereof).
When the underlying lock is an RLock, it is not released using
its release method, since this may not actually unlock the lock
when it was acquired multiple times recursively. Instead, an internal
interface of the RLock class is used, which really unlocks it
even when it has been recursively acquired several times. Another internal
interface is then used to restore the recursion level when the lock is
reacquired.
notify()~
Wake up a thread waiting on this condition, if any. If the calling thread
has not acquired the lock when this method is called, a
RuntimeError is raised.
This method wakes up one of the threads waiting for the condition
variable, if any are waiting; it is a no-op if no threads are waiting.
The current implementation wakes up exactly one thread, if any are
waiting. However, it's not safe to rely on this behavior. A future,
optimized implementation may occasionally wake up more than one thread.
Note: the awakened thread does not actually return from its wait
call until it can reacquire the lock. Since notify does not
release the lock, its caller should.
notify_all()~
notifyAll()
Wake up all threads waiting on this condition. This method acts like
notify, but wakes up all waiting threads instead of one. If the
calling thread has not acquired the lock when this method is called, a
RuntimeError is raised.
Semaphore Objects
-----------------
This is one of the oldest synchronization primitives in the history of computer
science, invented by the early Dutch computer scientist Edsger W. Dijkstra (he
used P and V instead of acquire and release).
A semaphore manages an internal counter which is decremented by each
acquire call and incremented by each release call. The counter
can never go below zero; when acquire finds that it is zero, it blocks,
waiting until some other thread calls release.
Semaphore([value])~
The optional argument gives the initial {value} for the internal counter; it
defaults to ``1``. If the {value} given is less than 0, ValueError is
raised.
acquire([blocking])~
Acquire a semaphore.
When invoked without arguments: if the internal counter is larger than
zero on entry, decrement it by one and return immediately. If it is zero
on entry, block, waiting until some other thread has called
release to make it larger than zero. This is done with proper
interlocking so that if multiple acquire calls are blocked,
release will wake exactly one of them up. The implementation may
pick one at random, so the order in which blocked threads are awakened
should not be relied on. There is no return value in this case.
When invoked with {blocking} set to true, do the same thing as when called
without arguments, and return true.
When invoked with {blocking} set to false, do not block. If a call
without an argument would block, return false immediately; otherwise, do
the same thing as when called without arguments, and return true.
release()~
Release a semaphore, incrementing the internal counter by one. When it
was zero on entry and another thread is waiting for it to become larger
than zero again, wake up that thread.
Semaphore Example
^^^^^^^^^^^^^^^^^
Semaphores are often used to guard resources with limited capacity, for example,
a database server. In any situation where the size of the resource is
fixed, you should use a bounded semaphore. Before spawning any worker threads,
your main thread would initialize the semaphore:: >
from threading import BoundedSemaphore
maxconnections = 5
# ...
pool_sema = BoundedSemaphore(value=maxconnections)
<
Once spawned, worker threads call the semaphore's acquire and release methods
when they need to connect to the server:: >
pool_sema.acquire()
conn = connectdb()
# ... use connection ...
conn.close()
pool_sema.release()
<
The use of a bounded semaphore reduces the chance that a programming error which
causes the semaphore to be released more than it's acquired will go undetected.
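The kind of error a bounded semaphore catches can be sketched without any worker threads. The pool size of 2 is arbitrary.

```python
import threading

pool_sema = threading.BoundedSemaphore(value=2)
plain_sema = threading.Semaphore(value=2)

plain_sema.release()            # goes unnoticed: the counter is now 3

try:
    pool_sema.release()         # more releases than acquires
except ValueError:
    bug_detected = True
else:
    bug_detected = False

# The counter limits how many holders there can be at once.
first = pool_sema.acquire(False)
second = pool_sema.acquire(False)
third = pool_sema.acquire(False)    # counter is zero: would block
pool_sema.release()
pool_sema.release()
```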
Event Objects
-------------
This is one of the simplest mechanisms for communication between threads: one
thread signals an event and other threads wait for it.
An event object manages an internal flag that can be set to true with the
Event.set method and reset to false with the clear method. The
wait method blocks until the flag is true.
Event()~
The internal flag is initially false.
is_set()~
isSet()
Return true if and only if the internal flag is true.
.. versionchanged:: 2.6
The ``is_set()`` syntax is new.
set()~
Set the internal flag to true. All threads waiting for it to become true
are awakened. Threads that call wait once the flag is true will
not block at all.
clear()~
Reset the internal flag to false. Subsequently, threads calling
wait will block until set() is called to set the internal
flag to true again.
wait([timeout])~
Block until the internal flag is true. If the internal flag is true on
entry, return immediately. Otherwise, block until another thread calls
set() to set the flag to true, or until the optional timeout
occurs.
When the timeout argument is present and not ``None``, it should be a
floating point number specifying a timeout for the operation in seconds
(or fractions thereof).
This method returns the internal flag on exit, so it will always return
``True`` except if a timeout is given and the operation times out.
.. versionchanged:: 2.7
Previously, the method always returned ``None``.
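A minimal sketch of one thread waiting for a signal from another follows. The names ``ready``, ``results`` and ``worker`` are illustrative, and the return values of wait assume Python 2.7 or later:

```python
import threading

ready = threading.Event()
results = []

def worker():
    # Block until the flag is set, giving up after 5 seconds.
    results.append(ready.wait(5.0))

t = threading.Thread(target=worker)
t.start()
ready.set()                      # wake up every waiting thread
t.join()

# Waiting on an event that nobody sets runs into the timeout.
timed_out_result = threading.Event().wait(0.01)
```

Here ``results`` ends up holding the flag value seen by the worker, and ``timed_out_result`` holds the value returned after the timeout expired.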
Timer Objects
-------------
This class represents an action that should be run only after a certain amount
of time has passed: a timer. Timer is a subclass of Thread
and as such also functions as an example of creating custom threads.
Timers are started, as with threads, by calling their start method. The
timer can be stopped (before its action has begun) by calling the cancel
method. The interval the timer will wait before executing its action may not be
exactly the same as the interval specified by the user.
For example:: >
def hello():
    print "hello, world"
t = Timer(30.0, hello)
t.start() # after 30 seconds, "hello, world" will be printed
<
Timer(interval, function, args=[], kwargs={})~
Create a timer that will run {function} with arguments {args} and keyword
arguments {kwargs}, after {interval} seconds have passed.
cancel()~
Stop the timer, and cancel the execution of the timer's action. This will
only work if the timer is still in its waiting stage.
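Cancelling a timer during its waiting stage can be sketched as follows. The list ``fired`` exists only to record whether the action ran:

```python
import threading

fired = []

# Schedule fired.append("ran") to be called after 30 seconds.
t = threading.Timer(30.0, fired.append, args=["ran"])
t.start()

t.cancel()       # still waiting, so the action is dropped
t.join()         # the timer thread ends promptly after cancel()
```

Because the timer was cancelled before the interval elapsed, the action is never called and ``fired`` stays empty.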
Using locks, conditions, and semaphores in the with statement
-------------------------------------------------------------
All of the objects provided by this module that have acquire and
release methods can be used as context managers for a with
statement. The acquire method will be called when the block is entered,
and release will be called when the block is exited.
Currently, Lock, RLock, Condition,
Semaphore, and BoundedSemaphore objects may be used as
with statement context managers. For example:: >
import threading
some_rlock = threading.RLock()
with some_rlock:
    print "some_rlock is locked while this executes"
<
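The release on exit also happens when the block is left through an exception. The following sketch probes a plain Lock with non-blocking acquire calls; the variable names are illustrative:

```python
import threading

lock = threading.Lock()

try:
    with lock:
        # The lock is held here, so a non-blocking acquire fails.
        held_inside = not lock.acquire(False)
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

# The with statement released the lock despite the exception.
released_after = lock.acquire(False)
lock.release()
```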
Importing in threaded code
--------------------------
While the import machinery is thread safe, there are two key
restrictions on threaded imports due to inherent limitations in the way
that thread safety is provided:
* Firstly, other than in the main module, an import should not have the
side effect of spawning a new thread and then waiting for that thread in
any way. Failing to abide by this restriction can lead to a deadlock if
the spawned thread directly or indirectly attempts to import a module.
* Secondly, all import attempts must be completed before the interpreter
starts shutting itself down. This can be most easily achieved by only
performing imports from non-daemon threads created through the threading
module. Daemon threads and threads created directly with the thread
module will require some other form of synchronization to ensure they do
not attempt imports after system shutdown has commenced. Failure to
abide by this restriction will lead to intermittent exceptions and
crashes during interpreter shutdown (as the late imports attempt to
access machinery which is no longer in a valid state).
==============================================================================
*py2stdlib-time*
time~
:synopsis: Time access and conversions.
This module provides various time-related functions. For related
functionality, see also the datetime (|py2stdlib-datetime|) and calendar (|py2stdlib-calendar|) modules.
Although this module is always available,
not all functions are available on all platforms. Most of the functions
defined in this module call platform C library functions with the same name. It
may sometimes be helpful to consult the platform documentation, because the
semantics of these functions vary among platforms.
An explanation of some terminology and conventions is in order.
* The epoch is the point where the time starts. On January 1st of that
year, at 0 hours, the "time since the epoch" is zero. For Unix, the epoch is
1970. To find out what the epoch is, look at ``gmtime(0)``.
* The functions in this module do not handle dates and times before the epoch or
far in the future. The cut-off point in the future is determined by the C
library; for Unix, it is typically in 2038.
* Year 2000 (Y2K) issues: Python depends on the platform's C library, which
generally doesn't have year 2000 issues, since all dates and times are
represented internally as seconds since the epoch. Functions accepting a
struct_time (see below) generally require a 4-digit year. For backward
compatibility, 2-digit years are supported if the module variable
``accept2dyear`` is a non-zero integer; this variable is initialized to ``1``
unless the environment variable PYTHONY2K is set to a non-empty
string, in which case it is initialized to ``0``. Thus, you can set
PYTHONY2K to a non-empty string in the environment to require 4-digit
years for all year input. When 2-digit years are accepted, they are converted
according to the POSIX or X/Open standard: values 69-99 are mapped to 1969-1999,
and values 0-68 are mapped to 2000-2068. Values 100-1899 are always illegal.
Note that this is new as of Python 1.5.2(a2); earlier versions, up to Python
1.5.1 and 1.5.2a1, would add 1900 to year values below 1900.
* UTC is Coordinated Universal Time (formerly known as Greenwich Mean Time, or
GMT). The acronym UTC is not a mistake but a compromise between English and
French.
* DST is Daylight Saving Time, an adjustment of the timezone by (usually) one
hour during part of the year. DST rules are magic (determined by local law) and
can change from year to year. The C library has a table containing the local
rules (often it is read from a system file for flexibility) and is the only
source of True Wisdom in this respect.
* The precision of the various real-time functions may be less than suggested by
the units in which their value or argument is expressed. E.g. on most Unix
systems, the clock "ticks" only 50 or 100 times a second.
* On the other hand, the precision of time (|py2stdlib-time|) and sleep is better
than their Unix equivalents: times are expressed as floating point numbers,
time (|py2stdlib-time|) returns the most accurate time available (using Unix
gettimeofday where available), and sleep will accept a time
with a nonzero fraction (Unix select (|py2stdlib-select|) is used to implement this, where
available).
* The time value as returned by gmtime, localtime, and
strptime, and accepted by asctime, mktime and
strftime, may be considered as a sequence of 9 integers. The return
values of gmtime, localtime, and strptime also offer
attribute names for individual fields.
+-------+-------------------+---------------------------------+
| Index | Attribute         | Values                          |
+=======+===================+=================================+
| 0     | tm_year           | (for example, 1993)             |
+-------+-------------------+---------------------------------+
| 1     | tm_mon            | range [1, 12]                   |
+-------+-------------------+---------------------------------+
| 2     | tm_mday           | range [1, 31]                   |
+-------+-------------------+---------------------------------+
| 3     | tm_hour           | range [0, 23]                   |
+-------+-------------------+---------------------------------+
| 4     | tm_min            | range [0, 59]                   |
+-------+-------------------+---------------------------------+
| 5     | tm_sec            | range [0, 61]; see (2) in       |
|       |                   | strftime description            |
+-------+-------------------+---------------------------------+
| 6     | tm_wday           | range [0, 6], Monday is 0       |
+-------+-------------------+---------------------------------+
| 7     | tm_yday           | range [1, 366]                  |
+-------+-------------------+---------------------------------+
| 8     | tm_isdst          | 0, 1 or -1; see below           |
+-------+-------------------+---------------------------------+
Note that unlike the C structure, the month value is a range of [1, 12],
not [0, 11].
A year value will be handled as described under "Year 2000 (Y2K) issues" above.
Passing ``-1`` as the daylight savings flag to mktime will usually result in
the correct daylight savings state being filled in.
When a tuple with an incorrect length, or with elements of the wrong type, is
passed to a function expecting a struct_time, a TypeError is raised.
.. versionchanged:: 2.2
The time value sequence was changed from a tuple to a struct_time, with
the addition of attribute names for the fields.
* Use the following functions to convert between time representations:
+-------------------------+-------------------------+-------------------------+
| From                    | To                      | Use                     |
+=========================+=========================+=========================+
| seconds since the epoch | struct_time in          | gmtime                  |
|                         | UTC                     |                         |
+-------------------------+-------------------------+-------------------------+
| seconds since the epoch | struct_time in          | localtime               |
|                         | local time              |                         |
+-------------------------+-------------------------+-------------------------+
| struct_time in          | seconds since the epoch | calendar.timegm         |
| UTC                     |                         |                         |
+-------------------------+-------------------------+-------------------------+
| struct_time in          | seconds since the epoch | mktime                  |
| local time              |                         |                         |
+-------------------------+-------------------------+-------------------------+
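The four conversions can be exercised in one round trip. This is a sketch: the sample date is arbitrary and the variable names are illustrative:

```python
import calendar
import time

# 28 June 2001, 14:17:15 UTC, a Thursday and day 179 of the year.
utc_tuple = (2001, 6, 28, 14, 17, 15, 3, 179, 0)

secs = calendar.timegm(utc_tuple)   # struct_time in UTC -> seconds
as_utc = time.gmtime(secs)          # seconds -> struct_time in UTC
as_local = time.localtime(secs)     # seconds -> struct_time in local time
back = time.mktime(as_local)        # struct_time in local time -> seconds

# Fields are reachable by index and by attribute name alike.
year_by_index, year_by_name = as_utc[0], as_utc.tm_year
```

Both paths return to the same number of seconds since the epoch, whatever the local timezone is.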
The module defines the following functions and data items:
accept2dyear~
Boolean value indicating whether two-digit year values will be accepted. This
is true by default, but will be set to false if the environment variable
PYTHONY2K has been set to a non-empty string. It may also be modified
at run time.
altzone~
The offset of the local DST timezone, in seconds west of UTC, if one is defined.
This is negative if the local DST timezone is east of UTC (as in Western Europe,
including the UK). Only use this if ``daylight`` is nonzero.
asctime([t])~
Convert a tuple or struct_time representing a time as returned by
gmtime or localtime to a 24-character string of the following
form: ``'Sun Jun 20 23:21:05 1993'``. If {t} is not provided, the current time
as returned by localtime is used. Locale information is not used by
asctime.
.. note:: >
Unlike the C function of the same name, there is no trailing newline.
<
.. versionchanged:: 2.1
Allowed {t} to be omitted.
clock()~
On Unix, return the current processor time as a floating point number expressed
in seconds. The precision, and in fact the very definition of the meaning of
"processor time", depends on that of the C function of the same name, but in any
case, this is the function to use for benchmarking Python or timing algorithms.
On Windows, this function returns wall-clock seconds elapsed since the first
call to this function, as a floating point number, based on the Win32 function
QueryPerformanceCounter. The resolution is typically better than one
microsecond.
ctime([secs])~
Convert a time expressed in seconds since the epoch to a string representing
local time. If {secs} is not provided or None, the current time as
returned by time (|py2stdlib-time|) is used. ``ctime(secs)`` is equivalent to
``asctime(localtime(secs))``. Locale information is not used by ctime.
.. versionchanged:: 2.1
Allowed {secs} to be omitted.
.. versionchanged:: 2.4
If {secs} is None, the current time is used.
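The relationship between asctime and ctime can be checked directly. In this sketch the timestamp ``993737835`` is an arbitrary sample value:

```python
import time

# Format the epoch itself; the result has 24 characters and,
# unlike the C function, no trailing newline.
stamp = time.asctime(time.gmtime(0))

# ctime(secs) is shorthand for asctime(localtime(secs)).
secs = 993737835
same = time.ctime(secs) == time.asctime(time.localtime(secs))
```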
daylight~
Nonzero if a DST timezone is defined.
gmtime([secs])~
Convert a time expressed in seconds since the epoch to a struct_time in
UTC in which the dst flag is always zero. If {secs} is not provided or
None, the current time as returned by time (|py2stdlib-time|) is used. Fractions
of a second are ignored. See above for a description of the
struct_time object. See calendar.timegm for the inverse of this
function.
.. versionchanged:: 2.1
Allowed {secs} to be omitted.
.. versionchanged:: 2.4
If {secs} is None, the current time is used.
localtime([secs])~
Like gmtime but converts to local time. If {secs} is not provided or
None, the current time as returned by time (|py2stdlib-time|) is used. The dst
flag is set to ``1`` when DST applies to the given time.
.. versionchanged:: 2.1
Allowed {secs} to be omitted.
.. versionchanged:: 2.4
If {secs} is None, the current time is used.
mktime(t)~
This is the inverse function of localtime. Its argument is the
struct_time or full 9-tuple (since the dst flag is needed; use ``-1``
as the dst flag if it is unknown) which expresses the time in {local} time, not
UTC. It returns a floating point number, for compatibility with time (|py2stdlib-time|).
If the input value cannot be represented as a valid time, either
OverflowError or ValueError will be raised (which depends on
whether the invalid value is caught by Python or the underlying C libraries).
The earliest date for which it can generate a time is platform-dependent.
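As a sketch of the ``-1`` convention, the tuple below describes a local wall-clock time and leaves the weekday, day of year and dst flag unspecified. The date is an arbitrary sample:

```python
import time

# Local time 2001-06-28 14:17:15; weekday and yearday are left at 0
# and the dst flag is -1 because it is unknown.
wall_clock = (2001, 6, 28, 14, 17, 15, 0, 0, -1)

secs = time.mktime(wall_clock)      # float seconds since the epoch
roundtrip = time.localtime(secs)    # fills in the remaining fields
```

Converting back with localtime reproduces the date and time fields and supplies the weekday and day of the year, which here were only placeholders on input.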
sleep(secs)~
Suspend execution for the given number of seconds. The argument may be a
floating point number to indicate a more precise sleep time. The actual
suspension time may be less than that requested because any caught signal will
terminate the sleep following execution of that signal's catching
routine. Also, the suspension time may be longer than requested by an arbitrary
amount because of the scheduling of other activity in the system.
strftime(format[, t])~
Convert a tuple or struct_time representing a time as returned by
gmtime or localtime to a string as specified by the {format}
argument. If {t} is not provided, the current time as returned by
localtime is used. {format} must be a string. ValueError is
raised if any field in {t} is outside of the allowed range.
.. versionchanged:: 2.1
Allowed {t} to be omitted.
.. versionchanged:: 2.4
ValueError raised if a field in {t} is out of range.
.. versionchanged:: 2.5
0 is now a legal argument for any position in the time tuple; if it is normally
illegal the value is forced to a correct one.
The following directives can be embedded in the {format} string. They are shown
without the optional field width and precision specification, and are replaced
by the indicated characters in the strftime result:
+-----------+--------------------------------+-------+
| Directive | Meaning                        | Notes |
+===========+================================+=======+
| ``%a``    | Locale's abbreviated weekday   |       |
|           | name.                          |       |
+-----------+--------------------------------+-------+
| ``%A``    | Locale's full weekday name.    |       |
+-----------+--------------------------------+-------+
| ``%b``    | Locale's abbreviated month     |       |
|           | name.                          |       |
+-----------+--------------------------------+-------+
| ``%B``    | Locale's full month name.      |       |
+-----------+--------------------------------+-------+
| ``%c``    | Locale's appropriate date and  |       |
|           | time representation.           |       |
+-----------+--------------------------------+-------+
| ``%d``    | Day of the month as a decimal  |       |
|           | number [01,31].                |       |
+-----------+--------------------------------+-------+
| ``%H``    | Hour (24-hour clock) as a      |       |
|           | decimal number [00,23].        |       |
+-----------+--------------------------------+-------+
| ``%I``    | Hour (12-hour clock) as a      |       |
|           | decimal number [01,12].        |       |
+-----------+--------------------------------+-------+
| ``%j``    | Day of the year as a decimal   |       |
|           | number [001,366].              |       |
+-----------+--------------------------------+-------+
| ``%m``    | Month as a decimal number      |       |
|           | [01,12].                       |       |
+-----------+--------------------------------+-------+
| ``%M``    | Minute as a decimal number     |       |
|           | [00,59].                       |       |
+-----------+--------------------------------+-------+
| ``%p``    | Locale's equivalent of either  | (1)   |
|           | AM or PM.                      |       |
+-----------+--------------------------------+-------+
| ``%S``    | Second as a decimal number     | (2)   |
|           | [00,61].                       |       |
+-----------+--------------------------------+-------+
| ``%U``    | Week number of the year        | (3)   |
|           | (Sunday as the first day of    |       |
|           | the week) as a decimal number  |       |
|           | [00,53]. All days in a new     |       |
|           | year preceding the first       |       |
|           | Sunday are considered to be in |       |
|           | week 0.                        |       |
+-----------+--------------------------------+-------+
| ``%w``    | Weekday as a decimal number    |       |
|           | [0(Sunday),6].                 |       |
+-----------+--------------------------------+-------+
| ``%W``    | Week number of the year        | (3)   |
|           | (Monday as the first day of    |       |
|           | the week) as a decimal number  |       |
|           | [00,53]. All days in a new     |       |
|           | year preceding the first       |       |
|           | Monday are considered to be in |       |
|           | week 0.                        |       |
+-----------+--------------------------------+-------+
| ``%x``    | Locale's appropriate date      |       |
|           | representation.                |       |
+-----------+--------------------------------+-------+
| ``%X``    | Locale's appropriate time      |       |
|           | representation.                |       |
+-----------+--------------------------------+-------+
| ``%y``    | Year without century as a      |       |
|           | decimal number [00,99].        |       |
+-----------+--------------------------------+-------+
| ``%Y``    | Year with century as a decimal |       |
|           | number.                        |       |
+-----------+--------------------------------+-------+
| ``%Z``    | Time zone name (no characters  |       |
|           | if no time zone exists).       |       |
+-----------+--------------------------------+-------+
| ``%%``    | A literal ``'%'`` character.   |       |
+-----------+--------------------------------+-------+
Notes:
(1)
When used with the strptime function, the ``%p`` directive only affects
the output hour field if the ``%I`` directive is used to parse the hour.
(2)
The range really is ``0`` to ``61``; this accounts for leap seconds and the
(very rare) double leap seconds.
(3)
When used with the strptime function, ``%U`` and ``%W`` are only used in
calculations when the day of the week and the year are specified.
Here is an example, a format for dates compatible with that specified in the
RFC 2822 Internet email standard. [#]_ :: >
>>> from time import gmtime, strftime
>>> strftime("%a, %d %b %Y %H:%M:%S +0000", gmtime())
'Thu, 28 Jun 2001 14:17:15 +0000'
<
Additional directives may be supported on certain platforms, but only the ones
listed here have a meaning standardized by ANSI C.
On some platforms, an optional field width and precision specification can
immediately follow the initial ``'%'`` of a directive in the following order;
this is also not portable. The field width is normally 2 except for ``%j`` where
it is 3.
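The numeric directives do not depend on the locale, so their output can be shown for a fixed time value. This sketch formats the epoch in UTC; the variable names are illustrative:

```python
import time

t = time.gmtime(0)      # 1 January 1970, 00:00:00 UTC, a Thursday

iso_like = time.strftime("%Y-%m-%d %H:%M:%S", t)
day_of_year = time.strftime("%j", t)     # zero-padded to three digits
weekday = time.strftime("%w", t)         # Sunday is 0, so Thursday is 4
literal = time.strftime("100%%", t)      # %% yields a single '%'
```

The results are ``'1970-01-01 00:00:00'``, ``'001'``, ``'4'`` and ``'100%'`` respectively.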
strptime(string[, format])~
Parse a string representing a time according to a format. The return value is
a struct_time as returned by gmtime or localtime.
The {format} parameter uses the same directives as those used by
strftime; it defaults to ``"%a %b %d %H:%M:%S %Y"`` which matches the
formatting returned by ctime. If {string} cannot be parsed according to
{format}, or if it has excess data after parsing, ValueError is raised.
The default values used to fill in any missing data when more accurate values
cannot be inferred are ``(1900, 1, 1, 0, 0, 0, 0, 1, -1)``.
For example:
>>> import time
>>> time.strptime("30 Nov 00", "%d %b %y") # doctest: +NORMALIZE_WHITESPACE
time.struct_time(tm_year=2000, tm_mon=11, tm_mday=30, tm_hour=0, tm_min=0,
tm_sec=0, tm_wday=3, tm_yday=335, tm_isdst=-1)
Support for the ``%Z`` directive is based on the values contained in ``tzname``
and whether ``daylight`` is true. Because of this, it is platform-specific
except for recognizing UTC and GMT which are always known (and are considered to
be non-daylight savings timezones).
Only the directives specified in the documentation are supported. Because
``strftime()`` is implemented per platform it can sometimes offer more
directives than those listed. But ``strptime()`` is independent of any platform
and thus does not necessarily support all directives available that are not
documented as supported.
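The following sketch shows inferred fields, the documented default values, and the error raised for excess data. Only locale-independent directives are used, and the input strings are arbitrary samples:

```python
import time

# Weekday and day of the year are inferred from the parsed date;
# the time fields default to zero and the dst flag to -1.
parsed = time.strptime("2001-06-28", "%Y-%m-%d")

# With only an hour given, everything else takes the default values.
defaults = tuple(time.strptime("12", "%H"))

# Unparsed trailing data is an error.
try:
    time.strptime("2001-06-28 extra", "%Y-%m-%d")
    rejected = False
except ValueError:
    rejected = True
```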
struct_time~
The type of the time value sequence returned by gmtime,
localtime, and strptime.
.. versionadded:: 2.2
time()~
Return the time as a floating point number expressed in seconds since the epoch,
in UTC. Note that even though the time is always returned as a floating point
number, not all systems provide time with a better precision than 1 second.
While this function normally returns non-decreasing values, it can return a
lower value than a previous call if the system clock has been set back between
the two calls.
timezone~
The offset of the local (non-DST) timezone, in seconds west of UTC (negative in
most of Western Europe, positive in the US, zero in the UK).
tzname~
A tuple of two strings: the first is the name of the local non-DST timezone, the
second is the name of the local DST timezone. If no DST timezone is defined,
the second string should not be used.
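One way to combine ``daylight``, ``altzone``, ``timezone`` and ``tzname`` is sketched below. It picks the offset currently in effect and formats it with the usual east-positive sign; the variable names and the ``+HH:MM`` format are choices made for this example:

```python
import time

now = time.localtime()

# altzone and tzname[1] are only meaningful when a DST zone is
# defined and DST applies to the time in question.
if time.daylight and now.tm_isdst > 0:
    seconds_west = time.altzone
    zone_name = time.tzname[1]
else:
    seconds_west = time.timezone
    zone_name = time.tzname[0]

# The module counts seconds west of UTC, so the sign is flipped
# to obtain the conventional offset east of UTC.
sign = "-" if seconds_west > 0 else "+"
hours, minutes = divmod(abs(seconds_west) // 60, 60)
utc_offset = "%s%02d:%02d" % (sign, hours, minutes)
```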
tzset()~
Resets the time conversion rules used by the library routines. The environment
variable TZ specifies how this is done.
.. versionadded:: 2.3
Availability: Unix.
.. note:: >
Although in many cases, changing the TZ environment variable may
affect the output of functions like localtime without calling
tzset, this behavior should not be relied on.
The TZ environment variable should contain no whitespace.
<
The standard format of the TZ environment variable is (whitespace
added for clarity):: >
std offset [dst [offset [,start[/time], end[/time]]]]
<
Where the components are:
``std`` and ``dst``
Three or more alphanumerics giving the timezone abbreviations. These will be
propagated into time.tzname.
``offset``
The offset has the form: ``± hh[:mm[:ss]]``. This indicates the value
added to the local time to arrive at UTC. If preceded by a '-', the timezone
is east of the Prime Meridian; otherwise, it is west. If no offset follows
dst, summer time is assumed to be one hour ahead of standard time.
``start[/time], end[/time]``
Indicates when to change to and back from DST. The format of the
start and end dates is one of the following:
J{n}
The Julian day {n} (1 <= {n} <= 365). Leap days are not counted, so in
all years February 28 is day 59 and March 1 is day 60.
{n}
The zero-based Julian day (0 <= {n} <= 365). Leap days are counted, and
it is possible to refer to February 29.
M{m}.{n}.{d}
The {d}'th day (0 <= {d} <= 6) of week {n} of month {m} of the year (1
<= {n} <= 5, 1 <= {m} <= 12, where week 5 means "the last {d} day in
month {m}" which may occur in either the fourth or the fifth
week). Week 1 is the first week in which the {d}'th day occurs. Day
zero is Sunday.
``time`` has the same format as ``offset`` except that no leading sign
('-' or '+') is allowed. The default, if time is not given, is 02:00:00.
:: >
>>> import os, time
>>> os.environ['TZ'] = 'EST+05EDT,M4.1.0,M10.5.0'
>>> time.tzset()
>>> time.strftime('%X %x %Z')
'02:07:36 05/08/03 EDT'
>>> os.environ['TZ'] = 'AEST-10AEDT-11,M10.5.0,M3.5.0'
>>> time.tzset()
>>> time.strftime('%X %x %Z')
'16:08:12 05/08/03 AEST'
<
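The effect of a TZ string on the module's data items can also be inspected directly. This sketch assumes a Unix platform, guards the call with ``hasattr``, and restores the previous setting afterwards; the variable names are illustrative:

```python
import os
import time

if hasattr(time, "tzset"):           # tzset() is only available on Unix
    saved = os.environ.get("TZ")

    # Standard time EST, 5 hours west of UTC; summer time EDT,
    # by default one hour ahead of standard time.
    os.environ["TZ"] = "EST+05EDT,M4.1.0,M10.5.0"
    time.tzset()
    observed = (time.timezone, time.altzone, time.tzname)

    # Put the original timezone settings back.
    if saved is None:
        del os.environ["TZ"]
    else:
        os.environ["TZ"] = saved
    time.tzset()
else:
    observed = None
```

On a platform that supports it, ``observed`` should hold 18000 and 14400 seconds west of UTC and the names ``('EST', 'EDT')``.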
On many Unix systems (including \*BSD, Linux, Solaris, and Darwin), it is more
convenient to use the system's zoneinfo (tzfile(5)) database to
specify the timezone rules. To do this, set the TZ environment
variable to the path of the required timezone datafile, relative to the root of
the system's 'zoneinfo' timezone database, usually located at
/usr/share/zoneinfo. For example, ``'US/Eastern'``,
``'Australia/Melbourne'``, ``'Egypt'`` or ``'Europe/Amsterdam'``. :: >
>>> os.environ['TZ'] = 'US/Eastern'
>>> time.tzset()
>>> time.tzname
('EST', 'EDT')
>>> os.environ['TZ'] = 'Egypt'
>>> time.tzset()
>>> time.tzname
('EET', 'EEST')
<
.. seealso::
Module datetime (|py2stdlib-datetime|)
More object-oriented interface to dates and times.
Module locale (|py2stdlib-locale|)
Internationalization services. The locale settings can affect the return values
for some of the functions in the time (|py2stdlib-time|) module.
Module calendar (|py2stdlib-calendar|)
General calendar-related functions. timegm is the inverse of
gmtime from this module.
.. rubric:: Footnotes
.. [#] The use of ``%Z`` is now deprecated, but the ``%z`` escape that expands to the
preferred hour/minute offset is not supported by all ANSI C libraries. Also, a
strict reading of the original 1982 RFC 822 standard calls for a two-digit
year (%y rather than %Y), but practice moved to 4-digit years long before the
year 2000. The 4-digit year has been mandated by RFC 2822, which obsoletes
RFC 822.
==============================================================================
*py2stdlib-timeit*
timeit~
:synopsis: Measure the execution time of small code snippets.
.. versionadded:: 2.3
This module provides a simple way to time small bits of Python code. It has both
command line as well as callable interfaces. It avoids a number of common traps
for measuring execution times. See also Tim Peters' introduction to the
"Algorithms" chapter in the Python Cookbook, published by O'Reilly.
The module defines the following public class:
Timer([stmt='pass' [, setup='pass' [, timer=<timer function>]]])~
Class for timing execution speed of small code snippets.
The constructor takes a statement to be timed, an additional statement used for
setup, and a timer function. Both statements default to ``'pass'``; the timer
function is platform-dependent (see the module doc string). {stmt} and {setup}
may also contain multiple statements separated by ``;`` or newlines, as long as
they don't contain multi-line string literals.
To measure the execution time of the first statement, use the timeit (|py2stdlib-timeit|)
method. The repeat method is a convenience to call timeit (|py2stdlib-timeit|)
multiple times and return a list of results.
.. versionchanged:: 2.6
The {stmt} and {setup} parameters can now also take objects that are callable
without arguments. This will embed calls to them in a timer function that will
then be executed by timeit (|py2stdlib-timeit|). Note that the timing overhead is a little
larger in this case because of the extra function calls.
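Both ways of supplying the statement are sketched below, assuming Python 2.6 or later for the callable form. The names ``data`` and ``sort_copy`` and the loop count of 1000 are chosen for illustration:

```python
import timeit

data = list(range(1000, 0, -1))

def sort_copy():
    return sorted(data)

# A callable taking no arguments can be timed directly.
callable_timer = timeit.Timer(sort_copy)
callable_elapsed = callable_timer.timeit(number=1000)

# A string may hold several statements separated by ';'.
string_timer = timeit.Timer("y = x * 2; z = y + 1", setup="x = 21")
string_elapsed = string_timer.timeit(number=1000)
```

Each result is the total time in seconds for all 1000 executions, not the time per execution.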
Timer.print_exc([file=None])~
Helper to print a traceback from the timed code.
Typical use:: >
t = Timer(...) # outside the try/except
try:
    t.timeit(...) # or t.repeat(...)
except:
    t.print_exc()
<
The advantage over the standard traceback is that source lines in the compiled
template will be displayed. The optional {file} argument directs where the
traceback is sent; it defaults to ``sys.stderr``.
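A concrete variant of the pattern follows, with the traceback redirected into a string buffer through the {file} argument. The failing statement ``1/0`` and the buffer are illustrative, and the import fallback only serves to obtain a text buffer on both Python 2 and Python 3:

```python
import timeit

try:
    from StringIO import StringIO    # Python 2
except ImportError:
    from io import StringIO          # Python 3

buf = StringIO()
t = timeit.Timer("1/0")              # statement that always fails

try:
    t.timeit(number=1)
except ZeroDivisionError:
    t.print_exc(file=buf)            # write the traceback to buf

report = buf.getvalue()
```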
Timer.repeat([repeat=3 [, number=1000000]])~
Call timeit (|py2stdlib-timeit|) a few times.
This is a convenience function that calls timeit (|py2stdlib-timeit|) repeatedly,
returning a list of results. The first argument specifies how many times to
call timeit (|py2stdlib-timeit|). The second argument specifies the {number} argument for
timeit (|py2stdlib-timeit|).
.. note:: >
It's tempting to calculate mean and standard deviation from the result vector
and report these. However, this is not very useful. In a typical case, the
lowest value gives a lower bound for how fast your machine can run the given
code snippet; higher values in the result vector are typically not caused by
variability in Python's speed, but by other processes interfering with your
timing accuracy. So the min of the result is probably the only number
you should be interested in. After that, you should look at the entire vector
and apply common sense rather than statistics.
<
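Following that advice, a measurement might be reduced to its minimum like this. It is a sketch; the statement, the setup code and the counts are arbitrary:

```python
import timeit

t = timeit.Timer("sorted(data)",
                 setup="data = list(range(1000, 0, -1))")

number = 1000
results = t.repeat(repeat=3, number=number)   # three totals, in seconds

best = min(results)                           # least disturbed run
usec_per_loop = best / number * 1e6
```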
Timer.timeit([number=1000000])~
Time {number} executions of the main statement. This executes the setup
statement once, and then returns the time it takes to execute the main statement
a number of times, measured in seconds as a float. The argument is the number
of times through the loop, defaulting to one million. The main statement, the
setup statement and the timer function to be used are passed to the constructor.
.. note:: >
By default, timeit (|py2stdlib-timeit|) temporarily turns off garbage collection
during the timing. The advantage of this approach is that it makes
independent timings more comparable. The disadvantage is that GC may be
an important component of the performance of the function being measured.
If so, GC can be re-enabled as the first statement in the {setup} string.
For example::
timeit.Timer('for i in xrange(10): oct(i)', 'gc.enable()').timeit()
<
Starting with version 2.6, the module also defines two convenience functions:
repeat(stmt[, setup[, timer[, repeat=3 [, number=1000000]]]])~
Create a Timer instance with the given statement, setup code and timer
function and run its repeat method with the given repeat count and
{number} executions.
.. versionadded:: 2.6
timeit(stmt[, setup[, timer[, number=1000000]]])~
Create a Timer instance with the given statement, setup code and timer
function and run its timeit (|py2stdlib-timeit|) method with {number} executions.
.. versionadded:: 2.6
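For quick measurements the two functions avoid creating a Timer by hand. A short sketch, with an arbitrary statement and small loop counts:

```python
import timeit

stmt = "'-'.join(map(str, range(100)))"

# One total time in seconds for 1000 executions.
single = timeit.timeit(stmt, number=1000)

# Three such totals, as a list.
several = timeit.repeat(stmt, repeat=3, number=1000)
```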
Command Line Interface
----------------------
When called as a program from the command line, the following form is used:: >
python -m timeit [-n N] [-r N] [-s S] [-t] [-c] [-h] [statement ...]
<
where the following options are understood:
-n N/--number=N
how many times to execute 'statement'
-r N/--repeat=N
how many times to repeat the timer (default 3)
-s S/--setup=S
statement to be executed once initially (default ``'pass'``)
-t/--time
use time.time (default on all platforms but Windows)
-c/--clock
use time.clock (default on Windows)
-v/--verbose
print raw timing results; repeat for more digits of precision
-h/--help
print a short usage message and exit
A multi-line statement may be given by specifying each line as a separate
statement argument; indented lines are possible by enclosing an argument in
quotes and using leading spaces. Multiple -s options are treated
similarly.
If -n is not given, a suitable number of loops is calculated by trying
successive powers of 10 until the total time is at least 0.2 seconds.
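The loop-count search described above can be sketched as follows. The helper ``choose_loop_count`` and its parameters are hypothetical names introduced for this example, not part of the module; ``measure`` stands for any callable that returns the total time for a given number of loops:

```python
def choose_loop_count(measure, threshold=0.2, max_exponent=9):
    """Return the first power of 10 whose total time reaches threshold.

    measure(number) must return the total time in seconds needed
    for `number` loops.
    """
    number = 10
    for exponent in range(1, max_exponent + 1):
        number = 10 ** exponent
        if measure(number) >= threshold:
            break
    return number

# Stand-in measurement: pretend that every loop costs one millisecond.
chosen = choose_loop_count(lambda number: number * 0.001)
```

With the stand-in measurement, 10 and 100 loops stay below 0.2 seconds and 1000 loops exceed it, so ``chosen`` is 1000. For a real statement, the bound method ``timeit.Timer(stmt).timeit`` could be passed as ``measure``.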
The default timer function is platform dependent. On Windows,
time.clock has microsecond granularity but time.time's
granularity is 1/60th of a second; on Unix, time.clock has 1/100th of a
second granularity and time.time is much more precise. On either
platform, the default timer functions measure wall clock time, not the CPU time.
This means that other processes running on the same computer may interfere with
the timing. The best thing to do when accurate timing is necessary is to repeat
the timing a few times and use the best time. The -r option is good
for this; the default of 3 repetitions is probably enough in most cases. On
Unix, you can use time.clock to measure CPU time.
.. note::
There is a certain baseline overhead associated with executing a pass statement.
The code here doesn't try to hide it, but you should be aware of it. The
baseline overhead can be measured by invoking the program without arguments.
The baseline overhead differs between Python versions! Also, to fairly compare
older Python versions to Python 2.3, you may want to use Python's -O
option for the older versions to avoid timing ``SET_LINENO`` instructions.
Examples
--------
Here are two example sessions (one using the command line, one using the module
interface) that compare the cost of using hasattr vs.
try/except to test for missing and present object
attributes. :: >
% timeit.py 'try:' ' str.__nonzero__' 'except AttributeError:' ' pass'
100000 loops, best of 3: 15.7 usec per loop
% timeit.py 'if hasattr(str, "__nonzero__"): pass'
100000 loops, best of 3: 4.26 usec per loop
% timeit.py 'try:' ' int.__nonzero__' 'except AttributeError:' ' pass'
1000000 loops, best of 3: 1.43 usec per loop
% timeit.py 'if hasattr(int, "__nonzero__"): pass'
100000 loops, best of 3: 2.23 usec per loop
<
::
>>> import timeit
>>> s = """\
... try:
...     str.__nonzero__
... except AttributeError:
...     pass
... """
>>> t = timeit.Timer(stmt=s)
>>> print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)
17.09 usec/pass
>>> s = """\
... if hasattr(str, '__nonzero__'): pass
... """
>>> t = timeit.Timer(stmt=s)
>>> print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)
4.85 usec/pass
>>> s = """\
... try:
...     int.__nonzero__
... except AttributeError:
...     pass
... """
>>> t = timeit.Timer(stmt=s)
>>> print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)
1.97 usec/pass
>>> s = """\
... if hasattr(int, '__nonzero__'): pass
... """
>>> t = timeit.Timer(stmt=s)
>>> print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)
3.15 usec/pass
To give the timeit (|py2stdlib-timeit|) module access to functions you define, you can pass a
``setup`` parameter which contains an import statement:: >
def test():
    "Stupid test function"
    L = []
    for i in range(100):
        L.append(i)
if __name__=='__main__':
    from timeit import Timer
    t = Timer("test()", "from __main__ import test")
    print t.timeit()
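When the function is not importable from ``__main__`` (for example, when the timing code itself runs inside another module), the ``setup`` string can define the function directly. This is a sketch under that assumption; the loop count of 1000 is arbitrary.

```python
import timeit

# The setup string is executed once before timing starts, so it can
# define the function instead of importing it from __main__.
setup = """
def test():
    "Stupid test function"
    L = []
    for i in range(100):
        L.append(i)
"""

t = timeit.Timer("test()", setup)
elapsed = t.timeit(number=1000)
print("%.2f usec/pass" % (1000000 * elapsed / 1000))
```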
==============================================================================
*py2stdlib-tix*
Tix~
:synopsis: Tk Extension Widgets for Tkinter
The Tix (|py2stdlib-tix|) (Tk Interface Extension) module provides an additional rich set
of widgets. Although the standard Tk library has many useful widgets, they are
far from complete. The Tix (|py2stdlib-tix|) library provides most of the commonly needed
widgets that are missing from standard Tk: HList, ComboBox,
Control (a.k.a. SpinBox) and an assortment of scrollable widgets.
Tix (|py2stdlib-tix|) also includes many more widgets that are generally useful in a wide
range of applications: NoteBook, FileEntry,
PanedWindow, etc; there are more than 40 of them.
With all these new widgets, you can introduce new interaction techniques into
applications, creating more useful and more intuitive user interfaces. You can
design your application by choosing the most appropriate widgets to match the
special needs of your application and users.
.. note::
Tix (|py2stdlib-tix|) has been renamed to tkinter.tix in Python 3.0. The
2to3 tool will automatically adapt imports when converting your
sources to 3.0.
.. seealso::
`Tix Homepage <http://tix.sourceforge.net/>`_
The home page for Tix (|py2stdlib-tix|). This includes links to additional documentation
and downloads.
`Tix Man Pages <http://tix.sourceforge.net/dist/current/man/>`_
On-line version of the man pages and reference material.
`Tix Programming Guide <http://tix.sourceforge.net/dist/current/docs/tix-book/tix.book.html>`_
On-line version of the programmer's reference material.
`Tix Development Applications <http://tix.sourceforge.net/Tixapps/src/Tide.html>`_
Tix applications for development of Tix and Tkinter programs. Tide applications
work under Tk or Tkinter, and include TixInspect, an inspector to
remotely modify and debug Tix/Tk/Tkinter applications.
Using Tix
---------
Tix(screenName[, baseName[, className]])~
Toplevel widget of Tix which usually represents the main window of an
application. It has an associated Tcl interpreter.
Classes in the Tix (|py2stdlib-tix|) module subclass the classes in the Tkinter (|py2stdlib-tkinter|)
module. The former imports the latter, so to use Tix (|py2stdlib-tix|) with Tkinter, all
you need to do is to import one module. In general, you can just import
Tix (|py2stdlib-tix|), and replace the toplevel call to Tkinter.Tk with
Tix.Tk:: >
import Tix
from Tkconstants import *
root = Tix.Tk()
<
To use Tix (|py2stdlib-tix|), you must have the Tix (|py2stdlib-tix|) widgets installed, usually
alongside your installation of the Tk widgets. To test your installation, try
the following:: >
import Tix
root = Tix.Tk()
root.tk.eval('package require Tix')
<
If this fails, you have a Tk installation problem which must be resolved before
proceeding. Use the environment variable TIX_LIBRARY to point to the
installed Tix (|py2stdlib-tix|) library directory, and make sure you have the dynamic
object library (tix8183.dll or libtix8183.so) in the same
directory that contains your Tk dynamic object library (tk8183.dll or
libtk8183.so). The directory with the dynamic object library should also
have a file called pkgIndex.tcl (case sensitive), which contains the
line:: >
package ifneeded Tix 8.1 [list load "[file join $dir tix8183.dll]" Tix]
<
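The pkgIndex.tcl requirement can be checked from Python without starting Tk. The sketch below looks for a ``package ifneeded Tix`` line in a given directory. The helper name ``find_tix_index`` is hypothetical, and the demonstration writes a throwaway pkgIndex.tcl into a temporary directory rather than inspecting a real installation.

```python
import os
import shutil
import tempfile

def find_tix_index(directory):
    """Return the 'package ifneeded Tix' lines in directory/pkgIndex.tcl.

    find_tix_index is a hypothetical helper for this sketch. It returns an
    empty list when the file is missing or contains no Tix entry.
    """
    index_path = os.path.join(directory, "pkgIndex.tcl")
    if not os.path.isfile(index_path):
        return []
    with open(index_path) as index_file:
        return [line.strip() for line in index_file
                if line.strip().startswith("package ifneeded Tix")]

# Demonstration against a throwaway directory, not a real Tk install.
demo_dir = tempfile.mkdtemp()
try:
    with open(os.path.join(demo_dir, "pkgIndex.tcl"), "w") as index_file:
        index_file.write(
            'package ifneeded Tix 8.1 '
            '[list load "[file join $dir tix8183.dll]" Tix]\n')
    found = find_tix_index(demo_dir)
finally:
    shutil.rmtree(demo_dir)

print(found)
```

On a real system, pass the directory that holds the Tix dynamic object library instead of the temporary directory used here.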
Tix Widgets
-----------
`Tix <http://tix.sourceforge.net/dist/current/man/html/TixCmd/TixIntro.htm>`_
introduces over 40 widget classes to the Tkinter (|py2stdlib-tkinter|) repertoire. There is a
demo of all the Tix (|py2stdlib-tix|) widgets in the Demo/tix directory of the
standard distribution.
Basic Widgets
^^^^^^^^^^^^^
Balloon()~
A `Balloon
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixBalloon.htm>`_ that
pops up over a widget to provide help. When the user moves the cursor inside a
widget to which a Balloon widget has been bound, a small pop-up window with a
descriptive message will be shown on the screen.
ButtonBox()~
The `ButtonBox
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixButtonBox.htm>`_
widget creates a box of buttons, such as is commonly used for ``Ok Cancel``.
ComboBox()~
The `ComboBox
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixComboBox.htm>`_
widget is similar to the combo box control in MS Windows. The user can select a
choice by either typing in the entry subwidget or selecting from the listbox
subwidget.
Control()~
The `Control
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixControl.htm>`_
widget is also known as the SpinBox widget. The user can adjust the
value by pressing the two arrow buttons or by entering the value directly into
the entry. The new value will be checked against the user-defined upper and
lower limits.
LabelEntry()~
The `LabelEntry
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixLabelEntry.htm>`_
widget packages an entry widget and a label into one mega widget. It can be
used to simplify the creation of "entry-form" type interfaces.
LabelFrame()~
The `LabelFrame
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixLabelFrame.htm>`_
widget packages a frame widget and a label into one mega widget. To create
widgets inside a LabelFrame widget, one creates the new widgets relative to the
frame subwidget and manages them inside the frame subwidget.
Meter()~
The `Meter
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixMeter.htm>`_ widget
can be used to show the progress of a background job which may take a long time
to execute.
OptionMenu()~
The `OptionMenu
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixOptionMenu.htm>`_
creates a menu button of options.
PopupMenu()~
The `PopupMenu
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixPopupMenu.htm>`_
widget can be used as a replacement for the ``tk_popup`` command. The advantage
of the Tix (|py2stdlib-tix|) PopupMenu widget is that it requires less application code
to manipulate.
Select()~
The `Select
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixSelect.htm>`_ widget
is a container of button subwidgets. It can be used to provide radio-box or
check-box style of selection options for the user.
StdButtonBox()~
The `StdButtonBox
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixStdButtonBox.htm>`_
widget is a group of standard buttons for Motif-like dialog boxes.
File Selectors
^^^^^^^^^^^^^^
DirList()~
The `DirList
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixDirList.htm>`_
widget displays a list view of a directory, its previous directories and its
sub-directories. The user can choose one of the directories displayed in the
list or change to another directory.
DirTree()~
The `DirTree
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixDirTree.htm>`_
widget displays a tree view of a directory, its previous directories and its
sub-directories. The user can choose one of the directories displayed in the
list or change to another directory.
DirSelectDialog()~
The `DirSelectDialog
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixDirSelectDialog.htm>`_
widget presents the directories in the file system in a dialog window. The user
can use this dialog window to navigate through the file system to select the
desired directory.
DirSelectBox()~
The DirSelectBox is similar to the standard Motif(TM)
directory-selection box. It is generally used for the user to choose a
directory. DirSelectBox stores the directories most recently selected into
a ComboBox widget so that they can be quickly selected again.
ExFileSelectBox()~
The `ExFileSelectBox
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixExFileSelectBox.htm>`_
widget is usually embedded in a tixExFileSelectDialog widget. It provides a
convenient method for the user to select files. The style of the
ExFileSelectBox widget is very similar to the standard file dialog on
MS Windows 3.1.
FileSelectBox()~
The `FileSelectBox
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixFileSelectBox.htm>`_
is similar to the standard Motif(TM) file-selection box. It is generally used
for the user to choose a file. FileSelectBox stores the files most recently
selected into a ComboBox widget so that they can be quickly selected
again.
FileEntry()~
The `FileEntry
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixFileEntry.htm>`_
widget can be used to input a filename. The user can type in the filename
manually. Alternatively, the user can press the button widget that sits next to
the entry, which will bring up a file selection dialog.
Hierarchical ListBox
^^^^^^^^^^^^^^^^^^^^
HList()~
The `HList
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixHList.htm>`_ widget
can be used to display any data that have a hierarchical structure, for example,
file system directory trees. The list entries are indented and connected by
branch lines according to their places in the hierarchy.
CheckList()~
The `CheckList
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixCheckList.htm>`_
widget displays a list of items to be selected by the user. CheckList acts
similarly to the Tk checkbutton or radiobutton widgets, except it is capable of
handling many more items than checkbuttons or radiobuttons.
Tree()~
The `Tree
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixTree.htm>`_ widget
can be used to display hierarchical data in a tree form. The user can adjust the
view of the tree by opening or closing parts of the tree.
Tabular ListBox
^^^^^^^^^^^^^^^
TList()~
The `TList
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixTList.htm>`_ widget
can be used to display data in a tabular format. The list entries of a
TList widget are similar to the entries in the Tk listbox widget. The
main differences are (1) the TList widget can display the list entries
in a two dimensional format and (2) you can use graphical images as well as
multiple colors and fonts for the list entries.
Manager Widgets
^^^^^^^^^^^^^^^
PanedWindow()~
The `PanedWindow
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixPanedWindow.htm>`_
widget allows the user to interactively manipulate the sizes of several panes.
The panes can be arranged either vertically or horizontally. The user changes
the sizes of the panes by dragging the resize handle between two panes.
ListNoteBook()~
The `ListNoteBook
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixListNoteBook.htm>`_
widget is very similar to the TixNoteBook widget: it can be used to
display many windows in a limited space using a notebook metaphor. The notebook
is divided into a stack of pages (windows). At one time only one of these pages
can be shown. The user can navigate through these pages by choosing the name of
the desired page in the hlist subwidget.
NoteBook()~
The `NoteBook
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixNoteBook.htm>`_
widget can be used to display many windows in a limited space using a notebook
metaphor. The notebook is divided into a stack of pages. At one time only one of
these pages can be shown. The user can navigate through these pages by choosing
the visual "tabs" at the top of the NoteBook widget.
Image Types
^^^^^^^^^^^
The Tix (|py2stdlib-tix|) module adds:
* `pixmap <http://tix.sourceforge.net/dist/current/man/html/TixCmd/pixmap.htm>`_
capabilities to all Tix (|py2stdlib-tix|) and Tkinter (|py2stdlib-tkinter|) widgets to create color images
from XPM files.
* `Compound
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/compound.htm>`_ image
types can be used to create images that consist of multiple horizontal lines;
each line is composed of a series of items (texts, bitmaps, images or spaces)
arranged from left to right. For example, a compound image can be used to
display a bitmap and a text string simultaneously in a Tk Button
widget.
Miscellaneous Widgets
^^^^^^^^^^^^^^^^^^^^^
InputOnly()~
The `InputOnly
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixInputOnly.htm>`_
widgets accept input from the user, which can be handled with the ``bind``
command (Unix only).
Form Geometry Manager
^^^^^^^^^^^^^^^^^^^^^
In addition, Tix (|py2stdlib-tix|) augments Tkinter (|py2stdlib-tkinter|) by providing:
Form()~
The `Form
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixForm.htm>`_ geometry
manager based on attachment rules for all Tk widgets.
Tix Commands
------------
tixCommand()~
The `tix commands
<http://tix.sourceforge.net/dist/current/man/html/TixCmd/tix.htm>`_ provide
access to miscellaneous elements of Tix (|py2stdlib-tix|)'s internal state and the
Tix (|py2stdlib-tix|) application context. Most of the information manipulated by these
methods pertains to the application as a whole, or to a screen or display,
rather than to a particular window.
To view the current settings, the common usage is:: >
import Tix
root = Tix.Tk()
print root.tix_configure()
<
tixCommand.tix_configure([cnf,] **kw)~
Query or modify the configuration options of the Tix application context. If no
option is specified, returns a dictionary of all the available options. If
option is specified with no value, then the method returns a list describing the
one named option (this list will be identical to the corresponding sublist of
the value returned if no option is specified). If one or more option-value
pairs are specified, then the method modifies the given option(s) to have the
given value(s); in this case the method returns an empty string. Option may be
any of the configuration options.
tixCommand.tix_cget(option)~
Returns the current value of the configuration option given by {option}. Option
may be any of the configuration options.
tixCommand.tix_getbitmap(name)~
Locates a bitmap file of the name ``name.xpm`` or ``name`` in one of the bitmap
directories (see the tix_addbitmapdir method). By using
tix_getbitmap, you can avoid hard coding the pathnames of the bitmap
files in your application. When successful, it returns the complete pathname of
the bitmap file, prefixed with the character ``@``. The returned value can be
used to configure the ``bitmap`` option of the Tk and Tix widgets.
tixCommand.tix_addbitmapdir(directory)~
Tix maintains a list of directories under which the tix_getimage and
tix_getbitmap methods will search for image files. The standard bitmap
directory is $TIX_LIBRARY/bitmaps. The tix_addbitmapdir method
adds {directory} into this list. By using this method, the image files of an
application can also be located using the tix_getimage or
tix_getbitmap method.
tixCommand.tix_filedialog([dlgclass])~
Returns the file selection dialog that may be shared among different calls from
this application. This method will create a file selection dialog widget when
it is called the first time. This dialog will be returned by all subsequent
calls to tix_filedialog. An optional dlgclass parameter can be passed
as a string to specify what type of file selection dialog widget is desired.
Possible options are ``tix``, ``FileSelectDialog`` or ``tixExFileSelectDialog``.
tixCommand.tix_getimage(name)~
Locates an image file of the name name.xpm, name.xbm or
name.ppm in one of the bitmap directories (see the
tix_addbitmapdir method above). If more than one file with the same name
(but different extensions) exist, then the image type is chosen according to the
depth of the X display: xbm images are chosen on monochrome displays and color
images are chosen on color displays. By using tix_getimage, you can
avoid hard coding the pathnames of the image files in your application. When
successful, this method returns the name of the newly created image, which can
be used to configure the ``image`` option of the Tk and Tix widgets.
tixCommand.tix_option_get(name)~
Gets the options maintained by the Tix scheme mechanism.
tixCommand.tix_resetoptions(newScheme, newFontSet[, newScmPrio])~
Resets the scheme and fontset of the Tix application to {newScheme} and
{newFontSet}, respectively. This affects only those widgets created after this
call. Therefore, it is best to call the tix_resetoptions method before the creation
of any widgets in a Tix application.
The optional parameter {newScmPrio} can be given to reset the priority level of
the Tk options set by the Tix schemes.
Because of the way Tk handles the X option database, after Tix has been
imported and initialized, it is not possible to reset the color schemes and font sets
using the tix_configure method. Instead, the tix_resetoptions
method must be used.
==============================================================================
*py2stdlib-tkinter*
Tkinter~
:synopsis: Interface to Tcl/Tk for graphical user interfaces
The Tkinter (|py2stdlib-tkinter|) module ("Tk interface") is the standard Python interface to
the Tk GUI toolkit. Both Tk and Tkinter (|py2stdlib-tkinter|) are available on most Unix
platforms, as well as on Windows systems. (Tk itself is not part of Python; it
is maintained at ActiveState.)
.. note::
Tkinter (|py2stdlib-tkinter|) has been renamed to tkinter in Python 3.0. The
2to3 tool will automatically adapt imports when converting your
sources to 3.0.
.. seealso::
`Python Tkinter Resources <http://www.python.org/topics/tkinter/>`_
The Python Tkinter Topic Guide provides a great deal of information on using Tk
from Python and links to other sources of information on Tk.
`An Introduction to Tkinter <http://www.pythonware.com/library/an-introduction-to-tkinter.htm>`_
Fredrik Lundh's on-line reference material.
`Tkinter reference: a GUI for Python <http://infohost.nmt.edu/tcc/help/pubs/lang.html>`_
On-line reference material.
`Python and Tkinter Programming <http://www.amazon.com/exec/obidos/ASIN/1884777813>`_
The book by John Grayson (ISBN 1-884777-81-3).
Tkinter Modules
---------------
Most of the time, the Tkinter (|py2stdlib-tkinter|) module is all you really need, but a number
of additional modules are available as well. The Tk interface is located in a
binary module named _tkinter. This module contains the low-level
interface to Tk, and should never be used directly by application programmers.
It is usually a shared library (or DLL), but might in some cases be statically
linked with the Python interpreter.
In addition to the Tk interface module, Tkinter (|py2stdlib-tkinter|) includes a number of
Python modules. The two most important modules are the Tkinter (|py2stdlib-tkinter|) module
itself, and a module called Tkconstants. The former automatically imports
the latter, so to use Tkinter, all you need to do is to import one module:: >
import Tkinter
<
Or, more often:: >
from Tkinter import *
<
Tk(screenName=None, baseName=None, className='Tk', useTk=1)~
The Tk class is instantiated without arguments. This creates a toplevel
widget of Tk which usually is the main window of an application. Each instance
has its own associated Tcl interpreter.
.. versionchanged:: 2.4
The {useTk} parameter was added.
Tcl(screenName=None, baseName=None, className='Tk', useTk=0)~
The Tcl function is a factory function which creates an object much like
that created by the Tk class, except that it does not initialize the Tk
subsystem. This is most often useful when driving the Tcl interpreter in an
environment where one doesn't want to create extraneous toplevel windows, or
where one cannot (such as Unix/Linux systems without an X server). An object
created by the Tcl function can have a Toplevel window created (and the Tk
subsystem initialized) by calling its loadtk method.
.. versionadded:: 2.4
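As a minimal sketch of driving a Tcl interpreter without any toplevel window, the helper below evaluates a script through an object returned by Tcl. The name ``tcl_eval`` is hypothetical, and returning None when the Tkinter module cannot be imported is a choice made for this example only. The fallback import covers the Python 3 module name.

```python
def tcl_eval(script):
    """Evaluate a Tcl script in an interpreter created without Tk.

    tcl_eval is a hypothetical helper for this sketch. It returns None
    when the Tkinter module is not installed.
    """
    try:
        try:
            from Tkinter import Tcl   # Python 2 module name
        except ImportError:
            from tkinter import Tcl   # Python 3 module name
    except ImportError:
        return None
    interp = Tcl()                    # no toplevel window is created
    return interp.eval(script)

result = tcl_eval("expr {6 * 7}")
print(result)
```

Because the Tk subsystem is never initialized, this also works on systems without an X server.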
Other modules that provide Tk support include:
ScrolledText
Text widget with a vertical scroll bar built in.
tkColorChooser
Dialog to let the user choose a color.
tkCommonDialog
Base class for the dialogs defined in the other modules listed here.
tkFileDialog
Common dialogs to allow the user to specify a file to open or save.
tkFont
Utilities to help work with fonts.
tkMessageBox
Access to standard Tk dialog boxes.
tkSimpleDialog
Basic dialogs and convenience functions.
Tkdnd
Drag-and-drop support for Tkinter (|py2stdlib-tkinter|). This is experimental and should become
deprecated when it is replaced with the Tk DND.
turtle
Turtle graphics in a Tk window.
These have been renamed as well in Python 3.0; they were all made submodules of
the new ``tkinter`` package.
Tkinter Life Preserver
----------------------
This section is not designed to be an exhaustive tutorial on either Tk or
Tkinter. Rather, it is intended as a stop gap, providing some introductory
orientation on the system.
Credits:
* Tkinter was written by Steen Lumholt and Guido van Rossum.
* Tk was written by John Ousterhout while at Berkeley.
* This Life Preserver was written by Matt Conway at the University of Virginia.
* The html rendering, and some liberal editing, was produced from a FrameMaker
version by Ken Manheimer.
* Fredrik Lundh elaborated and revised the class interface descriptions, to get
them current with Tk 4.2.
* Mike Clarkson converted the documentation to LaTeX, and compiled the User
Interface chapter of the reference manual.
How To Use This Section
^^^^^^^^^^^^^^^^^^^^^^^
This section is designed in two parts: the first half (roughly) covers
background material, while the second half can be taken to the keyboard as a
handy reference.
When trying to answer questions of the form "how do I do blah", it is often best
to find out how to do "blah" in straight Tk, and then convert this back into the
corresponding Tkinter (|py2stdlib-tkinter|) call. Python programmers can often guess at the
correct Python command by looking at the Tk documentation. This means that in
order to use Tkinter, you will have to know a little bit about Tk. This document
can't fulfill that role, so the best we can do is point you to the best
documentation that exists. Here are some hints:
* The authors strongly suggest getting a copy of the Tk man pages. Specifically,
the man pages in the ``mann`` directory are most useful. The ``man3`` man pages
describe the C interface to the Tk library and thus are not especially helpful
for script writers.
* Addison-Wesley publishes a book called Tcl and the Tk Toolkit by John
Ousterhout (ISBN 0-201-63337-X) which is a good introduction to Tcl and Tk for
the novice. The book is not exhaustive, and for many details it defers to the
man pages.
* Tkinter.py is a last resort for most, but can be a good place to go
when nothing else makes sense.
.. seealso::
`ActiveState Tcl Home Page <http://tcl.activestate.com/>`_
The Tk/Tcl development is largely taking place at ActiveState.
`Tcl and the Tk Toolkit <http://www.amazon.com/exec/obidos/ASIN/020163337X>`_
The book by John Ousterhout, the inventor of Tcl.
`Practical Programming in Tcl and Tk <http://www.amazon.com/exec/obidos/ASIN/0130220280>`_
Brent Welch's encyclopedic book.
A Simple Hello World Program
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
:: >
from Tkinter import *
class Application(Frame):
    def say_hi(self):
        print "hi there, everyone!"
    def createWidgets(self):
        self.QUIT = Button(self)
        self.QUIT["text"] = "QUIT"
        self.QUIT["fg"] = "red"
        self.QUIT["command"] = self.quit
        self.QUIT.pack({"side": "left"})
        self.hi_there = Button(self)
        self.hi_there["text"] = "Hello"
        self.hi_there["command"] = self.say_hi
        self.hi_there.pack({"side": "left"})
    def __init__(self, master=None):
        Frame.__init__(self, master)
        self.pack()
        self.createWidgets()
root = Tk()
app = Application(master=root)
app.mainloop()
root.destroy()
<
A (Very) Quick Look at Tcl/Tk
-----------------------------
The class hierarchy looks complicated, but in actual practice, application
programmers almost always refer to the classes at the very bottom of the
hierarchy.
Notes:
* These classes are provided for the purposes of organizing certain functions
under one namespace. They aren't meant to be instantiated independently.
* The Tk class is meant to be instantiated only once in an application.
Application programmers need not instantiate one explicitly; the system creates
one whenever any of the other classes are instantiated.
* The Widget class is not meant to be instantiated; it is meant only
for subclassing to make "real" widgets (in C++, this is called an 'abstract
class').
To make use of this reference material, there will be times when you will need
to know how to read short passages of Tk and how to identify the various parts
of a Tk command. (See the section Mapping Basic Tk into Tkinter for the
Tkinter (|py2stdlib-tkinter|) equivalents of what's below.)
Tk scripts are Tcl programs. Like all Tcl programs, Tk scripts are just lists
of tokens separated by spaces. A Tk widget is just its {class}, the {options}
that help configure it, and the {actions} that make it do useful things.
To make a widget in Tk, the command is always of the form:: >
classCommand newPathname options
<
{classCommand}
denotes which kind of widget to make (a button, a label, a menu...)
{newPathname}
is the new name for this widget. All names in Tk must be unique. To help
enforce this, widgets in Tk are named with {pathnames}, just like files in a
file system. The top level widget, the {root}, is called ``.`` (period) and
children are delimited by more periods. For example,
``.myApp.controlPanel.okButton`` might be the name of a widget.
{options}
configure the widget's appearance and in some cases, its behavior. The options
come in the form of a list of flags and values. Flags are preceded by a '-',
like Unix shell command flags, and values are put in quotes if they are more
than one word.
For example:: >
button   .fred   -fg red -text "hi there"
   ^       ^     \_____________________/
   |       |                |
 class    new            options
command  widget  (-opt val -opt val ...)
<
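To make the three parts concrete, the following sketch splits a widget-creation command into its class command, pathname and options. The helper name ``parse_widget_command`` is hypothetical. It assumes every option is a ``-flag value`` pair and uses shell-style quoting from the shlex module as a stand-in for Tcl's quoting rules, which is only adequate for simple commands like the one shown.

```python
import shlex

def parse_widget_command(command):
    """Split 'classCommand newPathname options' into its three parts.

    Hypothetical helper for this sketch; it assumes every option is a
    '-flag value' pair and treats quoting the way a Unix shell would.
    """
    tokens = shlex.split(command)
    class_command, pathname, rest = tokens[0], tokens[1], tokens[2:]
    options = {}
    # Pair each flag with the value that follows it.
    for flag, value in zip(rest[0::2], rest[1::2]):
        options[flag.lstrip("-")] = value
    return class_command, pathname, options

parsed = parse_widget_command('button .fred -fg red -text "hi there"')
print(parsed)
```

The resulting dictionary has the same shape as the keyword arguments that a Tkinter constructor accepts.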
Once created, the pathname to the widget becomes a new command. This new
{widget command} is the programmer's handle for getting the new widget to
perform some {action}. In C, you'd express this as someAction(fred,
someOptions), in C++, you would express this as fred.someAction(someOptions),
and in Tk, you say:: >
.fred someAction someOptions
<
Note that the object name, ``.fred``, starts with a dot.
As you'd expect, the legal values for {someAction} will depend on the widget's
class: ``.fred disable`` works if fred is a button (fred gets greyed out), but
does not work if fred is a label (disabling of labels is not supported in Tk).
The legal values of {someOptions} are action dependent. Some actions, like
``disable``, require no arguments; others, like a text-entry box's ``delete``
command, would need arguments to specify what range of text to delete.
Mapping Basic Tk into Tkinter
-----------------------------
Class commands in Tk correspond to class constructors in Tkinter. :: >
button .fred =====> fred = Button()
<
The master of an object is implicit in the new name given to it at creation
time. In Tkinter, masters are specified explicitly. :: >
button .panel.fred =====> fred = Button(panel)
<
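Because the master is encoded in the Tk pathname, it can be recovered by dropping the last component. The sketch below shows this in plain Python; ``master_pathname`` is a hypothetical helper, not part of Tkinter.

```python
def master_pathname(pathname):
    """Return the Tk pathname of a widget's master.

    Hypothetical helper for this sketch. The root widget is named "."
    and is treated here as its own master.
    """
    parent, _, _ = pathname.rpartition(".")
    return parent or "."

print(master_pathname(".panel.fred"))
print(master_pathname(".fred"))
```

In Tkinter the same relationship is stated explicitly by passing the master as the first constructor argument.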
The configuration options in Tk are given in lists of hyphenated tags followed by
values. In Tkinter, options are specified as keyword arguments in the instance
constructor; for established instances, they are set with keyword arguments to
configure calls or with dictionary-style indexing. See section
tkinter-setting-options on setting options. :: >
button .fred -fg red =====> fred = Button(panel, fg = "red")
.fred configure -fg red =====> fred["fg"] = "red"
OR ==> fred.config(fg = "red")
<
In Tk, to perform an action on a widget, use the widget name as a command, and
follow it with an action name, possibly with arguments (options). In Tkinter,
you call methods on the class instance to invoke actions on the widget. The
actions (methods) that a given widget can perform are listed in the Tkinter.py
module. :: >
.fred invoke =====> fred.invoke()
<
To give a widget to the packer (geometry manager), you call pack with optional
arguments. In Tkinter, the Pack class holds all this functionality, and the
various forms of the pack command are implemented as methods. All widgets in
Tkinter (|py2stdlib-tkinter|) are subclassed from the Packer, and so inherit all the packing
methods. See the Tix (|py2stdlib-tix|) module documentation for additional information on
the Form geometry manager. :: >
pack .fred -side left =====> fred.pack(side = "left")
<
How Tk and Tkinter are Related
------------------------------
From the top down:
Your App Here (Python)
A Python application makes a Tkinter (|py2stdlib-tkinter|) call.
Tkinter (Python Module)
This call (say, for example, creating a button widget), is implemented in the
{Tkinter} module, which is written in Python. This Python function will parse
the commands and the arguments and convert them into a form that makes them look
as if they had come from a Tk script instead of a Python script.
tkinter (C)
These commands and their arguments will be passed to a C function in the
{tkinter} - note the lowercase - extension module.
Tk Widgets (C and Tcl)
This C function is able to make calls into other C modules, including the C
functions that make up the Tk library. Tk is implemented in C and some Tcl.
The Tcl part of the Tk widgets is used to bind certain default behaviors to
widgets, and is executed once at the point where the Python Tkinter (|py2stdlib-tkinter|)
module is imported. (The user never sees this stage).
Tk (C)
The Tk part of the Tk Widgets implements the final mapping to ...
Xlib (C)
the Xlib library to draw graphics on the screen.
Handy Reference
---------------
Setting Options
^^^^^^^^^^^^^^^
Options control things like the color and border width of a widget. Options can
be set in three ways:
At object creation time, using keyword arguments
:: >
fred = Button(self, fg = "red", bg = "blue")
<
After object creation, treating the option name like a dictionary index
:: >
fred["fg"] = "red"
fred["bg"] = "blue"
<
After object creation, using the config() method to update multiple attributes
:: >
fred.config(fg = "red", bg = "blue")
<
For a complete explanation of a given option and its behavior, see the Tk man
pages for the widget in question.
Note that the man pages list "STANDARD OPTIONS" and "WIDGET SPECIFIC OPTIONS"
for each widget. The former is a list of options that are common to many
widgets; the latter are the options that are specific to that particular
widget. The Standard Options are documented on the options(3) man
page.
No distinction between standard and widget-specific options is made in this
document. Some options don't apply to some kinds of widgets. Whether a given
widget responds to a particular option depends on the class of the widget;
buttons have a ``command`` option, labels do not.
The options supported by a given widget are listed in that widget's man page, or
can be queried at runtime by calling the config method without
arguments, or by calling the keys method on that widget. Called without
arguments, config returns a dictionary whose keys are the option names as
strings (for example, ``'relief'``) and whose values are 5-tuples; keys
returns just the option names.
Some options, like ``bg``, are synonyms for common options with long names
(``bg`` is shorthand for "background"). Passing the ``config()`` method the name
of a shorthand option will return a 2-tuple, not a 5-tuple. The 2-tuple passed
back will contain the name of the synonym and the "real" option (such as
``('bg', 'background')``).
The fields of a 5-tuple are laid out as follows:
+-------+---------------------------------+--------------+
| Index | Meaning | Example |
+=======+=================================+==============+
| 0 | option name | ``'relief'`` |
+-------+---------------------------------+--------------+
| 1 | option name for database lookup | ``'relief'`` |
+-------+---------------------------------+--------------+
| 2 | option class for database | ``'Relief'`` |
| | lookup | |
+-------+---------------------------------+--------------+
| 3 | default value | ``'raised'`` |
+-------+---------------------------------+--------------+
| 4 | current value | ``'groove'`` |
+-------+---------------------------------+--------------+
Example:: >
>>> print fred.config()
{'relief' : ('relief', 'relief', 'Relief', 'raised', 'groove')}
<
Of course, the dictionary printed will include all the options available and
their values. This is meant only as an example.
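As a rough illustration of the tuple layout described above, the following
sketch picks the current value out of each 5-tuple and skips 2-tuple synonym
entries. The sample dictionary is hand-written to mirror the example above; it
is not output captured from a live widget, and the helper name
``current_values`` is hypothetical.

```python
# Hand-written sample shaped like the config() result described above.
# It is NOT output captured from a real widget.
sample_config = {
    'relief': ('relief', 'relief', 'Relief', 'raised', 'groove'),
    'bg': ('bg', 'background'),   # 2-tuple: synonym and "real" option
}

def current_values(config_dict):
    """Return {option name: current value} for every 5-tuple entry.

    Index 4 of a 5-tuple holds the current value; 2-tuple synonym
    entries carry no value and are skipped.
    """
    values = {}
    for name, spec in config_dict.items():
        if len(spec) == 5:
            values[name] = spec[4]
    return values

values = current_values(sample_config)
```

With a real widget you would pass ``fred.config()`` instead of the sample
dictionary.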
The Packer
^^^^^^^^^^
The packer is one of Tk's geometry-management mechanisms. Geometry managers
are used to specify the relative positioning of widgets
within their container - their mutual {master}. In contrast to the more
cumbersome {placer} (which is used less commonly, and which we do not cover
here), the packer takes qualitative relationship specifications - {above},
{to the left of}, {filling}, etc. - and works everything out to determine the
exact placement coordinates for you.
The size of any {master} widget is determined by the size of the "slave widgets"
inside. The packer is used to control where slave widgets appear inside the
master into which they are packed. You can pack widgets into frames, and frames
into other frames, in order to achieve the kind of layout you desire.
Additionally, the arrangement is dynamically adjusted to accommodate incremental
changes to the configuration, once it is packed.
Note that widgets do not appear until they have had their geometry specified
with a geometry manager. It's a common early mistake to leave out the geometry
specification, and then be surprised when the widget is created but nothing
appears. A widget will appear only after it has had, for example, the packer's
pack method applied to it.
The pack() method can be called with keyword-option/value pairs that control
where the widget is to appear within its container, and how it is to behave when
the main application window is resized. Here are some examples:: >
fred.pack() # defaults to side = "top"
fred.pack(side = "left")
fred.pack(expand = 1)
<
Packer Options
^^^^^^^^^^^^^^
For more extensive information on the packer and the options that it can take,
see the man pages and page 183 of John Ousterhout's book.
anchor
Anchor type. Denotes where the packer is to place each slave in its parcel.
expand
Boolean, ``0`` or ``1``.
fill
Legal values: ``'x'``, ``'y'``, ``'both'``, ``'none'``.
ipadx and ipady
A distance - designating internal padding on each side of the slave widget.
padx and pady
A distance - designating external padding on each side of the slave widget.
side
Legal values are: ``'left'``, ``'right'``, ``'top'``, ``'bottom'``.
Coupling Widget Variables
^^^^^^^^^^^^^^^^^^^^^^^^^
The current-value setting of some widgets (like text entry widgets) can be
connected directly to application variables by using special options. These
options are ``variable``, ``textvariable``, ``onvalue``, ``offvalue``, and
``value``. This connection works both ways: if the variable changes for any
reason, the widget it's connected to will be updated to reflect the new value.
Unfortunately, in the current implementation of Tkinter (|py2stdlib-tkinter|) it is not
possible to hand over an arbitrary Python variable to a widget through a
``variable`` or ``textvariable`` option. The only kinds of variables for which
this works are variables that are subclassed from a class called Variable,
defined in the Tkinter (|py2stdlib-tkinter|) module.
There are many useful subclasses of Variable already defined:
StringVar, IntVar, DoubleVar, and
BooleanVar. To read the current value of such a variable, call the
get method on it, and to change its value, call the set
method. If you follow this protocol, the widget will always track the value of
the variable, with no further intervention on your part.
For example:: >
from Tkinter import *
class App(Frame):
    def __init__(self, master=None):
        Frame.__init__(self, master)
        self.pack()
        self.entrythingy = Entry()
        self.entrythingy.pack()
        # here is the application variable
        self.contents = StringVar()
        # set it to some value
        self.contents.set("this is a variable")
        # tell the entry widget to watch this variable
        self.entrythingy["textvariable"] = self.contents
        # and here we get a callback when the user hits return.
        # we will have the program print out the value of the
        # application variable when the user hits return
        self.entrythingy.bind('<Key-Return>',
                              self.print_contents)
    def print_contents(self, event):
        print "hi. contents of entry is now ---->", \
              self.contents.get()
<
The Window Manager
^^^^^^^^^^^^^^^^^^
In Tk, there is a utility command, ``wm``, for interacting with the window
manager. Options to the ``wm`` command allow you to control things like titles,
placement, icon bitmaps, and the like. In Tkinter (|py2stdlib-tkinter|), these commands have
been implemented as methods on the Wm class. Toplevel widgets are
subclassed from the Wm class, and so can call the Wm methods
directly.
To get at the toplevel window that contains a given widget, you can often just
refer to the widget's master. Of course if the widget has been packed inside of
a frame, the master won't represent a toplevel window. To get at the toplevel
window that contains an arbitrary widget, you can call the _root method.
This method begins with an underscore to denote the fact that this function is
part of the implementation, and not an interface to Tk functionality.
Here are some examples of typical usage:: >
from Tkinter import *
class App(Frame):
    def __init__(self, master=None):
        Frame.__init__(self, master)
        self.pack()
# create the application
myapp = App()
#
# here are method calls to the window manager class
#
myapp.master.title("My Do-Nothing Application")
myapp.master.maxsize(1000, 400)
# start the program
myapp.mainloop()
<
Tk Option Data Types
anchor
Legal values are points of the compass: ``"n"``, ``"ne"``, ``"e"``, ``"se"``,
``"s"``, ``"sw"``, ``"w"``, ``"nw"``, and also ``"center"``.
bitmap
There are eight built-in, named bitmaps: ``'error'``, ``'gray25'``,
``'gray50'``, ``'hourglass'``, ``'info'``, ``'questhead'``, ``'question'``,
``'warning'``. To specify an X bitmap filename, give the full path to the file,
preceded with an ``@``, as in ``"@/usr/contrib/bitmap/gumby.bit"``.
boolean
You can pass integers 0 or 1 or the strings ``"yes"`` or ``"no"``.
callback
This is any Python function that takes no arguments. For example:: >
def print_it():
    print "hi there"
fred["command"] = print_it
<
color
Colors can be given as the names of X colors in the rgb.txt file, or as strings
representing RGB values in 4 bit: ``"#RGB"``, 8 bit: ``"#RRGGBB"``, 12 bit:
``"#RRRGGGBBB"``, or 16 bit: ``"#RRRRGGGGBBBB"`` ranges, where R, G, B here
represent any legal hex digit. See page 160 of Ousterhout's book for details.
cursor
The standard X cursor names from cursorfont.h can be used, without the
``XC_`` prefix. For example to get a hand cursor (XC_hand2), use the
string ``"hand2"``. You can also specify a bitmap and mask file of your own.
See page 179 of Ousterhout's book.
distance
Screen distances can be specified in either pixels or absolute distances.
Pixels are given as numbers and absolute distances as strings, with the trailing
character denoting units: ``c`` for centimetres, ``i`` for inches, ``m`` for
millimetres, ``p`` for printer's points. For example, 3.5 inches is expressed
as ``"3.5i"``.
font
Tk uses a list font name format, such as ``{courier 10 bold}``. Font sizes with
positive numbers are measured in points; sizes with negative numbers are
measured in pixels.
geometry
This is a string of the form ``widthxheight``, where width and height are
measured in pixels for most widgets (in characters for widgets displaying text).
For example: ``fred["geometry"] = "200x100"``.
justify
Legal values are the strings: ``"left"``, ``"center"``, ``"right"``, and
``"fill"``.
region
This is a string with four space-delimited elements, each of which is a legal
distance (see above). For example: ``"2 3 4 5"`` and ``"3i 2i 4.5i 2i"`` and
``"3c 2c 4c 10.43c"`` are all legal regions.
relief
Determines what the border style of a widget will be. Legal values are:
``"raised"``, ``"sunken"``, ``"flat"``, ``"groove"``, and ``"ridge"``.
scrollcommand
This is almost always the set method of some scrollbar widget, but can
be any widget method that takes a single argument. Refer to the file
Demo/tkinter/matt/canvas-with-scrollbars.py in the Python source
distribution for an example.
wrap
Must be one of: ``"none"``, ``"char"``, or ``"word"``.
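The color entry above lists four hexadecimal string forms. As a minimal
sketch, a helper like the one below could build such strings from integer
components. The name ``tk_color`` is hypothetical and is not part of Tkinter;
the resulting string would be passed wherever a color option is accepted.

```python
def tk_color(red, green, blue, bits=8):
    """Format integer RGB components as a Tk color string.

    bits is the width of each component: 4, 8, 12 or 16, giving
    "#RGB", "#RRGGBB", "#RRRGGGBBB" or "#RRRRGGGGBBBB".
    """
    if bits not in (4, 8, 12, 16):
        raise ValueError("bits must be 4, 8, 12 or 16")
    limit = (1 << bits) - 1       # largest value a component may take
    digits = bits // 4            # hex digits per component
    parts = []
    for component in (red, green, blue):
        if not 0 <= component <= limit:
            raise ValueError("component out of range for %d bits" % bits)
        parts.append("%0*x" % (digits, component))
    return "#" + "".join(parts)
```

For example, ``tk_color(255, 0, 0)`` gives ``"#ff0000"``, which could be used
as ``fred["fg"] = tk_color(255, 0, 0)``.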
Bindings and Events
^^^^^^^^^^^^^^^^^^^
The bind method from the widget command allows you to watch for certain events
and to have a callback function trigger when that event type occurs. The form
of the bind method is:: >
def bind(self, sequence, func, add=''):
<
where:
sequence
is a string that denotes the target kind of event. (See the bind man page and
page 201 of John Ousterhout's book for details).
func
is a Python function, taking one argument, to be invoked when the event occurs.
An Event instance will be passed as the argument. (Functions deployed this way
are commonly known as {callbacks}.)
add
is optional, either ``''`` or ``'+'``. Passing an empty string denotes that
this binding is to replace any other bindings that this event is associated
with. Passing a ``'+'`` means that this function is to be added to the list
of functions bound to this event type.
For example:: >
def turnRed(self, event):
    event.widget["activeforeground"] = "red"
self.button.bind("<Enter>", self.turnRed)
<
Notice how the widget field of the event is being accessed in the
turnRed callback. This field contains the widget that caught the X
event. The following table lists the other event fields you can access, and how
they are denoted in Tk, which can be useful when referring to the Tk man pages.
:: >
Tk      Tkinter Event Field             Tk      Tkinter Event Field
--      -------------------             --      -------------------
%f      focus                           %A      char
%h      height                          %E      send_event
%k      keycode                         %K      keysym
%s      state                           %N      keysym_num
%t      time                            %T      type
%w      width                           %W      widget
%x      x                               %X      x_root
%y      y                               %Y      y_root
<
The index Parameter
^^^^^^^^^^^^^^^^^^^
A number of widgets require "index" parameters to be passed. These are used to
point at a specific place in a Text widget, or to particular characters in an
Entry widget, or to particular menu items in a Menu widget.
Entry widget indexes (index, view index, etc.)
Entry widgets have options that refer to character positions in the text being
displayed. You can use these Tkinter (|py2stdlib-tkinter|) functions to access these special
points in text widgets:
AtEnd()
refers to the last position in the text
AtInsert()
refers to the point where the text cursor is
AtSelFirst()
indicates the beginning point of the selected text
AtSelLast()
denotes the last point of the selected text and finally
At(x[, y])
refers to the character at pixel location {x}, {y} (with {y} not used in the
case of a text entry widget, which contains a single line of text).
Text widget indexes
The index notation for Text widgets is very rich and is best described in the Tk
man pages.
Menu indexes (menu.invoke(), menu.entryconfig(), etc.)
Some options and methods for menus manipulate specific menu entries. Anytime a
menu index is needed for an option or a parameter, you may pass in:
* an integer which refers to the numeric position of the entry in the widget,
counted from the top, starting with 0;
* the string ``'active'``, which refers to the menu position that is currently
under the cursor;
* the string ``"last"`` which refers to the last menu item;
* an integer preceded by ``@``, as in ``@6``, where the integer is interpreted
as a y pixel coordinate in the menu's coordinate system;
* the string ``"none"``, which indicates no menu entry at all, most often used
with menu.activate() to deactivate all entries, and finally,
* a text string that is pattern matched against the label of the menu entry, as
scanned from the top of the menu to the bottom. Note that this index type is
considered after all the others, which means that matches for menu items
labelled ``last``, ``active``, or ``none`` may be interpreted as the above
literals, instead.
Images
^^^^^^
Bitmap/Pixelmap images can be created through the subclasses of
Tkinter.Image:
* BitmapImage can be used for X11 bitmap data.
* PhotoImage can be used for GIF and PPM/PGM color bitmaps.
Either type of image is created through either the ``file`` or the ``data``
option (other options are available as well).
The image object can then be used wherever an ``image`` option is supported by
some widget (e.g. labels, buttons, menus). In these cases, Tk will not keep a
reference to the image. When the last Python reference to the image object is
deleted, the image data is deleted as well, and Tk will display an empty box
wherever the image was used.
==============================================================================
*py2stdlib-token*
token~
:synopsis: Constants representing terminal nodes of the parse tree.
This module provides constants which represent the numeric values of leaf nodes
of the parse tree (terminal tokens). Refer to the file Grammar/Grammar
in the Python distribution for the definitions of the names in the context of
the language grammar. The specific numeric values which the names map to may
change between Python versions.
This module also provides one data object and some functions. The functions
mirror definitions in the Python C header files.
tok_name~
Dictionary mapping the numeric values of the constants defined in this module
back to name strings, allowing more human-readable representation of parse trees
to be generated.
ISTERMINAL(x)~
Return true for terminal token values.
ISNONTERMINAL(x)~
Return true for non-terminal token values.
ISEOF(x)~
Return true if {x} is the marker indicating the end of input.
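As a small sketch of how the data object and the functions fit together, the
helper below turns a numeric token value into a readable description. The
helper name ``describe`` is hypothetical; ``tok_name``, ``ISTERMINAL`` and the
``NAME`` constant come from this module.

```python
import token

def describe(value):
    """Return a short description of a numeric token value."""
    # tok_name maps numeric values back to their names
    name = token.tok_name.get(value, "<unknown>")
    if token.ISTERMINAL(value):
        kind = "terminal"
    else:
        kind = "non-terminal"
    return "%s (%s)" % (name, kind)

description = describe(token.NAME)
```

Because the numeric values may change between Python versions, code should
compare against the named constants rather than literal numbers.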
.. seealso::
Module parser (|py2stdlib-parser|)
The second example for the parser (|py2stdlib-parser|) module shows how to use the
symbol (|py2stdlib-symbol|) module.
==============================================================================
*py2stdlib-tokenize*
tokenize~
:synopsis: Lexical scanner for Python source code.
The tokenize (|py2stdlib-tokenize|) module provides a lexical scanner for Python source code,
implemented in Python. The scanner in this module returns comments as tokens as
well, making it useful for implementing "pretty-printers," including colorizers
for on-screen displays.
The primary entry point is a generator:
generate_tokens(readline)~
The generate_tokens generator requires one argument, {readline},
which must be a callable object which provides the same interface as the
readline (|py2stdlib-readline|) method of built-in file objects (see section
bltin-file-objects). Each call to the function should return one line
of input as a string.
The generator produces 5-tuples with these members: the token type; the token
string; a 2-tuple ``(srow, scol)`` of ints specifying the row and column
where the token begins in the source; a 2-tuple ``(erow, ecol)`` of ints
specifying the row and column where the token ends in the source; and the
line on which the token was found. The line passed (the last tuple item) is
the {logical} line; continuation lines are included.
.. versionadded:: 2.2
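The following sketch shows one way to consume the 5-tuples: it tokenizes a
string and collects the token type name, the token string, and the start and
end positions. The helper name ``token_table`` is hypothetical, and the
``try``/``except`` import is only there so the same snippet also runs on
interpreters without the ``StringIO`` module.

```python
import tokenize
try:
    from StringIO import StringIO      # Python 2
except ImportError:
    from io import StringIO            # fallback, an assumption for newer interpreters

def token_table(source):
    """Return (type name, string, start, end) for each token in source."""
    rows = []
    readline = StringIO(source).readline
    for toktype, tokstr, start, end, line in tokenize.generate_tokens(readline):
        rows.append((tokenize.tok_name[toktype], tokstr, start, end))
    return rows

rows = token_table("x = 1\n")
```

Rows are numbered from 1 and columns from 0, so the name ``x`` in the sample
starts at ``(1, 0)`` and ends at ``(1, 1)``.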
An older entry point is retained for backward compatibility:
tokenize(readline[, tokeneater])~
The tokenize (|py2stdlib-tokenize|) function accepts two parameters: one representing the input
stream, and one providing an output mechanism for tokenize (|py2stdlib-tokenize|).
The first parameter, {readline}, must be a callable object which provides the
same interface as the readline (|py2stdlib-readline|) method of built-in file objects (see
section bltin-file-objects). Each call to the function should return one
line of input as a string. Alternately, {readline} may be a callable object that
signals completion by raising StopIteration.
.. versionchanged:: 2.5
Added StopIteration support.
The second parameter, {tokeneater}, must also be a callable object. It is
called once for each token, with five arguments, corresponding to the tuples
generated by generate_tokens.
All constants from the token (|py2stdlib-token|) module are also exported from
tokenize (|py2stdlib-tokenize|), as are two additional token type values that might be passed to
the {tokeneater} function by tokenize (|py2stdlib-tokenize|):
COMMENT~
Token value used to indicate a comment.
NL~
Token value used to indicate a non-terminating newline. The NEWLINE token
indicates the end of a logical line of Python code; NL tokens are generated when
a logical line of code is continued over multiple physical lines.
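To make the difference between NL and NEWLINE concrete, the sketch below
tokenizes a statement that is continued inside parentheses and records only
the comment and line-ending tokens. The helper name ``line_markers`` is
hypothetical, and the import fallback is an assumption for interpreters
without the ``StringIO`` module.

```python
import tokenize
try:
    from StringIO import StringIO      # Python 2
except ImportError:
    from io import StringIO            # fallback for newer interpreters

def line_markers(source):
    """Return the names of COMMENT, NL and NEWLINE tokens, in order."""
    wanted = (tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE)
    names = []
    for tok in tokenize.generate_tokens(StringIO(source).readline):
        if tok[0] in wanted:
            names.append(tokenize.tok_name[tok[0]])
    return names

# One logical line spread over two physical lines.
markers = line_markers("x = (1,  # first\n     2)\n")
```

The first physical line ends inside the open parenthesis, so it produces a
COMMENT followed by NL; only the end of the whole statement produces NEWLINE.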
Another function is provided to reverse the tokenization process. This is useful
for creating tools that tokenize a script, modify the token stream, and write
back the modified script.
untokenize(iterable)~
Converts tokens back into Python source code. The {iterable} must return
sequences with at least two elements, the token type and the token string. Any
additional sequence elements are ignored.
The reconstructed script is returned as a single string. The result is
guaranteed to tokenize back to match the input so that the conversion is
lossless and round-trips are assured. The guarantee applies only to the token
type and token string as the spacing between tokens (column positions) may
change.
.. versionadded:: 2.5
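A minimal sketch of the round-trip guarantee: tokenize a string, keep only the
token type and token string, rebuild the source, and tokenize the result
again. The helper name ``roundtrip`` is hypothetical, and the import fallback
is an assumption for interpreters without the ``StringIO`` module.

```python
import tokenize
try:
    from StringIO import StringIO      # Python 2
except ImportError:
    from io import StringIO            # fallback for newer interpreters

def type_and_string(source):
    """Return a list of (token type, token string) pairs for source."""
    readline = StringIO(source).readline
    return [(tok[0], tok[1]) for tok in tokenize.generate_tokens(readline)]

def roundtrip(source):
    """Return (pairs before, rebuilt source, pairs after)."""
    before = type_and_string(source)
    rebuilt = tokenize.untokenize(before)
    after = type_and_string(rebuilt)
    return before, rebuilt, after

before, rebuilt, after = roundtrip("x = 1 + 2\n")
```

The rebuilt text may be spaced differently from the input, but ``before`` and
``after`` contain the same token types and strings.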
Example of a script re-writer that transforms float literals into Decimal
objects:: >
from StringIO import StringIO
from tokenize import generate_tokens, untokenize, NAME, NUMBER, OP, STRING
def decistmt(s):
    """Substitute Decimals for floats in a string of statements.
    >>> from decimal import Decimal
    >>> s = 'print +21.3e-5*-.1234/81.7'
    >>> decistmt(s)
    "print +Decimal ('21.3e-5')*-Decimal ('.1234')/Decimal ('81.7')"
    >>> exec(s)
    -3.21716034272e-007
    >>> exec(decistmt(s))
    -3.217160342717258261933904529E-7
    """
    result = []
    g = generate_tokens(StringIO(s).readline)   # tokenize the string
    for toknum, tokval, _, _, _ in g:
        if toknum == NUMBER and '.' in tokval:  # replace NUMBER tokens
            result.extend([
                (NAME, 'Decimal'),
                (OP, '('),
                (STRING, repr(tokval)),
                (OP, ')')
            ])
        else:
            result.append((toknum, tokval))
    return untokenize(result)
==============================================================================
*py2stdlib-trace*
trace~
:synopsis: Trace or track Python statement execution.
The trace (|py2stdlib-trace|) module allows you to trace program execution, generate
annotated statement coverage listings, print caller/callee relationships and
list functions executed during a program run. It can be used in another program
or from the command line.
Command Line Usage
------------------
The trace (|py2stdlib-trace|) module can be invoked from the command line. It can be as
simple as :: >
python -m trace --count somefile.py ...
<
The above will generate annotated listings of all Python modules imported during
the execution of somefile.py.
The following command-line arguments are supported:
--trace, -t
Display lines as they are executed.
--count, -c
Produce a set of annotated listing files upon program completion that shows how
many times each statement was executed.
--report, -r
Produce an annotated list from an earlier program run that used the
--count and --file arguments.
--no-report, -R
Do not generate annotated listings. This is useful if you intend to make
several runs with --count then produce a single set of annotated
listings at the end.
--listfuncs, -l
List the functions executed by running the program.
--trackcalls, -T
Generate calling relationships exposed by running the program.
--file, -f
Name a file containing (or to contain) counts.
--coverdir, -C
Name a directory in which to save annotated listing files.
--missing, -m
When generating annotated listings, mark lines which were not executed with
'``>>>>>>``'.
--summary, -s
When using --count or --report, write a brief summary to
stdout for each file processed.
--ignore-module
Accepts a comma-separated list of module names. Ignore each of the named
modules and its submodules (if it is a package). May be given
multiple times.
--ignore-dir
Ignore all modules and packages in the named directory and subdirectories
(multiple directories can be joined by os.pathsep). May be given multiple
times.
Programming Interface
---------------------
Trace([count=1[, trace=1[, countfuncs=0[, countcallers=0[, ignoremods=()[, ignoredirs=()[, infile=None[, outfile=None[, timing=False]]]]]]]]])~
Create an object to trace execution of a single statement or expression. All
parameters are optional. {count} enables counting of line numbers. {trace}
enables line execution tracing. {countfuncs} enables listing of the functions
called during the run. {countcallers} enables call relationship tracking.
{ignoremods} is a list of modules or packages to ignore. {ignoredirs} is a list
of directories whose modules or packages should be ignored. {infile} is the
file from which to read stored count information. {outfile} is a file in which
to write updated count information. {timing} enables a timestamp relative
to when tracing was started to be displayed.
Trace.run(cmd)~
Run {cmd} under control of the Trace object with the current tracing parameters.
Trace.runctx(cmd[, globals=None[, locals=None]])~
Run {cmd} under control of the Trace object with the current tracing parameters
in the defined global and local environments. If not defined, {globals} and
{locals} default to empty dictionaries.
Trace.runfunc(func, *args, **kwds)~
Call {func} with the given arguments under control of the Trace object
with the current tracing parameters.
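As a hedged sketch of runfunc together with the {countfuncs} parameter, the
code below runs a function under a Trace object and collects the names of the
functions that were called. The functions ``helper`` and ``work`` and the
wrapper ``called_functions`` are hypothetical. The ``calledfuncs`` attribute
read from the results object is taken from the module's implementation and is
not described in this document, so treat it as an assumption.

```python
import trace

def helper(n):
    return n * 2

def work(n):
    return helper(n) + 1

def called_functions(func, *args):
    """Run func under a Trace object; return (result, function names)."""
    # countfuncs=1 records which functions run; line counting and
    # line-by-line tracing are switched off.
    tracer = trace.Trace(count=0, trace=0, countfuncs=1)
    result = tracer.runfunc(func, *args)
    # Assumption: each recorded entry is (filename, module name, function name).
    recorded = tracer.results().calledfuncs
    names = sorted(set(entry[2] for entry in recorded))
    return result, names

result, names = called_functions(work, 3)
```

runfunc passes the extra positional arguments on to {func} and hands back its
return value, so the traced call can be used in place of a direct call.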
This is a simple example showing the use of this module:: >
import sys
import trace
# create a Trace object, telling it what to ignore, and whether to
# do tracing or line-counting or both.
tracer = trace.Trace(
    ignoredirs=[sys.prefix, sys.exec_prefix],
    trace=0,
    count=1)
# run the new command using the given tracer
tracer.run('main()')
# make a report, placing output in /tmp
r = tracer.results()
r.write_results(show_missing=True, coverdir="/tmp")
==============================================================================
*py2stdlib-traceback*
traceback~
:synopsis: Print or retrieve a stack traceback.
This module provides a standard interface to extract, format and print stack
traces of Python programs. It exactly mimics the behavior of the Python
interpreter when it prints a stack trace. This is useful when you want to print
stack traces under program control, such as in a "wrapper" around the
interpreter.
The module uses traceback objects --- this is the object type that is stored in
the variables sys.exc_traceback (deprecated) and sys.last_traceback and
returned as the third item from sys.exc_info.
The module defines the following functions:
print_tb(traceback[, limit[, file]])~
Print up to {limit} stack trace entries from {traceback}. If {limit} is omitted
or ``None``, all entries are printed. If {file} is omitted or ``None``, the
output goes to ``sys.stderr``; otherwise it should be an open file or file-like
object to receive the output.
print_exception(type, value, traceback[, limit[, file]])~
Print exception information and up to {limit} stack trace entries from
{traceback} to {file}. This differs from print_tb in the following ways:
(1) if {traceback} is not ``None``, it prints a header ``Traceback (most recent
call last):``; (2) it prints the exception {type} and {value} after the stack
trace; (3) if {type} is SyntaxError and {value} has the appropriate
format, it prints the line where the syntax error occurred with a caret
indicating the approximate position of the error.
print_exc([limit[, file]])~
This is a shorthand for ``print_exception(sys.exc_type, sys.exc_value,
sys.exc_traceback, limit, file)``. (In fact, it uses sys.exc_info to
retrieve the same information in a thread-safe way instead of using the
deprecated variables.)
format_exc([limit])~
This is like ``print_exc(limit)`` but returns a string instead of printing to a
file.
.. versionadded:: 2.4
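A short sketch of a typical use: capture the traceback text inside an
``except`` block so that it can be logged or returned instead of printed. The
function name ``safe_divide`` is hypothetical.

```python
import traceback

def safe_divide(a, b):
    """Return (quotient, None) or (None, traceback text) on failure."""
    try:
        return a / b, None
    except ZeroDivisionError:
        # format_exc() must be called while the exception is being handled
        return None, traceback.format_exc()

value, text = safe_divide(1, 0)
```

The returned string starts with the ``Traceback (most recent call last):``
header and ends with the line naming the exception.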
print_last([limit[, file]])~
This is a shorthand for ``print_exception(sys.last_type, sys.last_value,
sys.last_traceback, limit, file)``. In general it will work only after
an exception has reached an interactive prompt (see sys.last_type).
print_stack([f[, limit[, file]]])~
This function prints a stack trace from its invocation point. The optional {f}
argument can be used to specify an alternate stack frame to start. The optional
{limit} and {file} arguments have the same meaning as for
print_exception.
extract_tb(traceback[, limit])~
Return a list of up to {limit} "pre-processed" stack trace entries extracted
from the traceback object {traceback}. It is useful for alternate formatting of
stack traces. If {limit} is omitted or ``None``, all entries are extracted. A
"pre-processed" stack trace entry is a quadruple ({filename}, {line number},
{function name}, {text}) representing the information that is usually printed
for a stack trace. The {text} is a string with leading and trailing whitespace
stripped; if the source is not available it is ``None``.
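As an illustration of alternate formatting, the sketch below unpacks each
quadruple and keeps only the function names, giving the call chain that led to
an exception. The names ``call_chain`` and ``boom`` are hypothetical.

```python
import sys
import traceback

def call_chain(func):
    """Call func; if it raises, return the function names in its traceback."""
    try:
        func()
    except Exception:
        tb = sys.exc_info()[2]
        # each entry unpacks as (filename, line number, function name, text)
        return [name for _, _, name, _ in traceback.extract_tb(tb)]
    return []

def boom():
    return [][0]          # raises IndexError

chain = call_chain(boom)
```

The entries are ordered from the outermost frame to the frame where the
exception was raised, so ``boom`` comes last.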
extract_stack([f[, limit]])~
Extract the raw traceback from the current stack frame. The return value has
the same format as for extract_tb. The optional {f} and {limit}
arguments have the same meaning as for print_stack.
format_list(list)~
Given a list of tuples as returned by extract_tb or
extract_stack, return a list of strings ready for printing. Each string
in the resulting list corresponds to the item with the same index in the
argument list. Each string ends in a newline; the strings may contain internal
newlines as well, for those items whose source text line is not ``None``.
format_exception_only(type, value)~
Format the exception part of a traceback. The arguments are the exception type
and value such as given by ``sys.last_type`` and ``sys.last_value``. The return
value is a list of strings, each ending in a newline. Normally, the list
contains a single string; however, for SyntaxError exceptions, it
contains several lines that (when printed) display detailed information about
where the syntax error occurred. The message indicating which exception
occurred is always the last string in the list.
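Since the message naming the exception is the last string in the list, a
one-line summary can be taken from there. This is a minimal sketch; the helper
name ``exception_summary`` is hypothetical.

```python
import sys
import traceback

def exception_summary(exc_type, exc_value):
    """Return the final line of the formatted exception, without newline."""
    lines = traceback.format_exception_only(exc_type, exc_value)
    return lines[-1].rstrip("\n")

try:
    int("x")
except ValueError:
    exc_type, exc_value = sys.exc_info()[:2]
    summary = exception_summary(exc_type, exc_value)
```

Here ``summary`` holds the exception name followed by its message, with the
trailing newline removed.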
format_exception(type, value, tb[, limit])~
Format a stack trace and the exception information. The arguments have the
same meaning as the corresponding arguments to print_exception. The
return value is a list of strings, each ending in a newline and some containing
internal newlines. When these lines are concatenated and printed, exactly the
same text is printed as by print_exception.
format_tb(tb[, limit])~
A shorthand for ``format_list(extract_tb(tb, limit))``.
format_stack([f[, limit]])~
A shorthand for ``format_list(extract_stack(f, limit))``.
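A small sketch of format_stack used for diagnostics: the last string in the
returned list describes the frame that made the call. The function name
``where_am_i`` is hypothetical.

```python
import traceback

def where_am_i():
    """Return the formatted stack entry for this function's own frame."""
    # entries run from the outermost caller to the current frame
    return traceback.format_stack()[-1]

entry = where_am_i()
```

No exception needs to be active; the stack is taken from the point of the
call.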
tb_lineno(tb)~
This function returns the current line number set in the traceback object. This
function was necessary because in versions of Python prior to 2.3 when the
-O flag was passed to Python the ``tb.tb_lineno`` was not updated
correctly. This function has no use in versions past 2.3.
Traceback Examples
------------------
This simple example implements a basic read-eval-print loop, similar to (but
less useful than) the standard Python interactive interpreter loop. For a more
complete implementation of the interpreter loop, refer to the code (|py2stdlib-code|)
module. :: >
import sys, traceback
def run_user_code(envdir):
    source = raw_input(">>> ")
    try:
        exec source in envdir
    except:
        print "Exception in user code:"
        print '-'*60
        traceback.print_exc(file=sys.stdout)
        print '-'*60
envdir = {}
while 1:
    run_user_code(envdir)
<
The following example demonstrates the different ways to print and format the
exception and traceback:: >
    import sys, traceback

    def lumberjack():
        bright_side_of_death()

    def bright_side_of_death():
        return tuple()[0]

    try:
        lumberjack()
    except IndexError:
        exc_type, exc_value, exc_traceback = sys.exc_info()
        print "*** print_tb:"
        traceback.print_tb(exc_traceback, limit=1, file=sys.stdout)
        print "*** print_exception:"
        traceback.print_exception(exc_type, exc_value, exc_traceback,
                                  limit=2, file=sys.stdout)
        print "*** print_exc:"
        traceback.print_exc()
        print "*** format_exc, first and last line:"
        formatted_lines = traceback.format_exc().splitlines()
        print formatted_lines[0]
        print formatted_lines[-1]
        print "*** format_exception:"
        print repr(traceback.format_exception(exc_type, exc_value,
                                              exc_traceback))
        print "*** extract_tb:"
        print repr(traceback.extract_tb(exc_traceback))
        print "*** format_tb:"
        print repr(traceback.format_tb(exc_traceback))
        print "*** tb_lineno:", exc_traceback.tb_lineno
<
The output for the example would look similar to this:: >
    *** print_tb:
      File "<doctest...>", line 10, in <module>
        lumberjack()
    *** print_exception:
    Traceback (most recent call last):
      File "<doctest...>", line 10, in <module>
        lumberjack()
      File "<doctest...>", line 4, in lumberjack
        bright_side_of_death()
    IndexError: tuple index out of range
    *** print_exc:
    Traceback (most recent call last):
      File "<doctest...>", line 10, in <module>
        lumberjack()
      File "<doctest...>", line 4, in lumberjack
        bright_side_of_death()
    IndexError: tuple index out of range
    *** format_exc, first and last line:
    Traceback (most recent call last):
    IndexError: tuple index out of range
    *** format_exception:
    ['Traceback (most recent call last):\n',
     '  File "<doctest...>", line 10, in <module>\n    lumberjack()\n',
     '  File "<doctest...>", line 4, in lumberjack\n    bright_side_of_death()\n',
     '  File "<doctest...>", line 7, in bright_side_of_death\n    return tuple()[0]\n',
     'IndexError: tuple index out of range\n']
    *** extract_tb:
    [('<doctest...>', 10, '<module>', 'lumberjack()'),
     ('<doctest...>', 4, 'lumberjack', 'bright_side_of_death()'),
     ('<doctest...>', 7, 'bright_side_of_death', 'return tuple()[0]')]
    *** format_tb:
    ['  File "<doctest...>", line 10, in <module>\n    lumberjack()\n',
     '  File "<doctest...>", line 4, in lumberjack\n    bright_side_of_death()\n',
     '  File "<doctest...>", line 7, in bright_side_of_death\n    return tuple()[0]\n']
    *** tb_lineno: 10
<
The following example shows the different ways to print and format the stack:: >
    >>> import traceback
    >>> def another_function():
    ...     lumberstack()
    ...
    >>> def lumberstack():
    ...     traceback.print_stack()
    ...     print repr(traceback.extract_stack())
    ...     print repr(traceback.format_stack())
    ...
    >>> another_function()
      File "<doctest>", line 10, in <module>
        another_function()
      File "<doctest>", line 3, in another_function
        lumberstack()
      File "<doctest>", line 6, in lumberstack
        traceback.print_stack()
    [('<doctest>', 10, '<module>', 'another_function()'),
     ('<doctest>', 3, 'another_function', 'lumberstack()'),
     ('<doctest>', 7, 'lumberstack', 'print repr(traceback.extract_stack())')]
    ['  File "<doctest>", line 10, in <module>\n    another_function()\n',
     '  File "<doctest>", line 3, in another_function\n    lumberstack()\n',
     '  File "<doctest>", line 8, in lumberstack\n    print repr(traceback.format_stack())\n']
<
This last example demonstrates the final few formatting functions:: >
    >>> import traceback
    >>> traceback.format_list([('spam.py', 3, '<module>', 'spam.eggs()'),
    ...                        ('eggs.py', 42, 'eggs', 'return "bacon"')])
    ['  File "spam.py", line 3, in <module>\n    spam.eggs()\n',
     '  File "eggs.py", line 42, in eggs\n    return "bacon"\n']
    >>> an_error = IndexError('tuple index out of range')
    >>> traceback.format_exception_only(type(an_error), an_error)
    ['IndexError: tuple index out of range\n']
<
==============================================================================
*py2stdlib-ttk*
ttk~
:synopsis: Tk themed widget set
The ttk (|py2stdlib-ttk|) module provides access to the Tk themed widget set, which
was introduced in Tk 8.5. If Python is not compiled against Tk 8.5, code may
still use this module as long as Tile is installed. However, some features
provided by the new Tk, such as anti-aliased font rendering under X11 and window
transparency (which on X11 requires a compositing window manager), will be
missing.
The basic idea of ttk (|py2stdlib-ttk|) is to separate, to the extent possible, the code
implementing a widget's behavior from the code implementing its appearance.
.. seealso::
`Tk Widget Styling Support <http://www.tcl.tk/cgi-bin/tct/tip/48>`_
The document which brought up theming support for Tk
Using Ttk
---------
To start using Ttk, import its module:: >
    import ttk
<
But code like this:: >
    from Tkinter import *
<
may optionally want to use this:: >
    from Tkinter import *
    from ttk import *
<
And then several ttk (|py2stdlib-ttk|) widgets (Button, Checkbutton,
Entry, Frame, Label, LabelFrame,
Menubutton, PanedWindow, Radiobutton, Scale
and Scrollbar) will automatically substitute for the Tk widgets.
This has the direct benefit of using the new widgets, giving a better look and
feel across platforms, but be aware that they are not fully compatible. The main
difference is that widget options such as "fg", "bg" and others related to
widget styling are no longer present in Ttk widgets. Use ttk.Style to
achieve the same (or better) styling.
.. seealso::
`Converting existing applications to use the Tile widgets <http://tktable.sourceforge.net/tile/doc/converting.txt>`_
A text which talks in Tcl terms about differences typically found when
converting applications to use the new widgets.
Ttk Widgets
-----------
Ttk comes with 17 widgets, 11 of which already exist in Tkinter:
Button, Checkbutton, Entry, Frame,
Label, LabelFrame, Menubutton,
PanedWindow, Radiobutton, Scale and
Scrollbar. The 6 new widget classes are: Combobox,
Notebook, Progressbar, Separator,
Sizegrip and Treeview. All of these classes are
subclasses of Widget.
As said previously, you will notice changes in look and feel as well as in the
styling code. To demonstrate the latter, a very simple example is shown below.
Tk code:: >
    l1 = Tkinter.Label(text="Test", fg="black", bg="white")
    l2 = Tkinter.Label(text="Test", fg="black", bg="white")
<
Corresponding Ttk code:: >
    style = ttk.Style()
    style.configure("BW.TLabel", foreground="black", background="white")
    l1 = ttk.Label(text="Test", style="BW.TLabel")
    l2 = ttk.Label(text="Test", style="BW.TLabel")
<
For more information about Ttk styling, read the Style class
documentation.
Widget
------
ttk.Widget defines standard options and methods supported by Tk
themed widgets and is not supposed to be directly instantiated.
Standard Options
^^^^^^^^^^^^^^^^
All the ttk (|py2stdlib-ttk|) widgets accept the following options:
+-----------+--------------------------------------------------------------+
| Option | Description |
+===========+==============================================================+
| class | Specifies the window class. The class is used when querying |
| | the option database for the window's other options, to |
| | determine the default bindtags for the window, and to select |
| | the widget's default layout and style. This is a read-only |
| | option which may only be specified when the window is |
| | created. |
+-----------+--------------------------------------------------------------+
| cursor | Specifies the mouse cursor to be used for the widget. If set |
| | to the empty string (the default), the cursor is inherited |
| | from the parent widget. |
+-----------+--------------------------------------------------------------+
| takefocus | Determines whether the window accepts the focus during |
| | keyboard traversal. 0, 1 or an empty string is returned. |
| | If 0, the window should be skipped entirely |
| | during keyboard traversal. If 1, the window |
| | should receive the input focus as long as it is viewable. |
| | An empty string means that the traversal scripts make the |
| | decision about whether or not to focus on the window. |
+-----------+--------------------------------------------------------------+
| style | May be used to specify a custom widget style. |
+-----------+--------------------------------------------------------------+
Scrollable Widget Options
^^^^^^^^^^^^^^^^^^^^^^^^^
The following options are supported by widgets that are controlled by a
scrollbar.
+----------------+---------------------------------------------------------+
| option         | description                                             |
+================+=========================================================+
| xscrollcommand | Used to communicate with horizontal scrollbars.         |
|                |                                                         |
|                | When the view in the widget's window changes, the widget|
|                | will generate a Tcl command based on the scrollcommand. |
|                |                                                         |
|                | Usually this option consists of the Scrollbar.set method|
|                | of some scrollbar. This will cause the scrollbar to be  |
|                | updated whenever the view in the window changes.        |
+----------------+---------------------------------------------------------+
| yscrollcommand | Used to communicate with vertical scrollbars.           |
|                | For more information, see above.                        |
+----------------+---------------------------------------------------------+
Label Options
^^^^^^^^^^^^^
The following options are supported by labels, buttons and other button-like
widgets.
+--------------+-----------------------------------------------------------+
| option | description |
+==============+===========================================================+
| text | Specifies a text string to be displayed inside the widget.|
+--------------+-----------------------------------------------------------+
| textvariable | Specifies a name whose value will be used in place of the |
| | text option resource. |
+--------------+-----------------------------------------------------------+
| underline | If set, specifies the index (0-based) of a character to |
| | underline in the text string. The underline character is |
| | used for mnemonic activation. |
+--------------+-----------------------------------------------------------+
| image | Specifies an image to display. This is a list of 1 or more|
| | elements. The first element is the default image name. The|
| | rest of the list is a sequence of statespec/value pairs as|
| | defined by Style.map, specifying different images |
| | to use when the widget is in a particular state or a |
| | combination of states. All images in the list should have |
| | the same size. |
+--------------+-----------------------------------------------------------+
| compound | Specifies how to display the image relative to the text, |
| | in the case both text and image options are present. |
| | Valid values are: |
| | |
| | * text: display text only |
| | * image: display image only |
| | * top, bottom, left, right: display image above, below, |
| | left of, or right of the text, respectively. |
| | * none: the default. display the image if present, |
| | otherwise the text. |
+--------------+-----------------------------------------------------------+
| width | If greater than zero, specifies how much space, in |
| | character widths, to allocate for the text label; if less |
| | than zero, specifies a minimum width. If zero or |
| | unspecified, the natural width of the text label is used. |
+--------------+-----------------------------------------------------------+
Compatibility Options
^^^^^^^^^^^^^^^^^^^^^
+--------+----------------------------------------------------------------+
| option | description |
+========+================================================================+
| state | May be set to "normal" or "disabled" to control the "disabled" |
| | state bit. This is a write-only option: setting it changes the |
| | widget state, but the Widget.state method does not |
| | affect this option. |
+--------+----------------------------------------------------------------+
Widget States
^^^^^^^^^^^^^
The widget state is a bitmap of independent state flags.
+------------+-------------------------------------------------------------+
| flag | description |
+============+=============================================================+
| active | The mouse cursor is over the widget and pressing a mouse |
| | button will cause some action to occur. |
+------------+-------------------------------------------------------------+
| disabled | Widget is disabled under program control. |
+------------+-------------------------------------------------------------+
| focus | Widget has keyboard focus. |
+------------+-------------------------------------------------------------+
| pressed | Widget is being pressed. |
+------------+-------------------------------------------------------------+
| selected | "On", "true", or "current" for things like Checkbuttons and |
| | radiobuttons. |
+------------+-------------------------------------------------------------+
| background | Windows and Mac have a notion of an "active" or foreground |
| | window. The {background} state is set for widgets in a |
| | background window, and cleared for those in the foreground |
| | window. |
+------------+-------------------------------------------------------------+
| readonly | Widget should not allow user modification. |
+------------+-------------------------------------------------------------+
| alternate | A widget-specific alternate display format. |
+------------+-------------------------------------------------------------+
| invalid | The widget's value is invalid. |
+------------+-------------------------------------------------------------+
A state specification is a sequence of state names, optionally prefixed with
an exclamation point indicating that the bit is off.
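To make the notation concrete, here is a small pure-Python model of how a state
specification can be tested against a set of active flags. The function
matches() is a hypothetical helper written for this illustration and is not
part of ttk; with a real widget, the instate method performs the test.

```python
def matches(statespec, active_flags):
    """Return True if every entry of statespec holds for active_flags.

    A plain name requires that flag to be set; a name prefixed with "!"
    requires that flag to be cleared.
    """
    active = set(active_flags)
    for name in statespec:
        if name.startswith("!"):
            if name[1:] in active:
                return False
        elif name not in active:
            return False
    return True


# A focused, enabled widget satisfies ("!disabled", "focus") ...
print(matches(("!disabled", "focus"), ["focus", "active"]))
# ... but a disabled widget does not satisfy ("!disabled",).
print(matches(("!disabled",), ["disabled"]))
```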
ttk.Widget
^^^^^^^^^^
Besides the methods described below, the ttk.Widget class supports the
Tkinter.Widget.cget and Tkinter.Widget.configure methods.
Widget~
identify(x, y)~
Returns the name of the element at position {x} {y}, or the empty string
if the point does not lie within any element.
{x} and {y} are pixel coordinates relative to the widget.
instate(statespec[, callback=None[, *args[, **kw]]])~
Test the widget's state. If a callback is not specified, returns True
if the widget state matches {statespec} and False otherwise. If callback
is specified then it is called with {args} if widget state matches
{statespec}.
state([statespec=None])~
Modify or read widget state. If {statespec} is specified, sets the
widget state accordingly and returns a new {statespec} indicating
which flags were changed. If {statespec} is not specified, returns
the currently-enabled state flags.
{statespec} will usually be a list or a tuple.
Combobox
--------
The ttk.Combobox widget combines a text field with a pop-down list of
values. This widget is a subclass of Entry.
Besides the methods inherited from Widget (Widget.cget,
Widget.configure, Widget.identify, Widget.instate
and Widget.state) and those inherited from Entry
(Entry.bbox, Entry.delete, Entry.icursor,
Entry.index, Entry.insert, Entry.selection,
Entry.xview), this class has some other methods, described at
ttk.Combobox.
Options
^^^^^^^
This widget accepts the following options:
+-----------------+--------------------------------------------------------+
| option | description |
+=================+========================================================+
| exportselection | Boolean value. If set, the widget selection is linked |
| | to the Window Manager selection (which can be returned |
| | by invoking Misc.selection_get, for example). |
+-----------------+--------------------------------------------------------+
| justify | Specifies how the text is aligned within the widget. |
| | One of "left", "center", or "right". |
+-----------------+--------------------------------------------------------+
| height | Specifies the height of the pop-down listbox, in rows. |
+-----------------+--------------------------------------------------------+
| postcommand | A script (possibly registered with |
| | Misc.register) that |
| | is called immediately before displaying the values. It |
| | may specify which values to display. |
+-----------------+--------------------------------------------------------+
| state | One of "normal", "readonly", or "disabled". In the |
| | "readonly" state, the value may not be edited directly,|
| | and the user can only select one of the values from the|
| | dropdown list. In the "normal" state, the text field is|
| | directly editable. In the "disabled" state, no |
| | interaction is possible. |
+-----------------+--------------------------------------------------------+
| textvariable | Specifies a name whose value is linked to the widget |
| | value. Whenever the value associated with that name |
| | changes, the widget value is updated, and vice versa. |
| | See Tkinter.StringVar. |
+-----------------+--------------------------------------------------------+
| values | Specifies the list of values to display in the |
| | drop-down listbox. |
+-----------------+--------------------------------------------------------+
| width | Specifies an integer value indicating the desired width|
| | of the entry window, in average-size characters of the |
| | widget's font. |
+-----------------+--------------------------------------------------------+
Virtual events
^^^^^^^^^^^^^^
The combobox widget generates a <<ComboboxSelected>> virtual event
when the user selects an element from the list of values.
ttk.Combobox
^^^^^^^^^^^^
Combobox~
current([newindex=None])~
If {newindex} is specified, sets the combobox value to the element
position {newindex}. Otherwise, returns the index of the current value or
-1 if the current value is not in the values list.
get()~
Returns the current value of the combobox.
set(value)~
Sets the value of the combobox to {value}.
Notebook
--------
The Ttk Notebook widget manages a collection of windows and displays a single
one at a time. Each child window is associated with a tab, which the user
may select to change the currently-displayed window.
Options
^^^^^^^
This widget accepts the following specific options:
+---------+----------------------------------------------------------------+
| option | description |
+=========+================================================================+
| height | If present and greater than zero, specifies the desired height |
| | of the pane area (not including internal padding or tabs). |
| | Otherwise, the maximum height of all panes is used. |
+---------+----------------------------------------------------------------+
| padding | Specifies the amount of extra space to add around the outside |
| | of the notebook. The padding is a list of up to four length |
| | specifications: left top right bottom. If fewer than four |
| | elements are specified, bottom defaults to top, right defaults |
| | to left, and top defaults to left. |
+---------+----------------------------------------------------------------+
| width | If present and greater than zero, specifies the desired width |
| | of the pane area (not including internal padding). Otherwise, |
| | the maximum width of all panes is used. |
+---------+----------------------------------------------------------------+
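The defaulting rule for the padding option can be sketched in plain Python.
The function expand_padding() is a hypothetical helper, not part of ttk, and
rejecting an empty list is an assumption of this sketch rather than documented
behavior.

```python
def expand_padding(padding):
    """Expand one to four lengths into (left, top, right, bottom)."""
    values = list(padding)
    if not 1 <= len(values) <= 4:
        # Assumption of this sketch: an empty list is treated as an error.
        raise ValueError("padding takes between one and four lengths")
    left = values[0]
    top = values[1] if len(values) > 1 else left      # top defaults to left
    right = values[2] if len(values) > 2 else left    # right defaults to left
    bottom = values[3] if len(values) > 3 else top    # bottom defaults to top
    return (left, top, right, bottom)


print(expand_padding([5]))
print(expand_padding([5, 10]))
print(expand_padding([1, 2, 3]))
```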
Tab Options
^^^^^^^^^^^
There are also specific options for tabs:
+-----------+--------------------------------------------------------------+
| option | description |
+===========+==============================================================+
| state | Either "normal", "disabled" or "hidden". If "disabled", then |
| | the tab is not selectable. If "hidden", then the tab is not |
| | shown. |
+-----------+--------------------------------------------------------------+
| sticky | Specifies how the child window is positioned within the pane |
| | area. Value is a string containing zero or more of the |
| | characters "n", "s", "e" or "w". Each letter refers to a |
| | side (north, south, east or west) that the child window will |
| | stick to, as per the grid geometry manager. |
+-----------+--------------------------------------------------------------+
| padding | Specifies the amount of extra space to add between the |
| | notebook and this pane. Syntax is the same as for the option |
| | padding used by this widget. |
+-----------+--------------------------------------------------------------+
| text | Specifies a text to be displayed in the tab. |
+-----------+--------------------------------------------------------------+
| image | Specifies an image to display in the tab. See the option |
| | image described in Widget. |
+-----------+--------------------------------------------------------------+
| compound | Specifies how to display the image relative to the text, in |
| | the case both text and image options are present. See |
| | Label Options for legal values. |
+-----------+--------------------------------------------------------------+
| underline | Specifies the index (0-based) of a character to underline in |
| | the text string. The underlined character is used for |
| | mnemonic activation if Notebook.enable_traversal is |
| | called. |
+-----------+--------------------------------------------------------------+
Tab Identifiers
^^^^^^^^^^^^^^^
The {tab_id} present in several methods of ttk.Notebook may take any
of the following forms:
* An integer between zero and the number of tabs.
* The name of a child window.
* A positional specification of the form "@x,y", which identifies the tab.
* The literal string "current", which identifies the currently-selected tab.
* The literal string "end", which returns the number of tabs (only valid for
Notebook.index).
Virtual Events
^^^^^^^^^^^^^^
This widget generates a <<NotebookTabChanged>> virtual event after a new
tab is selected.
ttk.Notebook
^^^^^^^^^^^^
Notebook~
add(child, **kw)~
Adds a new tab to the notebook.
If {child} is currently managed by the notebook but hidden, it is
restored to its previous position.
See Tab Options for the list of available options.
forget(tab_id)~
Removes the tab specified by {tab_id}, unmaps and unmanages the
associated window.
hide(tab_id)~
Hides the tab specified by {tab_id}.
The tab will not be displayed, but the associated window remains
managed by the notebook and its configuration remembered. Hidden tabs
may be restored with the add command.
identify(x, y)~
Returns the name of the tab element at position {x}, {y}, or the empty
string if none.
index(tab_id)~
Returns the numeric index of the tab specified by {tab_id}, or the total
number of tabs if {tab_id} is the string "end".
insert(pos, child, **kw)~
Inserts a pane at the specified position.
{pos} is either the string "end", an integer index, or the name of a
managed child. If {child} is already managed by the notebook, moves it to
the specified position.
See Tab Options for the list of available options.
select([tab_id])~
Selects the specified {tab_id}.
The associated child window will be displayed, and the
previously-selected window (if different) is unmapped. If {tab_id} is
omitted, returns the widget name of the currently selected pane.
tab(tab_id[, option=None[, **kw]])~
Query or modify the options of the specific {tab_id}.
If {kw} is not given, returns a dictionary of the tab option values. If
{option} is specified, returns the value of that {option}. Otherwise,
sets the options to the corresponding values.
tabs()~
Returns a list of windows managed by the notebook.
enable_traversal()~
Enable keyboard traversal for a toplevel window containing this notebook.
This will extend the bindings for the toplevel window containing the
notebook as follows:
* Control-Tab: selects the tab following the currently selected one.
* Shift-Control-Tab: selects the tab preceding the currently selected one.
* Alt-K: where K is the mnemonic (underlined) character of any tab, will
select that tab.
Multiple notebooks in a single toplevel may be enabled for traversal,
including nested notebooks. However, notebook traversal only works
properly if all panes have the notebook they are in as master.
Progressbar
-----------
The ttk.Progressbar widget shows the status of a long-running
operation. It can operate in two modes: determinate mode shows the amount
completed relative to the total amount of work to be done, and indeterminate
mode provides an animated display to let the user know that something is
happening.
Options
^^^^^^^
This widget accepts the following specific options:
+----------+---------------------------------------------------------------+
| option | description |
+==========+===============================================================+
| orient | One of "horizontal" or "vertical". Specifies the orientation |
| | of the progress bar. |
+----------+---------------------------------------------------------------+
| length | Specifies the length of the long axis of the progress bar |
| | (width if horizontal, height if vertical). |
+----------+---------------------------------------------------------------+
| mode | One of "determinate" or "indeterminate". |
+----------+---------------------------------------------------------------+
| maximum | A number specifying the maximum value. Defaults to 100. |
+----------+---------------------------------------------------------------+
| value | The current value of the progress bar. In "determinate" mode, |
| | this represents the amount of work completed. In |
| | "indeterminate" mode, it is interpreted as modulo {maximum}; |
| | that is, the progress bar completes one "cycle" when its value|
| | increases by {maximum}. |
+----------+---------------------------------------------------------------+
| variable | A name which is linked to the option value. If specified, the |
| | value of the progress bar is automatically set to the value of|
| | this name whenever the latter is modified. |
+----------+---------------------------------------------------------------+
| phase | Read-only option. The widget periodically increments the value|
| | of this option whenever its value is greater than 0 and, in |
| | determinate mode, less than maximum. This option may be used |
| | by the current theme to provide additional animation effects. |
+----------+---------------------------------------------------------------+
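The modulo interpretation of value in "indeterminate" mode can be illustrated
with a tiny model. The function cycle_position() is a hypothetical helper that
only mirrors the arithmetic described in the table; it does not touch a real
Progressbar.

```python
def cycle_position(value, maximum=100):
    """Position within the current cycle for an indeterminate bar."""
    return value % maximum


# Increasing the value by one full `maximum` returns to the same position,
# which is what "completes one cycle" means in the table above.
for value in (30, 130, 250):
    print(cycle_position(value))
```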
ttk.Progressbar
^^^^^^^^^^^^^^^
Progressbar~
start([interval])~
Begin autoincrement mode: schedules a recurring timer event that calls
Progressbar.step every {interval} milliseconds. If omitted,
{interval} defaults to 50 milliseconds.
step([amount])~
Increments the progress bar's value by {amount}.
{amount} defaults to 1.0 if omitted.
stop()~
Stop autoincrement mode: cancels any recurring timer event initiated by
Progressbar.start for this progress bar.
Separator
---------
The ttk.Separator widget displays a horizontal or vertical separator
bar.
It has no other methods besides the ones inherited from ttk.Widget.
Options
^^^^^^^
This widget accepts the following specific option:
+--------+----------------------------------------------------------------+
| option | description |
+========+================================================================+
| orient | One of "horizontal" or "vertical". Specifies the orientation of|
| | the separator. |
+--------+----------------------------------------------------------------+
Sizegrip
--------
The ttk.Sizegrip widget (also known as a grow box) allows the user to
resize the containing toplevel window by pressing and dragging the grip.
This widget has neither specific options nor specific methods, besides the
ones inherited from ttk.Widget.
Platform-specific notes
^^^^^^^^^^^^^^^^^^^^^^^
* On Mac OS X, toplevel windows automatically include a built-in size grip
by default. Adding a Sizegrip is harmless, since the built-in
grip will just mask the widget.
Bugs
^^^^
* If the containing toplevel's position was specified relative to the right
or bottom of the screen (e.g. ....), the Sizegrip widget will
not resize the window.
* This widget supports only "southeast" resizing.
Treeview
--------
The ttk.Treeview widget displays a hierarchical collection of items.
Each item has a textual label, an optional image, and an optional list of data
values. The data values are displayed in successive columns after the tree
label.
The order in which data values are displayed may be controlled by setting
the widget option ``displaycolumns``. The tree widget can also display column
headings. Columns may be accessed by number or symbolic names listed in the
widget option columns. See Column Identifiers.
Each item is identified by a unique name. The widget will generate item IDs
if they are not supplied by the caller. There is a distinguished root item,
named ``{}``. The root item itself is not displayed; its children appear at the
top level of the hierarchy.
Each item also has a list of tags, which can be used to associate event bindings
with individual items and control the appearance of the item.
The Treeview widget supports horizontal and vertical scrolling, according to
the options described in Scrollable Widget Options and the methods
Treeview.xview and Treeview.yview.
Options
^^^^^^^
This widget accepts the following specific options:
+----------------+--------------------------------------------------------+
| option | description |
+================+========================================================+
| columns | A list of column identifiers, specifying the number of |
| | columns and their names. |
+----------------+--------------------------------------------------------+
| displaycolumns | A list of column identifiers (either symbolic or |
| | integer indices) specifying which data columns are |
| | displayed and the order in which they appear, or the |
| | string "#all". |
+----------------+--------------------------------------------------------+
| height | Specifies the number of rows which should be visible. |
| | Note: the requested width is determined from the sum |
| | of the column widths. |
+----------------+--------------------------------------------------------+
| padding | Specifies the internal padding for the widget. The |
| | padding is a list of up to four length specifications. |
+----------------+--------------------------------------------------------+
| selectmode | Controls how the built-in class bindings manage the |
| | selection. One of "extended", "browse" or "none". |
| | If set to "extended" (the default), multiple items may |
| | be selected. If "browse", only a single item will be |
| | selected at a time. If "none", the selection will not |
| | be changed. |
| | |
| | Note that the application code and tag bindings can set|
| | the selection however they wish, regardless of the |
| | value of this option. |
+----------------+--------------------------------------------------------+
| show | A list containing zero or more of the following values,|
| | specifying which elements of the tree to display. |
| | |
| | * tree: display tree labels in column #0. |
| | * headings: display the heading row. |
| | |
| | The default is "tree headings", i.e., show all |
| | elements. |
| | |
| | Note: Column #0 always refers to the tree column, |
| | even if show="tree" is not specified. |
+----------------+--------------------------------------------------------+
Item Options
^^^^^^^^^^^^
The following item options may be specified for items in the insert and item
widget commands.
+--------+---------------------------------------------------------------+
| option | description |
+========+===============================================================+
| text | The textual label to display for the item. |
+--------+---------------------------------------------------------------+
| image | A Tk Image, displayed to the left of the label. |
+--------+---------------------------------------------------------------+
| values | The list of values associated with the item. |
| | |
| | Each item should have the same number of values as the widget |
| | option columns. If there are fewer values than columns, the |
| | remaining values are assumed empty. If there are more values |
| | than columns, the extra values are ignored. |
+--------+---------------------------------------------------------------+
| open | True/False value indicating whether the item's children should|
| | be displayed or hidden. |
+--------+---------------------------------------------------------------+
| tags | A list of tags associated with this item. |
+--------+---------------------------------------------------------------+
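The rule for the values option (missing values are treated as empty, extra
values are ignored) can be sketched in plain Python. This is only a model of
the documented behaviour, not code from ttk; the helper name visible_values
and the sample column names are hypothetical.

```python
def visible_values(values, columns):
    """Return one value per column: pad with '' or drop the extras."""
    values = list(values)[:len(columns)]          # extra values are ignored
    return values + [""] * (len(columns) - len(values))  # missing ones are empty

columns = ("size", "modified")                    # hypothetical columns option
short_row = visible_values(["4 KB"], columns)
long_row = visible_values(["4 KB", "2010-09-01", "extra"], columns)
```

Here short_row gains one empty value and long_row loses its third value, so
both end up with exactly one value per column.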
Tag Options
^^^^^^^^^^^
The following options may be specified on tags:
+------------+-----------------------------------------------------------+
| option | description |
+============+===========================================================+
| foreground | Specifies the text foreground color. |
+------------+-----------------------------------------------------------+
| background | Specifies the cell or item background color. |
+------------+-----------------------------------------------------------+
| font | Specifies the font to use when drawing text. |
+------------+-----------------------------------------------------------+
| image | Specifies the item image, in case the item's image option |
| | is empty. |
+------------+-----------------------------------------------------------+
Column Identifiers
^^^^^^^^^^^^^^^^^^
Column identifiers take any of the following forms:
* A symbolic name from the list of columns option.
* An integer n, specifying the nth data column.
* A string of the form #n, where n is an integer, specifying the nth display
column.
Notes:
* Item's option values may be displayed in a different order than the order
in which they are stored.
* Column #0 always refers to the tree column, even if show="tree" is not
specified.
A data column number is an index into an item's option values list; a display
column number is the column number in the tree where the values are displayed.
Tree labels are displayed in column #0. If option displaycolumns is not set,
then data column n is displayed in column #n+1. Again, column #0 always
refers to the tree column.
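The three identifier forms can be illustrated with a small resolver. It is a
sketch that assumes the displaycolumns option is not set (so display column
#n shows data column n-1); resolve_column is a hypothetical helper, not part
of ttk.

```python
def resolve_column(identifier, columns):
    """Map a column identifier to a data column index.

    Returns None for the tree column (#0). Assumes displaycolumns is unset.
    """
    if isinstance(identifier, int):
        return identifier                          # integer n: nth data column
    if identifier.startswith("#"):
        display = int(identifier[1:])              # "#n": nth display column
        return None if display == 0 else display - 1
    return list(columns).index(identifier)         # symbolic column name

columns = ("size", "modified")                     # hypothetical columns option
```

With these columns, "#1", 0 and "size" all resolve to data column 0, while
"#0" resolves to the tree column.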
Virtual Events
^^^^^^^^^^^^^^
The Treeview widget generates the following virtual events.
+--------------------+--------------------------------------------------+
| event | description |
+====================+==================================================+
| <<TreeviewSelect>> | Generated whenever the selection changes. |
+--------------------+--------------------------------------------------+
| <<TreeviewOpen>> | Generated just before setting the focus item to  |
| | open=True. |
+--------------------+--------------------------------------------------+
| <<TreeviewClose>> | Generated just after setting the focus item to |
| | open=False. |
+--------------------+--------------------------------------------------+
The Treeview.focus and Treeview.selection methods can be used
to determine the affected item or items.
ttk.Treeview
^^^^^^^^^^^^
Treeview~
bbox(item[, column=None])~
Returns the bounding box (relative to the treeview widget's window) of
the specified {item} in the form (x, y, width, height).
If {column} is specified, returns the bounding box of that cell. If the
{item} is not visible (i.e., if it is a descendant of a closed item or is
scrolled offscreen), returns an empty string.
get_children([item])~
Returns the list of children belonging to {item}.
If {item} is not specified, returns root children.
set_children(item, *newchildren)~
Replaces {item}'s children with {newchildren}.
Children present in {item} that are not present in {newchildren} are
detached from the tree. No items in {newchildren} may be an ancestor of
{item}. Note that not specifying {newchildren} results in detaching
{item}'s children.
column(column[, option=None[, **kw]])~
Query or modify the options for the specified {column}.
If {kw} is not given, returns a dict of the column option values. If
{option} is specified then the value for that {option} is returned.
Otherwise, sets the options to the corresponding values.
The valid options/values are:
* id
Returns the column name. This is a read-only option.
* anchor: One of the standard Tk anchor values.
Specifies how the text in this column should be aligned with respect
to the cell.
* minwidth: width
The minimum width of the column in pixels. The treeview widget will
not make the column any smaller than specified by this option when
the widget is resized or the user drags a column.
* stretch: True/False
Specifies whether the column's width should be adjusted when
the widget is resized.
* width: width
The width of the column in pixels.
To configure the tree column, call this with column = "#0".
delete(*items)~
Delete all specified {items} and all their descendants.
The root item may not be deleted.
detach(*items)~
Unlinks all of the specified {items} from the tree.
The items and all of their descendants are still present, and may be
reinserted at another point in the tree, but will not be displayed.
The root item may not be detached.
exists(item)~
Returns True if the specified {item} is present in the tree.
focus([item=None])~
If {item} is specified, sets the focus item to {item}. Otherwise, returns
the current focus item, or '' if there is none.
heading(column[, option=None[, **kw]])~
Query or modify the heading options for the specified {column}.
If {kw} is not given, returns a dict of the heading option values. If
{option} is specified then the value for that {option} is returned.
Otherwise, sets the options to the corresponding values.
The valid options/values are:
* text: text
The text to display in the column heading.
* image: imageName
Specifies an image to display to the right of the column heading.
* anchor: anchor
Specifies how the heading text should be aligned. One of the standard
Tk anchor values.
* command: callback
A callback to be invoked when the heading label is pressed.
To configure the tree column heading, call this with column = "#0".
identify(component, x, y)~
Returns a description of the specified {component} under the point given
by {x} and {y}, or the empty string if no such {component} is present at
that position.
identify_row(y)~
Returns the item ID of the item at position {y}.
identify_column(x)~
Returns the data column identifier of the cell at position {x}.
The tree column has ID #0.
identify_region(x, y)~
Returns one of:
+-----------+--------------------------------------+
| region | meaning |
+===========+======================================+
| heading | Tree heading area. |
+-----------+--------------------------------------+
| separator | Space between two column headings.  |
+-----------+--------------------------------------+
| tree | The tree area. |
+-----------+--------------------------------------+
| cell | A data cell. |
+-----------+--------------------------------------+
Availability: Tk 8.6.
identify_element(x, y)~
Returns the element at position {x}, {y}.
Availability: Tk 8.6.
index(item)~
Returns the integer index of {item} within its parent's list of children.
insert(parent, index[, iid=None[, **kw]])~
Creates a new item and returns the item identifier of the newly created
item.
{parent} is the item ID of the parent item, or the empty string to create
a new top-level item. {index} is an integer, or the value "end",
specifying where in the list of parent's children to insert the new item.
If {index} is less than or equal to zero, the new node is inserted at
the beginning; if {index} is greater than or equal to the current number
of children, it is inserted at the end. If {iid} is specified, it is used
as the item identifier; {iid} must not already exist in the tree.
Otherwise, a new unique identifier is generated.
See `Item Options`_ for the list of available options.
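The way {index} is clamped can be modelled as follows. This is a sketch of
the rule stated above, with a hypothetical helper name; it does not call Tk.

```python
def insertion_index(index, child_count):
    """Return the position at which a new child would be inserted."""
    if index == "end" or index >= child_count:
        return child_count        # at or past the end: append
    return max(index, 0)          # zero or negative: insert at the beginning
```

For a parent with four children, insertion_index(-3, 4) gives 0 and
insertion_index(99, 4) gives 4, the same as passing "end".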
item(item[, option[, **kw]])~
Query or modify the options for the specified {item}.
If no options are given, a dict with options/values for the item is
returned.
If {option} is specified then the value for that option is returned.
Otherwise, sets the options to the corresponding values as given by {kw}.
move(item, parent, index)~
Moves {item} to position {index} in {parent}'s list of children.
It is illegal to move an item under one of its descendants. If {index} is
less than or equal to zero, {item} is moved to the beginning; if greater
than or equal to the number of children, it is moved to the end. If {item}
was detached it is reattached.
next(item)~
Returns the identifier of {item}'s next sibling, or '' if {item} is the
last child of its parent.
parent(item)~
Returns the ID of the parent of {item}, or '' if {item} is at the top
level of the hierarchy.
prev(item)~
Returns the identifier of {item}'s previous sibling, or '' if {item} is
the first child of its parent.
reattach(item, parent, index)~
An alias for Treeview.move.
see(item)~
Ensure that {item} is visible.
Sets all of {item}'s ancestors' open option to True, and scrolls the
widget if necessary so that {item} is within the visible portion of
the tree.
selection([selop=None[, items=None]])~
If {selop} is not specified, returns selected items. Otherwise, it will
act according to the following selection methods.
selection_set(items)~
{items} becomes the new selection.
selection_add(items)~
Add {items} to the selection.
selection_remove(items)~
Remove {items} from the selection.
selection_toggle(items)~
Toggle the selection state of each item in {items}.
set(item[, column=None[, value=None]])~
With one argument, returns a dictionary of column/value pairs for the
specified {item}. With two arguments, returns the current value of the
specified {column}. With three arguments, sets the value of given
{column} in given {item} to the specified {value}.
tag_bind(tagname[, sequence=None[, callback=None]])~
Bind a callback for the given event {sequence} to the tag {tagname}.
When an event is delivered to an item, the callbacks for each of the
item's tags option are called.
tag_configure(tagname[, option=None[, **kw]])~
Query or modify the options for the specified {tagname}.
If {kw} is not given, returns a dict of the option settings for
{tagname}. If {option} is specified, returns the value for that {option}
for the specified {tagname}. Otherwise, sets the options to the
corresponding values for the given {tagname}.
tag_has(tagname[, item])~
If {item} is specified, returns 1 or 0 depending on whether the specified
{item} has the given {tagname}. Otherwise, returns a list of all items
that have the specified tag.
Availability: Tk 8.6
xview(*args)~
Query or modify horizontal position of the treeview.
yview(*args)~
Query or modify vertical position of the treeview.
Ttk Styling
-----------
Each widget in ttk (|py2stdlib-ttk|) is assigned a style, which specifies the set of
elements making up the widget and how they are arranged, along with dynamic and
default settings for element options. By default the style name is the same as
the widget's class name, but it may be overridden by the widget's style
option. If the class name of a widget is unknown, use the method
Misc.winfo_class (somewidget.winfo_class()).
.. seealso::
`Tcl'2004 conference presentation <http://tktable.sourceforge.net/tile/tile-tcl2004.pdf>`_
This document explains how the theme engine works
Style~
This class is used to manipulate the style database.
configure(style, query_opt=None, **kw)~
Query or set the default value of the specified option(s) in {style}.
Each key in {kw} is an option and each value is a string identifying
the value for that option.
For example, to change every default button to be a flat button with some
padding and a different background color do:: >
import ttk
import Tkinter
root = Tkinter.Tk()
ttk.Style().configure("TButton", padding=6, relief="flat",
background="#ccc")
btn = ttk.Button(text="Sample")
btn.pack()
root.mainloop()
<
map(style, query_opt=None, **kw)~
Query or set dynamic values of the specified option(s) in {style}.
Each key in {kw} is an option and each value should be a list or a
tuple (usually) containing statespecs grouped in tuples, lists, or
something else of your preference. A statespec is a compound of one
or more states and then a value.
An example:: >
import Tkinter
import ttk
root = Tkinter.Tk()
style = ttk.Style()
style.map("C.TButton",
foreground=[('pressed', 'red'), ('active', 'blue')],
background=[('pressed', '!disabled', 'black'), ('active', 'white')]
)
colored_btn = ttk.Button(text="Test", style="C.TButton")
colored_btn.pack()
root.mainloop()
<
Note that the order of the (states, value) sequences for an
option matters. In the previous example, if you change the
order to ``[('active', 'blue'), ('pressed', 'red')]`` in the
foreground option, for example, you would get a blue foreground
when the widget is in the active or pressed states.
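Why the order matters can be seen in a small first-match model of statespec
lookup. This is a simplified sketch, assuming the first matching statespec
wins and that a leading "!" means the state must be absent; lookup_dynamic is
a hypothetical helper and not how ttk is implemented internally.

```python
def lookup_dynamic(statespecs, widget_states):
    """Return the value of the first statespec matched by widget_states."""
    for spec in statespecs:
        states, value = spec[:-1], spec[-1]
        matched = all(
            (s[1:] not in widget_states) if s.startswith("!")
            else (s in widget_states)
            for s in states)
        if matched:
            return value
    return None                                   # no statespec applies

current = {"active", "pressed"}                   # widget is hovered and pressed
first = lookup_dynamic([("pressed", "red"), ("active", "blue")], current)
swapped = lookup_dynamic([("active", "blue"), ("pressed", "red")], current)
```

With the original order the pressed widget resolves to "red"; with the
swapped order "active" matches first and the result is "blue".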
lookup(style, option[, state=None[, default=None]])~
Returns the value specified for {option} in {style}.
If {state} is specified, it is expected to be a sequence of one or more
states. If the {default} argument is set, it is used as a fallback value
in case no specification for option is found.
To check what font a Button uses by default, do:: >
import ttk
print ttk.Style().lookup("TButton", "font")
<
layout(style[, layoutspec=None])~
Define the widget layout for given {style}. If {layoutspec} is omitted,
return the layout specification for given style.
{layoutspec}, if specified, is expected to be a list or some other
sequence type (excluding strings), where each item should be a tuple and
the first item is the layout name and the second item should have the
format described in `Layouts`_.
To understand the format, see the following example (it is not
intended to do anything useful):: >
import ttk
import Tkinter
root = Tkinter.Tk()
style = ttk.Style()
style.layout("TMenubutton", [
("Menubutton.background", None),
("Menubutton.button", {"children":
[("Menubutton.focus", {"children":
[("Menubutton.padding", {"children":
[("Menubutton.label", {"side": "left", "expand": 1})]
})]
})]
}),
])
mbtn = ttk.Menubutton(text='Text')
mbtn.pack()
root.mainloop()
<
element_create(elementname, etype, *args, **kw)~
Create a new element in the current theme, of the given {etype} which is
expected to be either "image", "from" or "vsapi". The latter is only
available in Tk 8.6a for Windows XP and Vista and is not described here.
If "image" is used, {args} should contain the default image name followed
by statespec/value pairs (this is the imagespec), and {kw} may have the
following options:
* border=padding
padding is a list of up to four integers, specifying the left, top,
right, and bottom borders, respectively.
* height=height
Specifies a minimum height for the element. If less than zero, the
base image's height is used as a default.
* padding=padding
Specifies the element's interior padding. Defaults to border's value
if not specified.
* sticky=spec
Specifies how the image is placed within the final parcel. spec
contains zero or more characters “n”, “s”, “w”, or “e”.
* width=width
Specifies a minimum width for the element. If less than zero, the
base image's width is used as a default.
If "from" is used as the value of {etype},
element_create will clone an existing
element. {args} is expected to contain a themename, from which
the element will be cloned, and optionally an element to clone from.
If this element to clone from is not specified, an empty element will
be used. {kw} is discarded.
element_names()~
Returns the list of elements defined in the current theme.
element_options(elementname)~
Returns the list of {elementname}'s options.
theme_create(themename[, parent=None[, settings=None]])~
Create a new theme.
It is an error if {themename} already exists. If {parent} is specified,
the new theme will inherit styles, elements and layouts from the parent
theme. If {settings} are present they are expected to have the same
syntax used for theme_settings.
theme_settings(themename, settings)~
Temporarily sets the current theme to {themename}, applies the specified
{settings} and then restores the previous theme.
Each key in {settings} is a style and each value may contain the keys
'configure', 'map', 'layout' and 'element create' and they are expected
to have the same format as specified by the methods
Style.configure, Style.map, Style.layout and
Style.element_create respectively.
As an example, let's change the Combobox for the default theme a bit:: >
import ttk
import Tkinter
root = Tkinter.Tk()
style = ttk.Style()
style.theme_settings("default", {
"TCombobox": {
"configure": {"padding": 5},
"map": {
"background": [("active", "green2"),
("!disabled", "green4")],
"fieldbackground": [("!disabled", "green3")],
"foreground": [("focus", "OliveDrab1"),
("!disabled", "OliveDrab2")]
}
}
})
combo = ttk.Combobox()
combo.pack()
root.mainloop()
<
theme_names()~
Returns a list of all known themes.
theme_use([themename])~
If {themename} is not given, returns the theme in use. Otherwise, sets
the current theme to {themename}, refreshes all widgets and emits a
<<ThemeChanged>> event.
Layouts
^^^^^^^
A layout can be just None, if it takes no options, or a dict of
options specifying how to arrange the element. The layout mechanism
uses a simplified version of the pack geometry manager: given an
initial cavity, each element is allocated a parcel. Valid
options/values are:
* side: whichside
Specifies which side of the cavity to place the element; one of
top, right, bottom or left. If omitted, the element occupies the
entire cavity.
* sticky: nswe
Specifies where the element is placed inside its allocated parcel.
* unit: 0 or 1
If set to 1, causes the element and all of its descendants to be treated as
a single element for the purposes of Widget.identify et al. It's
used for things like scrollbar thumbs with grips.
* children: [sublayout... ]
Specifies a list of elements to place inside the element. Each
element is a tuple (or other sequence type) where the first item is
the layout name, and the other is a layout as described in this section.
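Because a layout specification nests through the children option, it can be
walked recursively. The sketch below only inspects the data structure and
does not need Tk; element_names is a hypothetical helper and the sample
specification is a shortened variant of the Menubutton example shown for
Style.layout.

```python
def element_names(layoutspec, depth=0):
    """Yield (depth, name) for every element in a layout specification."""
    for name, options in layoutspec:
        yield depth, name
        children = (options or {}).get("children", [])   # None means no options
        for item in element_names(children, depth + 1):
            yield item

spec = [
    ("Menubutton.background", None),
    ("Menubutton.button", {"children": [
        ("Menubutton.focus", {"children": [
            ("Menubutton.label", {"side": "left", "expand": 1}),
        ]}),
    ]}),
]
outline = list(element_names(spec))
```

The resulting outline lists each element with its nesting depth, which can
help when reading a specification returned by Style.layout.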
==============================================================================
*py2stdlib-tty*
tty~
:platform: Unix
:synopsis: Utility functions that perform common terminal control operations.
The tty (|py2stdlib-tty|) module defines functions for putting the tty into cbreak and raw
modes.
Because it requires the termios (|py2stdlib-termios|) module, it will work only on Unix.
The tty (|py2stdlib-tty|) module defines the following functions:
setraw(fd[, when])~
Change the mode of the file descriptor {fd} to raw. If {when} is omitted, it
defaults to termios.TCSAFLUSH, and is passed to
termios.tcsetattr.
setcbreak(fd[, when])~
Change the mode of file descriptor {fd} to cbreak. If {when} is omitted, it
defaults to termios.TCSAFLUSH, and is passed to
termios.tcsetattr.
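A minimal sketch of both functions, assuming a Unix system where
pty.openpty() is available so that no real terminal is modified. The helper
name lflags_after is hypothetical; it applies a mode function to a fresh
pseudo-terminal and returns the resulting local-mode flags.

```python
import os
import pty
import termios
import tty

def lflags_after(mode_func):
    """Apply mode_func to a throwaway pty and return its local-mode flags."""
    master, slave = pty.openpty()
    try:
        mode_func(slave)                       # tty.setraw or tty.setcbreak
        return termios.tcgetattr(slave)[3]     # index 3 holds the lflag field
    finally:
        os.close(master)
        os.close(slave)

raw_lflag = lflags_after(tty.setraw)
cbreak_lflag = lflags_after(tty.setcbreak)
```

In both modes the ICANON and ECHO bits are cleared, so input is delivered
without line buffering or echo; raw mode additionally clears ISIG, so keys
such as Ctrl-C no longer generate signals.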
.. seealso::
Module termios (|py2stdlib-termios|)
Low-level terminal control interface.
==============================================================================
*py2stdlib-turtle*
turtle~
:synopsis: Turtle graphics for Tk
.. testsetup:: default
from turtle import *
turtle = Turtle()
Introduction
============
Turtle graphics is a popular way for introducing programming to kids. It was
part of the original Logo programming language developed by Wally Feurzeig and
Seymour Papert in 1966.
Imagine a robotic turtle starting at (0, 0) in the x-y plane. Give it the
command ``turtle.forward(15)``, and it moves (on-screen!) 15 pixels in the
direction it is facing, drawing a line as it moves. Give it the command
``turtle.left(25)``, and it rotates in-place 25 degrees counterclockwise.
By combining together these and similar commands, intricate shapes and pictures
can easily be drawn.
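The geometry behind these commands can be modelled without a graphics window.
The class below is a pure-Python stand-in, assuming standard mode (heading 0
points east and left turns are counterclockwise); ModelTurtle is a
hypothetical name and is not part of the turtle module.

```python
import math

class ModelTurtle(object):
    """Tracks position and heading only; nothing is drawn."""

    def __init__(self):
        self.x, self.y, self.heading = 0.0, 0.0, 0.0

    def forward(self, distance):
        # Move along the current heading.
        self.x += distance * math.cos(math.radians(self.heading))
        self.y += distance * math.sin(math.radians(self.heading))

    def left(self, angle):
        # Counterclockwise turn in standard mode.
        self.heading = (self.heading + angle) % 360.0

    def right(self, angle):
        self.left(-angle)

t = ModelTurtle()
t.forward(15)
t.left(25)
t.forward(15)
```

After these three commands the model turtle has heading 25.0 and a positive
y coordinate, because the second move points up and to the right.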
The turtle (|py2stdlib-turtle|) module is an extended reimplementation of the same-named
module from the Python standard distribution up to version Python 2.5.
It tries to keep the merits of the old turtle module and to be (nearly) 100%
compatible with it. This means in the first place to enable the learning
programmer to use all the commands, classes and methods interactively when using
the module from within IDLE run with the ``-n`` switch.
The turtle module provides turtle graphics primitives, in both object-oriented
and procedure-oriented ways. Because it uses Tkinter (|py2stdlib-tkinter|) for the underlying
graphics, it needs a version of Python installed with Tk support.
The object-oriented interface uses essentially two+two classes:
1. The TurtleScreen class defines graphics windows as a playground for
the drawing turtles. Its constructor needs a Tkinter.Canvas or a
ScrolledCanvas as argument. It should be used when turtle (|py2stdlib-turtle|) is
used as part of some application.
The function Screen returns a singleton object of a
TurtleScreen subclass. This function should be used when
turtle (|py2stdlib-turtle|) is used as a standalone tool for doing graphics.
As a singleton object, inheriting from its class is not possible.
All methods of TurtleScreen/Screen also exist as functions, i.e. as part of
the procedure-oriented interface.
2. RawTurtle (alias: RawPen) defines Turtle objects which draw
on a TurtleScreen. Its constructor needs a Canvas, ScrolledCanvas
or TurtleScreen as argument, so the RawTurtle objects know where to draw.
Derived from RawTurtle is the subclass Turtle (alias: Pen),
which draws on "the" Screen instance, which is automatically
created if not already present.
All methods of RawTurtle/Turtle also exist as functions, i.e. part of the
procedure-oriented interface.
The procedural interface provides functions which are derived from the methods
of the classes Screen and Turtle. They have the same names as
the corresponding methods. A screen object is automatically created whenever a
function derived from a Screen method is called. An (unnamed) turtle object is
automatically created whenever any of the functions derived from a Turtle method
is called.
To use multiple turtles on a screen one has to use the object-oriented interface.
.. note::
In the following documentation the argument list for functions is given.
Methods, of course, have the additional first argument {self} which is
omitted here.
Overview of available Turtle and Screen methods
===============================================
Turtle methods
--------------
Turtle motion
Move and draw
| forward | fd
| backward | bk | back
| right | rt
| left | lt
| goto | setpos | setposition
| setx
| sety
| setheading | seth
| home
| circle
| dot
| stamp
| clearstamp
| clearstamps
| undo
| speed
Tell Turtle's state
| position | pos
| towards
| xcor
| ycor
| heading
| distance
Settings for measurement
| degrees
| radians
Pen control
Drawing state
| pendown | pd | down
| penup | pu | up
| pensize | width
| pen
| isdown
Color control
| color
| pencolor
| fillcolor
Filling
| fill
| begin_fill
| end_fill
More drawing control
| reset
| clear
| write
Turtle state
Visibility
| showturtle | st
| hideturtle | ht
| isvisible
Appearance
| shape
| resizemode
| shapesize | turtlesize
| settiltangle
| tiltangle
| tilt
Using events
| onclick
| onrelease
| ondrag
Special Turtle methods
| begin_poly
| end_poly
| get_poly
| clone
| getturtle | getpen
| getscreen
| setundobuffer
| undobufferentries
| tracer
| window_width
| window_height
Methods of TurtleScreen/Screen
------------------------------
Window control
| bgcolor
| bgpic
| clear | clearscreen
| reset | resetscreen
| screensize
| setworldcoordinates
Animation control
| delay
| tracer
| update
Using screen events
| listen
| onkey
| onclick | onscreenclick
| ontimer
Settings and special methods
| mode
| colormode
| getcanvas
| getshapes
| register_shape | addshape
| turtles
| window_height
| window_width
Methods specific to Screen
| bye
| exitonclick
| setup
| title
Methods of RawTurtle/Turtle and corresponding functions
=======================================================
Most of the examples in this section refer to a Turtle instance called
``turtle``.
Turtle motion
-------------
forward(distance)~
fd(distance)
:param distance: a number (integer or float)
Move the turtle forward by the specified {distance}, in the direction the
turtle is headed.
.. doctest:: >
>>> turtle.position()
(0.00,0.00)
>>> turtle.forward(25)
>>> turtle.position()
(25.00,0.00)
>>> turtle.forward(-75)
>>> turtle.position()
(-50.00,0.00)
<
back(distance)~
bk(distance)
backward(distance)
:param distance: a number
Move the turtle backward by {distance}, opposite to the direction the
turtle is headed. Do not change the turtle's heading.
.. doctest::
:hide:
>>> turtle.goto(0, 0)
.. doctest:: >
>>> turtle.position()
(0.00,0.00)
>>> turtle.backward(30)
>>> turtle.position()
(-30.00,0.00)
<
right(angle)~
rt(angle)
:param angle: a number (integer or float)
Turn turtle right by {angle} units. (Units are by default degrees, but
can be set via the degrees and radians functions.) Angle
orientation depends on the turtle mode, see mode.
.. doctest::
:hide:
>>> turtle.setheading(22)
.. doctest:: >
>>> turtle.heading()
22.0
>>> turtle.right(45)
>>> turtle.heading()
337.0
<
left(angle)~
lt(angle)
:param angle: a number (integer or float)
Turn turtle left by {angle} units. (Units are by default degrees, but
can be set via the degrees and radians functions.) Angle
orientation depends on the turtle mode, see mode.
.. doctest::
:hide:
>>> turtle.setheading(22)
.. doctest:: >
>>> turtle.heading()
22.0
>>> turtle.left(45)
>>> turtle.heading()
67.0
<
goto(x, y=None)~
setpos(x, y=None)
setposition(x, y=None)
:param x: a number or a pair/vector of numbers
:param y: a number or ``None``
If {y} is ``None``, {x} must be a pair of coordinates or a Vec2D
(e.g. as returned by pos).
Move turtle to an absolute position. If the pen is down, draw line. Do
not change the turtle's orientation.
.. doctest::
:hide:
>>> turtle.goto(0, 0)
.. doctest:: >
>>> tp = turtle.pos()
>>> tp
(0.00,0.00)
>>> turtle.setpos(60,30)
>>> turtle.pos()
(60.00,30.00)
>>> turtle.setpos((20,80))
>>> turtle.pos()
(20.00,80.00)
>>> turtle.setpos(tp)
>>> turtle.pos()
(0.00,0.00)
<
setx(x)~
:param x: a number (integer or float)
Set the turtle's first coordinate to {x}, leave second coordinate
unchanged.
.. doctest::
:hide:
>>> turtle.goto(0, 240)
.. doctest:: >
>>> turtle.position()
(0.00,240.00)
>>> turtle.setx(10)
>>> turtle.position()
(10.00,240.00)
<
sety(y)~
:param y: a number (integer or float)
Set the turtle's second coordinate to {y}, leave first coordinate unchanged.
.. doctest::
:hide:
>>> turtle.goto(0, 40)
.. doctest:: >
>>> turtle.position()
(0.00,40.00)
>>> turtle.sety(-10)
>>> turtle.position()
(0.00,-10.00)
<
setheading(to_angle)~
seth(to_angle)
:param to_angle: a number (integer or float)
Set the orientation of the turtle to {to_angle}. Here are some common
directions in degrees:
=================== ====================
standard mode logo mode
=================== ====================
0 - east 0 - north
90 - north 90 - east
180 - west 180 - south
270 - south 270 - west
=================== ====================
.. doctest:: >
>>> turtle.setheading(90)
>>> turtle.heading()
90.0
<
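The two columns of the direction table are related by a simple formula. The
sketch below assumes angles in degrees; logo_to_standard is a hypothetical
helper, and because the mapping is its own inverse it also converts from
standard mode to logo mode.

```python
def logo_to_standard(angle):
    """Convert a logo-mode heading (0 = north, clockwise) to standard mode."""
    return (90.0 - angle) % 360.0

# Each logo-mode heading from the table, converted to standard mode.
converted = [logo_to_standard(a) for a in (0, 90, 180, 270)]
```

For example, logo heading 180 (south) corresponds to standard heading 270.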
home()~
Move turtle to the origin -- coordinates (0,0) -- and set its heading to
its start-orientation (which depends on the mode, see mode).
.. doctest::
:hide:
>>> turtle.setheading(90)
>>> turtle.goto(0, -10)
.. doctest:: >
>>> turtle.heading()
90.0
>>> turtle.position()
(0.00,-10.00)
>>> turtle.home()
>>> turtle.position()
(0.00,0.00)
>>> turtle.heading()
0.0
<
circle(radius, extent=None, steps=None)~
:param radius: a number
:param extent: a number (or ``None``)
:param steps: an integer (or ``None``)
Draw a circle with given {radius}. The center is {radius} units left of
the turtle; {extent} -- an angle -- determines which part of the circle
is drawn. If {extent} is not given, draw the entire circle. If {extent}
is not a full circle, one endpoint of the arc is the current pen
position. Draw the arc in counterclockwise direction if {radius} is
positive, otherwise in clockwise direction. Finally the direction of the
turtle is changed by the amount of {extent}.
As the circle is approximated by an inscribed regular polygon, {steps}
determines the number of steps to use. If not given, it will be
calculated automatically. May be used to draw regular polygons.
.. doctest:: >
>>> turtle.home()
>>> turtle.position()
(0.00,0.00)
>>> turtle.heading()
0.0
>>> turtle.circle(50)
>>> turtle.position()
(-0.00,0.00)
>>> turtle.heading()
0.0
>>> turtle.circle(120, 180) # draw a semicircle
>>> turtle.position()
(0.00,240.00)
>>> turtle.heading()
180.0
<
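The polygon approximation can be reproduced numerically. This is a sketch
under stated assumptions: the turtle starts at the origin with heading 0 in
standard mode, {radius} is positive, and the step count is passed explicitly
instead of being calculated automatically. circle_path is a hypothetical
helper, not the module's implementation.

```python
import math

def circle_path(radius, extent=360.0, steps=36):
    """Return (vertices, final_heading) of the inscribed polygon arc."""
    x, y, heading = 0.0, 0.0, 0.0
    turn = float(extent) / steps                       # angle per segment
    side = 2.0 * radius * math.sin(math.radians(turn / 2.0))  # chord length
    heading += turn / 2.0                              # aim along first chord
    points = [(x, y)]
    for _ in range(steps):
        x += side * math.cos(math.radians(heading))
        y += side * math.sin(math.radians(heading))
        heading += turn
        points.append((x, y))
    heading -= turn / 2.0                              # undo the half turn
    return points, heading % 360.0

arc, final_heading = circle_path(120, extent=180)      # a semicircle
square, _ = circle_path(50, steps=4)                   # a regular polygon
```

The semicircle ends near (0, 240) with heading 180, matching the doctest
above, and four steps produce a square that closes back on the origin.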
dot(size=None, *color)~
:param size: an integer >= 1 (if given)
:param color: a colorstring or a numeric color tuple
Draw a circular dot with diameter {size}, using {color}. If {size} is
not given, the maximum of pensize+4 and 2*pensize is used.
.. doctest:: >
>>> turtle.home()
>>> turtle.dot()
>>> turtle.fd(50); turtle.dot(20, "blue"); turtle.fd(50)
>>> turtle.position()
(100.00,-0.00)
>>> turtle.heading()
0.0
<
stamp()~
Stamp a copy of the turtle shape onto the canvas at the current turtle
position. Return a stamp_id for that stamp, which can be used to delete
it by calling ``clearstamp(stamp_id)``.
.. doctest:: >
>>> turtle.color("blue")
>>> turtle.stamp()
11
>>> turtle.fd(50)
<
clearstamp(stampid)~
:param stampid: an integer, must be return value of previous
stamp call
Delete stamp with given {stampid}.
.. doctest:: >
>>> turtle.position()
(150.00,-0.00)
>>> turtle.color("blue")
>>> astamp = turtle.stamp()
>>> turtle.fd(50)
>>> turtle.position()
(200.00,-0.00)
>>> turtle.clearstamp(astamp)
>>> turtle.position()
(200.00,-0.00)
<
clearstamps(n=None)~
:param n: an integer (or ``None``)
Delete all or first/last {n} of turtle's stamps. If {n} is None, delete
all stamps, if {n} > 0 delete first {n} stamps, else if {n} < 0 delete
last {n} stamps.
.. doctest:: >
>>> for i in range(8):
... turtle.stamp(); turtle.fd(30)
13
14
15
16
17
18
19
20
>>> turtle.clearstamps(2)
>>> turtle.clearstamps(-2)
>>> turtle.clearstamps()
<
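How {n} selects stamps can be modelled on a plain list of stamp ids. This is
a sketch of the rule described above, using the ids from the doctest;
split_stamps is a hypothetical helper and treats n = 0 as deleting nothing.

```python
def split_stamps(stamp_ids, n=None):
    """Return (deleted, remaining) for clearstamps(n) on a list of ids."""
    if n is None:
        return list(stamp_ids), []                 # delete everything
    if n >= 0:
        return stamp_ids[:n], stamp_ids[n:]        # delete the first n
    return stamp_ids[n:], stamp_ids[:n]            # delete the last |n|

ids = list(range(13, 21))                          # stamps 13..20
first_two, ids = split_stamps(ids, 2)
last_two, ids = split_stamps(ids, -2)
rest, ids = split_stamps(ids)
```

The three calls remove stamps 13 and 14, then 19 and 20, then the remaining
four, leaving no stamps.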
undo()~
Undo (repeatedly) the last turtle action(s). Number of available
undo actions is determined by the size of the undobuffer.
.. doctest:: >
>>> for i in range(4):
... turtle.fd(50); turtle.lt(80)
...
>>> for i in range(8):
... turtle.undo()
<
speed(speed=None)~
:param speed: an integer in the range 0..10 or a speedstring (see below)
Set the turtle's speed to an integer value in the range 0..10. If no
argument is given, return current speed.
If input is a number greater than 10 or smaller than 0.5, speed is set
to 0. Speedstrings are mapped to speedvalues as follows:
* "fastest": 0
* "fast": 10
* "normal": 6
* "slow": 3
* "slowest": 1
Speeds from 1 to 10 enforce increasingly faster animation of line drawing
and turtle turning.
Attention: {speed} = 0 means that {no} animation takes
place. forward/back makes turtle jump and likewise left/right make the
turtle turn instantly.
.. doctest:: >
>>> turtle.speed()
3
>>> turtle.speed('normal')
>>> turtle.speed()
6
>>> turtle.speed(9)
>>> turtle.speed()
9
<
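The mapping from speedstrings and out-of-range numbers to speed values can be
sketched as below. It follows the description in this section; the helper
name normalize_speed is hypothetical, and rounding of in-range non-integer
values is an assumption rather than documented behaviour.

```python
SPEEDSTRINGS = {"fastest": 0, "fast": 10, "normal": 6, "slow": 3, "slowest": 1}

def normalize_speed(speed):
    """Return the integer speed value for a number or a speedstring."""
    if speed in SPEEDSTRINGS:
        return SPEEDSTRINGS[speed]
    if speed > 10 or speed < 0.5:
        return 0                       # out of range: no animation
    return int(round(speed))           # assumption: round in-range values
```

For instance, normalize_speed("normal") gives 6 and normalize_speed(11)
gives 0, which switches animation off.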
Tell Turtle's state
position()~
pos()
Return the turtle's current location (x,y) (as a Vec2D vector).
.. doctest:: >
>>> turtle.pos()
(440.00,-0.00)
<
towards(x, y=None)~
:param x: a number or a pair/vector of numbers or a turtle instance
:param y: a number if {x} is a number, else ``None``
Return the angle of the line from the turtle's position to the position
specified by (x,y), the vector, or the other turtle. The result depends on the
turtle's start orientation, which depends on the mode ("standard"/"world" or
"logo").
.. doctest:: >
>>> turtle.goto(10, 10)
>>> turtle.towards(0,0)
225.0
<
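The angle can be computed with atan2. The sketch below assumes degrees as the
angle unit and plain (x, y) tuples for both positions; the function is a
stand-alone model with a hypothetical signature, not the Turtle method
itself.

```python
import math

def towards(position, target, mode="standard"):
    """Return the heading from position to target, in degrees."""
    dx = target[0] - position[0]
    dy = target[1] - position[1]
    angle = math.degrees(math.atan2(dy, dx)) % 360.0   # standard: 0 = east
    if mode == "logo":
        angle = (90.0 - angle) % 360.0                 # logo: 0 = north
    return angle

to_origin = towards((10, 10), (0, 0))
```

From (10, 10) the origin lies at 225 degrees in standard mode, as in the
doctest above; a target straight up the y axis is at 90 degrees in standard
mode and 0 degrees in logo mode.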
xcor()~
Return the turtle's x coordinate.
.. doctest:: >
>>> turtle.home()
>>> turtle.left(50)
>>> turtle.forward(100)
>>> turtle.pos()
(64.28,76.60)
>>> print turtle.xcor()
64.2787609687
<
ycor()~
Return the turtle's y coordinate.
.. doctest:: >
>>> turtle.home()
>>> turtle.left(60)
>>> turtle.forward(100)
>>> print turtle.pos()
(50.00,86.60)
>>> print turtle.ycor()
86.6025403784
<
heading()~
Return the turtle's current heading (value depends on the turtle mode, see
mode).
.. doctest:: >
>>> turtle.home()
>>> turtle.left(67)
>>> turtle.heading()
67.0
<
distance(x, y=None)~
:param x: a number or a pair/vector of numbers or a turtle instance
:param y: a number if {x} is a number, else ``None``
Return the distance from the turtle to (x,y), the given vector, or the given
other turtle, in turtle step units.
.. doctest:: >
>>> turtle.home()
>>> turtle.distance(30,40)
50.0
>>> turtle.distance((30,40))
50.0
>>> joe = Turtle()
>>> joe.forward(77)
>>> turtle.distance(joe)
77.0
<
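The distance reported above is the ordinary Euclidean distance. A minimal
sketch, using a hypothetical helper ``distance_between`` that takes two
coordinate pairs instead of a turtle:

```python
import math

def distance_between(pos, target):
    """Euclidean distance between two (x, y) pairs, in turtle step units."""
    return math.hypot(target[0] - pos[0], target[1] - pos[1])

d = distance_between((0, 0), (30, 40))  # a 3-4-5 triangle scaled by 10
```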
Settings for measurement
degrees(fullcircle=360.0)~
:param fullcircle: a number
Set angle measurement units, i.e. set number of "degrees" for a full circle.
Default value is 360 degrees.
.. doctest:: >
>>> turtle.home()
>>> turtle.left(90)
>>> turtle.heading()
90.0
>>> turtle.degrees(400.0) # angle measurement in gon
>>> turtle.heading()
100.0
>>> turtle.degrees(360)
>>> turtle.heading()
90.0
<
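Changing the angle unit rescales the reported heading by the ratio of the two
full-circle values. The helper ``convert_angle`` is a hypothetical name used
only to illustrate this arithmetic:

```python
import math

def convert_angle(value, from_fullcircle, to_fullcircle):
    """Express an angle given in one full-circle unit in another unit."""
    return value * to_fullcircle / float(from_fullcircle)

in_gon = convert_angle(90, 360, 400)              # degrees -> gon
in_radians = convert_angle(90, 360, 2 * math.pi)  # degrees -> radians
```

Here ``in_gon`` is 100.0, as in the example above, and ``in_radians`` is
pi/2 up to floating-point rounding.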
radians()~
Set the angle measurement units to radians. Equivalent to
``degrees(2*math.pi)``.
.. doctest:: >
>>> turtle.home()
>>> turtle.left(90)
>>> turtle.heading()
90.0
>>> turtle.radians()
>>> turtle.heading()
1.5707963267948966
<
.. doctest::
:hide:
>>> turtle.degrees(360)
Pen control
-----------
Drawing state
~~~~~~~~~~~~~
pendown()~
pd()
down()
Pull the pen down -- drawing when moving.
penup()~
pu()
up()
Pull the pen up -- no drawing when moving.
pensize(width=None)~
width(width=None)
:param width: a positive number
Set the line thickness to {width} or return it. If resizemode is set to
"auto" and turtleshape is a polygon, that polygon is drawn with the same line
thickness. If no argument is given, the current pensize is returned.
.. doctest:: >
>>> turtle.pensize()
1
>>> turtle.pensize(10) # from here on lines of width 10 are drawn
<
pen(pen=None, **pendict)~
:param pen: a dictionary with some or all of the below listed keys
:param pendict: one or more keyword-arguments with the below listed keys as keywords
Return or set the pen's attributes in a "pen-dictionary" with the following
key/value pairs:
* "shown": True/False
* "pendown": True/False
* "pencolor": color-string or color-tuple
* "fillcolor": color-string or color-tuple
* "pensize": positive number
* "speed": number in range 0..10
* "resizemode": "auto" or "user" or "noresize"
* "stretchfactor": (positive number, positive number)
* "outline": positive number
* "tilt": number
This dictionary can be used as argument for a subsequent call to pen
to restore the former pen-state. Moreover one or more of these attributes
can be provided as keyword-arguments. This can be used to set several pen
attributes in one statement.
.. doctest::
:options: +NORMALIZE_WHITESPACE
>>> turtle.pen(fillcolor="black", pencolor="red", pensize=10)
>>> sorted(turtle.pen().items())
[('fillcolor', 'black'), ('outline', 1), ('pencolor', 'red'),
('pendown', True), ('pensize', 10), ('resizemode', 'noresize'),
('shown', True), ('speed', 9), ('stretchfactor', (1, 1)), ('tilt', 0)]
>>> penstate=turtle.pen()
>>> turtle.color("yellow", "")
>>> turtle.penup()
>>> sorted(turtle.pen().items())
[('fillcolor', ''), ('outline', 1), ('pencolor', 'yellow'),
('pendown', False), ('pensize', 10), ('resizemode', 'noresize'),
('shown', True), ('speed', 9), ('stretchfactor', (1, 1)), ('tilt', 0)]
>>> turtle.pen(penstate, fillcolor="green")
>>> sorted(turtle.pen().items())
[('fillcolor', 'green'), ('outline', 1), ('pencolor', 'red'),
('pendown', True), ('pensize', 10), ('resizemode', 'noresize'),
('shown', True), ('speed', 9), ('stretchfactor', (1, 1)), ('tilt', 0)]
isdown()~
Return ``True`` if pen is down, ``False`` if it's up.
.. doctest:: >
>>> turtle.penup()
>>> turtle.isdown()
False
>>> turtle.pendown()
>>> turtle.isdown()
True
<
Color control
pencolor(*args)~
Return or set the pencolor.
Four input formats are allowed:
``pencolor()``
Return the current pencolor as color specification string or
as a tuple (see example). May be used as input to another
color/pencolor/fillcolor call.
``pencolor(colorstring)``
Set pencolor to {colorstring}, which is a Tk color specification string,
such as ``"red"``, ``"yellow"``, or ``"#33cc8c"``.
``pencolor((r, g, b))``
Set pencolor to the RGB color represented by the tuple of {r}, {g}, and
{b}. Each of {r}, {g}, and {b} must be in the range 0..colormode, where
colormode is either 1.0 or 255 (see colormode).
``pencolor(r, g, b)``
Set pencolor to the RGB color represented by {r}, {g}, and {b}. Each of
{r}, {g}, and {b} must be in the range 0..colormode.
If turtleshape is a polygon, the outline of that polygon is drawn with the
newly set pencolor.
.. doctest:: >
>>> colormode()
1.0
>>> turtle.pencolor()
'red'
>>> turtle.pencolor("brown")
>>> turtle.pencolor()
'brown'
>>> tup = (0.2, 0.8, 0.55)
>>> turtle.pencolor(tup)
>>> turtle.pencolor()
(0.2, 0.8, 0.5490196078431373)
>>> colormode(255)
>>> turtle.pencolor()
(51, 204, 140)
>>> turtle.pencolor('#32c18f')
>>> turtle.pencolor()
(50, 193, 143)
<
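The relation between color strings of the form "#rrggbb" and the tuples
returned above can be sketched as follows. ``hex_to_rgb`` and ``rgb_to_hex``
are hypothetical helpers for illustration only; they handle six-digit hex
strings and do not cover named Tk colors such as "red".

```python
def hex_to_rgb(colorstring, cmode=255):
    """Convert '#rrggbb' to an (r, g, b) tuple scaled to 0..cmode."""
    digits = colorstring.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected a color string of the form '#rrggbb'")
    channels = [int(digits[i:i + 2], 16) for i in (0, 2, 4)]
    if cmode == 255:
        return tuple(channels)
    return tuple(c * cmode / 255.0 for c in channels)

def rgb_to_hex(rgb, cmode=255):
    """Convert an (r, g, b) tuple in the range 0..cmode to '#rrggbb'."""
    scaled = [int(round(c * 255.0 / cmode)) for c in rgb]
    return "#%02x%02x%02x" % tuple(scaled)
```

For instance, ``hex_to_rgb('#32c18f')`` gives ``(50, 193, 143)``, and
``rgb_to_hex((0.2, 0.8, 0.55), 1.0)`` gives ``'#33cc8c'``, which is why the
third component in the example above comes back as 140/255 rather than 0.55.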
fillcolor(*args)~
Return or set the fillcolor.
Four input formats are allowed:
``fillcolor()``
Return the current fillcolor as color specification string, possibly
in tuple format (see example). May be used as input to another
color/pencolor/fillcolor call.
``fillcolor(colorstring)``
Set fillcolor to {colorstring}, which is a Tk color specification string,
such as ``"red"``, ``"yellow"``, or ``"#33cc8c"``.
``fillcolor((r, g, b))``
Set fillcolor to the RGB color represented by the tuple of {r}, {g}, and
{b}. Each of {r}, {g}, and {b} must be in the range 0..colormode, where
colormode is either 1.0 or 255 (see colormode).
``fillcolor(r, g, b)``
Set fillcolor to the RGB color represented by {r}, {g}, and {b}. Each of
{r}, {g}, and {b} must be in the range 0..colormode.
If turtleshape is a polygon, the interior of that polygon is drawn
with the newly set fillcolor.
.. doctest:: >
>>> turtle.fillcolor("violet")
>>> turtle.fillcolor()
'violet'
>>> col = turtle.pencolor()
>>> col
(50, 193, 143)
>>> turtle.fillcolor(col)
>>> turtle.fillcolor()
(50, 193, 143)
>>> turtle.fillcolor('#ffffff')
>>> turtle.fillcolor()
(255, 255, 255)
<
color(*args)~
Return or set pencolor and fillcolor.
Several input formats are allowed. They use 0 to 3 arguments as
follows:
``color()``
Return the current pencolor and the current fillcolor as a pair of color
specification strings or tuples as returned by pencolor and
fillcolor.
``color(colorstring)``, ``color((r,g,b))``, ``color(r,g,b)``
Inputs as in pencolor; sets both fillcolor and pencolor to the
given value.
``color(colorstring1, colorstring2)``, ``color((r1,g1,b1), (r2,g2,b2))``
Equivalent to ``pencolor(colorstring1)`` and ``fillcolor(colorstring2)``
and analogously if the other input format is used.
If turtleshape is a polygon, the outline and interior of that polygon are drawn
with the newly set colors.
.. doctest:: >
>>> turtle.color("red", "green")
>>> turtle.color()
('red', 'green')
>>> color("#285078", "#a0c8f0")
>>> color()
((40, 80, 120), (160, 200, 240))
<
See also: Screen method colormode.
Filling
~~~~~~~
.. doctest::
:hide:
>>> turtle.home()
fill(flag)~
:param flag: True/False (or 1/0 respectively)
Call ``fill(True)`` before drawing the shape you want to fill, and
``fill(False)`` when done. When used without argument: return fillstate
(``True`` if filling, ``False`` else).
.. doctest:: >
>>> turtle.fill(True)
>>> for _ in range(3):
...     turtle.forward(100)
...     turtle.left(120)
...
>>> turtle.fill(False)
<
begin_fill()~
Call just before drawing a shape to be filled. Equivalent to ``fill(True)``.
end_fill()~
Fill the shape drawn after the last call to begin_fill. Equivalent
to ``fill(False)``.
.. doctest:: >
>>> turtle.color("black", "red")
>>> turtle.begin_fill()
>>> turtle.circle(80)
>>> turtle.end_fill()
<
More drawing control
reset()~
Delete the turtle's drawings from the screen, re-center the turtle and set
variables to the default values.
.. doctest:: >
>>> turtle.goto(0,-22)
>>> turtle.left(100)
>>> turtle.position()
(0.00,-22.00)
>>> turtle.heading()
100.0
>>> turtle.reset()
>>> turtle.position()
(0.00,0.00)
>>> turtle.heading()
0.0
<
clear()~
Delete the turtle's drawings from the screen. Do not move turtle. State and
position of the turtle as well as drawings of other turtles are not affected.
write(arg, move=False, align="left", font=("Arial", 8, "normal"))~
:param arg: object to be written to the TurtleScreen
:param move: True/False
:param align: one of the strings "left", "center" or "right"
:param font: a triple (fontname, fontsize, fonttype)
Write text - the string representation of {arg} - at the current turtle
position according to {align} ("left", "center" or "right") and with the given
font. If {move} is True, the pen is moved to the bottom-right corner of the
text. By default, {move} is False.
>>> turtle.write("Home = ", True, align="center")
>>> turtle.write((0,0), True)
Turtle state
------------
Visibility
~~~~~~~~~~
hideturtle()~
ht()
Make the turtle invisible. It's a good idea to do this while you're in the
middle of doing some complex drawing, because hiding the turtle speeds up the
drawing observably.
.. doctest:: >
>>> turtle.hideturtle()
<
showturtle()~
st()
Make the turtle visible.
.. doctest:: >
>>> turtle.showturtle()
<
isvisible()~
Return True if the Turtle is shown, False if it's hidden.
>>> turtle.hideturtle()
>>> turtle.isvisible()
False
>>> turtle.showturtle()
>>> turtle.isvisible()
True
Appearance
~~~~~~~~~~
shape(name=None)~
:param name: a string which is a valid shapename
Set turtle shape to shape with given {name} or, if name is not given, return
name of current shape. Shape with {name} must exist in the TurtleScreen's
shape dictionary. Initially there are the following polygon shapes: "arrow",
"turtle", "circle", "square", "triangle", "classic". To learn about how to
deal with shapes see Screen method register_shape.
.. doctest:: >
>>> turtle.shape()
'classic'
>>> turtle.shape("turtle")
>>> turtle.shape()
'turtle'
<
resizemode(rmode=None)~
:param rmode: one of the strings "auto", "user", "noresize"
Set resizemode to one of the values: "auto", "user", "noresize". If {rmode}
is not given, return current resizemode. Different resizemodes have the
following effects:
- "auto": adapts the appearance of the turtle corresponding to the value of pensize.
- "user": adapts the appearance of the turtle according to the values of
stretchfactor and outlinewidth (outline), which are set by
shapesize.
- "noresize": no adaption of the turtle's appearance takes place.
resizemode("user") is called by shapesize when used with arguments.
.. doctest:: >
>>> turtle.resizemode()
'noresize'
>>> turtle.resizemode("auto")
>>> turtle.resizemode()
'auto'
<
shapesize(stretch_wid=None, stretch_len=None, outline=None)~
turtlesize(stretch_wid=None, stretch_len=None, outline=None)
:param stretch_wid: positive number
:param stretch_len: positive number
:param outline: positive number
Return or set the pen's attributes x/y-stretchfactors and/or outline. Set
resizemode to "user". If and only if resizemode is set to "user", the turtle
will be displayed stretched according to its stretchfactors: {stretch_wid} is
stretchfactor perpendicular to its orientation, {stretch_len} is
stretchfactor in direction of its orientation, {outline} determines the width
of the shape's outline.
.. doctest:: >
>>> turtle.shapesize()
(1, 1, 1)
>>> turtle.resizemode("user")
>>> turtle.shapesize(5, 5, 12)
>>> turtle.shapesize()
(5, 5, 12)
>>> turtle.shapesize(outline=8)
>>> turtle.shapesize()
(5, 5, 8)
<
tilt(angle)~
:param angle: a number
Rotate the turtleshape by {angle} from its current tilt-angle, but do {not}
change the turtle's heading (direction of movement).
.. doctest:: >
>>> turtle.reset()
>>> turtle.shape("circle")
>>> turtle.shapesize(5,2)
>>> turtle.tilt(30)
>>> turtle.fd(50)
>>> turtle.tilt(30)
>>> turtle.fd(50)
<
settiltangle(angle)~
:param angle: a number
Rotate the turtleshape to point in the direction specified by {angle},
regardless of its current tilt-angle. {Do not} change the turtle's heading
(direction of movement).
.. doctest:: >
>>> turtle.reset()
>>> turtle.shape("circle")
>>> turtle.shapesize(5,2)
>>> turtle.settiltangle(45)
>>> turtle.fd(50)
>>> turtle.settiltangle(-45)
>>> turtle.fd(50)
<
tiltangle()~
Return the current tilt-angle, i.e. the angle between the orientation of the
turtleshape and the heading of the turtle (its direction of movement).
.. doctest:: >
>>> turtle.reset()
>>> turtle.shape("circle")
>>> turtle.shapesize(5,2)
>>> turtle.tilt(45)
>>> turtle.tiltangle()
45.0
<
Using events
onclick(fun, btn=1, add=None)~
:param fun: a function with two arguments which will be called with the
coordinates of the clicked point on the canvas
:param btn: number of the mouse-button, defaults to 1 (left mouse button)
:param add: ``True`` or ``False`` -- if ``True``, a new binding will be
added, otherwise it will replace a former binding
Bind {fun} to mouse-click events on this turtle. If {fun} is ``None``,
existing bindings are removed. Example for the anonymous turtle, i.e. the
procedural way:
.. doctest:: >
>>> def turn(x, y):
...     left(180)
...
>>> onclick(turn) # Now clicking into the turtle will turn it.
>>> onclick(None) # event-binding will be removed
<
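The effect of the {add} argument and of passing ``None`` can be modelled with
a small handler list. ``ClickBindings`` is a hypothetical class written for
this illustration; it only mimics the binding rules described above and has
nothing to do with how Tk dispatches events.

```python
class ClickBindings(object):
    """Toy model of the documented binding rules (not turtle's own code)."""

    def __init__(self):
        self.handlers = []

    def bind(self, fun, add=None):
        if fun is None:
            self.handlers = []         # existing bindings are removed
        elif add:
            self.handlers.append(fun)  # a new binding is added
        else:
            self.handlers = [fun]      # the former binding is replaced

    def click(self, x, y):
        for fun in self.handlers:
            fun(x, y)

bindings = ClickBindings()
log = []
bindings.bind(lambda x, y: log.append(("first", x, y)))
bindings.bind(lambda x, y: log.append(("second", x, y)), add=True)
bindings.click(10, 20)  # both handlers run
bindings.bind(None)     # remove all bindings
bindings.click(30, 40)  # nothing runs
```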
onrelease(fun, btn=1, add=None)~
:param fun: a function with two arguments which will be called with the
coordinates of the clicked point on the canvas
:param btn: number of the mouse-button, defaults to 1 (left mouse button)
:param add: ``True`` or ``False`` -- if ``True``, a new binding will be
added, otherwise it will replace a former binding
Bind {fun} to mouse-button-release events on this turtle. If {fun} is
``None``, existing bindings are removed.
.. doctest:: >
>>> class MyTurtle(Turtle):
...     def glow(self,x,y):
...         self.fillcolor("red")
...     def unglow(self,x,y):
...         self.fillcolor("")
...
>>> turtle = MyTurtle()
>>> turtle.onclick(turtle.glow) # clicking on turtle turns fillcolor red,
>>> turtle.onrelease(turtle.unglow) # releasing turns it to transparent.
<
ondrag(fun, btn=1, add=None)~
:param fun: a function with two arguments which will be called with the
coordinates of the clicked point on the canvas
:param btn: number of the mouse-button, defaults to 1 (left mouse button)
:param add: ``True`` or ``False`` -- if ``True``, a new binding will be
added, otherwise it will replace a former binding
Bind {fun} to mouse-move events on this turtle. If {fun} is ``None``,
existing bindings are removed.
Remark: Every sequence of mouse-move-events on a turtle is preceded by a
mouse-click event on that turtle.
.. doctest:: >
>>> turtle.ondrag(turtle.goto)
<
Subsequently, clicking and dragging the Turtle will move it across
the screen thereby producing handdrawings (if pen is down).
Special Turtle methods
----------------------
begin_poly()~
Start recording the vertices of a polygon. Current turtle position is first
vertex of polygon.
end_poly()~
Stop recording the vertices of a polygon. Current turtle position is last
vertex of polygon. This will be connected with the first vertex.
get_poly()~
Return the last recorded polygon.
.. doctest:: >
>>> turtle.home()
>>> turtle.begin_poly()
>>> turtle.fd(100)
>>> turtle.left(20)
>>> turtle.fd(30)
>>> turtle.left(60)
>>> turtle.fd(50)
>>> turtle.end_poly()
>>> p = turtle.get_poly()
>>> register_shape("myFavouriteShape", p)
<
clone()~
Create and return a clone of the turtle with same position, heading and
turtle properties.
.. doctest:: >
>>> mick = Turtle()
>>> joe = mick.clone()
<
getturtle()~
getpen()
Return the Turtle object itself. Only reasonable use: as a function to
return the "anonymous turtle":
.. doctest:: >
>>> pet = getturtle()
>>> pet.fd(50)
>>> pet
<turtle.Turtle object at 0x...>
<
getscreen()~
Return the TurtleScreen object the turtle is drawing on.
TurtleScreen methods can then be called for that object.
.. doctest:: >
>>> ts = turtle.getscreen()
>>> ts
<turtle._Screen object at 0x...>
>>> ts.bgcolor("pink")
<
setundobuffer(size)~
:param size: an integer or ``None``
Set or disable undobuffer. If {size} is an integer an empty undobuffer of
given size is installed. {size} gives the maximum number of turtle actions
that can be undone by the undo method/function. If {size} is
``None``, the undobuffer is disabled.
.. doctest:: >
>>> turtle.setundobuffer(42)
<
undobufferentries()~
Return number of entries in the undobuffer.
.. doctest:: >
>>> while undobufferentries():
...     undo()
<
tracer(flag=None, delay=None)~
A replica of the corresponding TurtleScreen method.
2.6~
window_width()~
window_height()
Both are replicas of the corresponding TurtleScreen methods.
2.6~
Excursus about the use of compound shapes
-----------------------------------------
To use compound turtle shapes, which consist of several polygons of different
color, you must use the helper class Shape explicitly as described
below:
1. Create an empty Shape object of type "compound".
2. Add as many components to this object as desired, using the
addcomponent method.
For example:
.. doctest:: >
>>> s = Shape("compound")
>>> poly1 = ((0,0),(10,-5),(0,10),(-10,-5))
>>> s.addcomponent(poly1, "red", "blue")
>>> poly2 = ((0,0),(10,-5),(-10,-5))
>>> s.addcomponent(poly2, "blue", "red")
<
3. Now add the Shape to the Screen's shapelist and use it:
.. doctest:: >
>>> register_shape("myshape", s)
>>> shape("myshape")
<
.. note::
The Shape class is used internally by the register_shape
method in different ways. The application programmer has to deal with the
Shape class {only} when using compound shapes like shown above!
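To make the data involved more concrete, a compound shape can be thought of
as a list of (polygon, fill color, outline color) entries. The class
``CompoundShapeSketch`` below is a hypothetical stand-in, not the real Shape
class; in particular, falling back to the fill color when no outline is given
is an assumption made for this sketch.

```python
class CompoundShapeSketch(object):
    """Toy model of a compound shape: a list of colored polygons."""

    def __init__(self):
        self.components = []

    def addcomponent(self, poly, fill, outline=None):
        if outline is None:
            outline = fill  # assumption: outline defaults to the fill color
        self.components.append((poly, fill, outline))

s = CompoundShapeSketch()
poly1 = ((0, 0), (10, -5), (0, 10), (-10, -5))
s.addcomponent(poly1, "red", "blue")
poly2 = ((0, 0), (10, -5), (-10, -5))
s.addcomponent(poly2, "blue")
```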
Methods of TurtleScreen/Screen and corresponding functions
==========================================================
Most of the examples in this section refer to a TurtleScreen instance called
``screen``.
.. doctest::
:hide:
>>> screen = Screen()
Window control
--------------
bgcolor(*args)~
:param args: a color string or three numbers in the range 0..colormode or a
3-tuple of such numbers
Set or return background color of the TurtleScreen.
.. doctest:: >
>>> screen.bgcolor("orange")
>>> screen.bgcolor()
'orange'
>>> screen.bgcolor("#800080")
>>> screen.bgcolor()
(128, 0, 128)
<
bgpic(picname=None)~
:param picname: a string, name of a gif-file or ``"nopic"``, or ``None``
Set background image or return name of current backgroundimage. If {picname}
is a filename, set the corresponding image as background. If {picname} is
``"nopic"``, delete background image, if present. If {picname} is ``None``,
return the filename of the current backgroundimage. :: >
>>> screen.bgpic()
'nopic'
>>> screen.bgpic("landscape.gif")
>>> screen.bgpic()
"landscape.gif"
<
clear()~
clearscreen()
Delete all drawings and all turtles from the TurtleScreen. Reset the now
empty TurtleScreen to its initial state: white background, no background
image, no event bindings and tracing on.
.. note::
This TurtleScreen method is available as a global function only under the
name ``clearscreen``. The global function ``clear`` is another one
derived from the Turtle method ``clear``.
reset()~
resetscreen()
Reset all Turtles on the Screen to their initial state.
.. note::
This TurtleScreen method is available as a global function only under the
name ``resetscreen``. The global function ``reset`` is another one
derived from the Turtle method ``reset``.
screensize(canvwidth=None, canvheight=None, bg=None)~
:param canvwidth: positive integer, new width of canvas in pixels
:param canvheight: positive integer, new height of canvas in pixels
:param bg: colorstring or color-tuple, new background color
If no arguments are given, return current (canvaswidth, canvasheight). Else
resize the canvas the turtles are drawing on. Do not alter the drawing
window. To observe hidden parts of the canvas, use the scrollbars. With this
method, one can make visible those parts of a drawing which were outside the
canvas before.
>>> screen.screensize()
(400, 300)
>>> screen.screensize(2000,1500)
>>> screen.screensize()
(2000, 1500)
e.g. to search for an erroneously escaped turtle ;-)
setworldcoordinates(llx, lly, urx, ury)~
:param llx: a number, x-coordinate of lower left corner of canvas
:param lly: a number, y-coordinate of lower left corner of canvas
:param urx: a number, x-coordinate of upper right corner of canvas
:param ury: a number, y-coordinate of upper right corner of canvas
Set up user-defined coordinate system and switch to mode "world" if
necessary. This performs a ``screen.reset()``. If mode "world" is already
active, all drawings are redrawn according to the new coordinates.
ATTENTION: in user-defined coordinate systems angles may appear
distorted.
.. doctest:: >
>>> screen.reset()
>>> screen.setworldcoordinates(-50,-7.5,50,7.5)
>>> for _ in range(72):
...     left(10)
...
>>> for _ in range(8):
...     left(45); fd(2)   # a regular octagon
<
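Why angles may appear distorted can be seen from the scale factors of a
linear world-to-canvas mapping. The sketch below assumes such a linear
mapping and a canvas of 400 x 300 pixels; ``world_scales`` and
``apparent_angle`` are hypothetical helpers, not turtle functions.

```python
import math

def world_scales(llx, lly, urx, ury, canvwidth, canvheight):
    """Pixels per world unit in x and y for a linear mapping."""
    xscale = canvwidth / float(urx - llx)
    yscale = canvheight / float(ury - lly)
    return xscale, yscale

def apparent_angle(world_angle, xscale, yscale):
    """On-screen angle (degrees) of a direction given in world coordinates."""
    rad = math.radians(world_angle)
    return math.degrees(math.atan2(math.sin(rad) * yscale,
                                   math.cos(rad) * xscale))

xscale, yscale = world_scales(-50, -7.5, 50, 7.5, 400, 300)
shown = apparent_angle(45, xscale, yscale)
```

With the coordinates from the example above, one world unit covers 4 pixels
horizontally but 20 pixels vertically, so a 45 degree direction is drawn at
a much steeper angle on screen.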
.. doctest::
:hide:
>>> screen.reset()
>>> for t in turtles():
...     t.reset()
Animation control
-----------------
delay(delay=None)~
:param delay: positive integer
Set or return the drawing {delay} in milliseconds. (This is approximately
the time interval between two consecutive canvas updates.) The longer the
drawing delay, the slower the animation.
If {delay} is not given, return the current drawing delay.
.. doctest:: >
>>> screen.delay()
10
>>> screen.delay(5)
>>> screen.delay()
5
<
tracer(n=None, delay=None)~
:param n: nonnegative integer
:param delay: nonnegative integer
Turn turtle animation on/off and set delay for update drawings. If {n} is
given, only each n-th regular screen update is really performed. (Can be
used to accelerate the drawing of complex graphics.) Second argument sets
delay value (see delay).
.. doctest:: >
>>> screen.tracer(8, 25)
>>> dist = 2
>>> for i in range(200):
...     fd(dist)
...     rt(90)
...     dist += 2
<
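The "only each n-th regular screen update" rule can be pictured with a simple
counter. ``UpdateCounter`` is a hypothetical class for illustration; treating
n = 0 as "never update automatically" is an assumption based on the remark
that update is to be used when tracer is turned off.

```python
class UpdateCounter(object):
    """Toy model: perform only every n-th requested screen update."""

    def __init__(self, n):
        self.n = n
        self.requests = 0
        self.performed = 0

    def request_update(self):
        self.requests += 1
        if self.n > 0 and self.requests % self.n == 0:
            self.performed += 1

throttled = UpdateCounter(8)
for _ in range(200):
    throttled.request_update()
```

After 200 requested updates with n = 8, only 25 are actually performed,
which is where the speed-up for complex drawings comes from.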
update()~
Perform a TurtleScreen update. To be used when tracer is turned off.
See also the RawTurtle/Turtle method speed.
Using screen events
-------------------
listen(xdummy=None, ydummy=None)~
Set focus on TurtleScreen (in order to collect key-events). Dummy arguments
are provided in order to be able to pass listen to the onclick method.
onkey(fun, key)~
:param fun: a function with no arguments or ``None``
:param key: a string: key (e.g. "a") or key-symbol (e.g. "space")
Bind {fun} to key-release event of key. If {fun} is ``None``, event bindings
are removed. Remark: in order to be able to register key-events, TurtleScreen
must have the focus. (See method listen.)
.. doctest:: >
>>> def f():
...     fd(50)
...     lt(60)
...
>>> screen.onkey(f, "Up")
>>> screen.listen()
<
onclick(fun, btn=1, add=None)~
onscreenclick(fun, btn=1, add=None)
:param fun: a function with two arguments which will be called with the
coordinates of the clicked point on the canvas
:param btn: number of the mouse-button, defaults to 1 (left mouse button)
:param add: ``True`` or ``False`` -- if ``True``, a new binding will be
added, otherwise it will replace a former binding
Bind {fun} to mouse-click events on this screen. If {fun} is ``None``,
existing bindings are removed.
Example for a TurtleScreen instance named ``screen`` and a Turtle instance
named turtle:
.. doctest:: >
>>> screen.onclick(turtle.goto) # Subsequently clicking into the TurtleScreen will
>>> # make the turtle move to the clicked point.
>>> screen.onclick(None) # remove event binding again
<
.. note::
This TurtleScreen method is available as a global function only under the
name ``onscreenclick``. The global function ``onclick`` is another one
derived from the Turtle method ``onclick``.
ontimer(fun, t=0)~
:param fun: a function with no arguments
:param t: a number >= 0
Install a timer that calls {fun} after {t} milliseconds.
.. doctest:: >
>>> running = True
>>> def f():
...     if running:
...         fd(50)
...         lt(60)
...         screen.ontimer(f, 250)
>>> f() ### makes the turtle march around
>>> running = False
<
Settings and special methods
mode(mode=None)~
:param mode: one of the strings "standard", "logo" or "world"
Set turtle mode ("standard", "logo" or "world") and perform reset. If mode
is not given, current mode is returned.
Mode "standard" is compatible with old turtle (|py2stdlib-turtle|). Mode "logo" is
compatible with most Logo turtle graphics. Mode "world" uses user-defined
"world coordinates". Attention: in this mode angles appear distorted if
``x/y`` unit-ratio doesn't equal 1.
============ ========================= ===================
Mode         Initial turtle heading    Positive angles
============ ========================= ===================
"standard"   to the right (east)       counterclockwise
"logo"       upward (north)            clockwise
============ ========================= ===================
.. doctest:: >
>>> mode("logo") # resets turtle heading to north
>>> mode()
'logo'
<
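The table above implies a simple relation between headings in the two modes:
the same direction measured from north and clockwise instead of from east and
counterclockwise. The helper ``swap_mode_heading`` is a hypothetical name
used only for this arithmetic sketch (angles in degrees).

```python
def swap_mode_heading(heading):
    """Convert a heading between "standard" and "logo" mode (degrees).

    The same formula works in both directions.
    """
    return (90.0 - heading) % 360.0

east_in_logo = swap_mode_heading(0.0)    # "standard" 0 is east
north_in_logo = swap_mode_heading(90.0)  # "standard" 90 is north
```

East, which is heading 0 in "standard" mode, becomes heading 90 in "logo"
mode; north becomes heading 0.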
colormode(cmode=None)~
:param cmode: one of the values 1.0 or 255
Return the colormode or set it to 1.0 or 255. Subsequently {r}, {g}, {b}
values of color triples have to be in the range 0..{cmode}.
.. doctest:: >
>>> screen.colormode(1)
>>> turtle.pencolor(240, 160, 80)
Traceback (most recent call last):
...
TurtleGraphicsError: bad color sequence: (240, 160, 80)
>>> screen.colormode()
1.0
>>> screen.colormode(255)
>>> screen.colormode()
255
>>> turtle.pencolor(240,160,80)
<
getcanvas()~
Return the Canvas of this TurtleScreen. Useful for insiders who know what to
do with a Tkinter Canvas.
.. doctest:: >
>>> cv = screen.getcanvas()
>>> cv
<turtle.ScrolledCanvas instance at 0x...>
<
getshapes()~
Return a list of names of all currently available turtle shapes.
.. doctest:: >
>>> screen.getshapes()
['arrow', 'blank', 'circle', ..., 'turtle']
<
register_shape(name, shape=None)~
addshape(name, shape=None)
There are three different ways to call this function:
(1) {name} is the name of a gif-file and {shape} is ``None``: Install the
corresponding image shape. :: >
>>> screen.register_shape("turtle.gif")
<
.. note::
Image shapes {do not} rotate when turning the turtle, so they do not
display the heading of the turtle!
(2) {name} is an arbitrary string and {shape} is a tuple of pairs of
coordinates: Install the corresponding polygon shape.
.. doctest:: >
>>> screen.register_shape("triangle", ((5,-3), (0,5), (-5,-3)))
<
(3) {name} is an arbitrary string and {shape} is a (compound) Shape
object: Install the corresponding compound shape.
Add a turtle shape to TurtleScreen's shapelist. Only shapes registered this
way can be used by issuing the command ``shape(shapename)``.
turtles()~
Return the list of turtles on the screen.
.. doctest:: >
>>> for turtle in screen.turtles():
...     turtle.color("red")
<
window_height()~
Return the height of the turtle window. :: >
>>> screen.window_height()
480
<
window_width()~
Return the width of the turtle window. :: >
>>> screen.window_width()
640
<
Methods specific to Screen, not inherited from TurtleScreen
bye()~
Shut the turtlegraphics window.
exitonclick()~
Bind bye() method to mouse clicks on the Screen.
If the value "using_IDLE" in the configuration dictionary is ``False``
(default value), also enter mainloop. Remark: If IDLE with the ``-n`` switch
(no subprocess) is used, this value should be set to ``True`` in
turtle.cfg. In this case IDLE's own mainloop is active also for the
client script.
setup(width=_CFG["width"], height=_CFG["height"], startx=_CFG["leftright"], starty=_CFG["topbottom"])~
Set the size and position of the main window. Default values of arguments
are stored in the configuration dictionary and can be changed via a
turtle.cfg file.
:param width: if an integer, a size in pixels, if a float, a fraction of the
screen; default is 50% of screen
:param height: if an integer, the height in pixels, if a float, a fraction of
the screen; default is 75% of screen
:param startx: if positive, starting position in pixels from the left
edge of the screen, if negative from the right edge, if None,
center window horizontally
:param starty: if positive, starting position in pixels from the top
edge of the screen, if negative from the bottom edge, if None,
center window vertically
.. doctest:: >
>>> screen.setup(width=200, height=200, startx=0, starty=0)
>>> # sets window to 200x200 pixels, in upper left of screen
>>> screen.setup(width=.75, height=0.5, startx=None, starty=None)
>>> # sets window to 75% of screen by 50% of screen and centers
<
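How the integer, float and ``None`` arguments could be turned into pixel
values is sketched below. ``resolve_window`` is a hypothetical helper, the
screen size is passed in explicitly, and negative start positions (measured
from the right or bottom edge) are passed through unchanged rather than
resolved.

```python
def resolve_window(width, height, startx, starty, screen_w, screen_h):
    """Return (width, height, startx, starty) in pixels."""
    if isinstance(width, float):
        width = int(screen_w * width)    # float: fraction of the screen
    if isinstance(height, float):
        height = int(screen_h * height)
    if startx is None:
        startx = (screen_w - width) // 2   # center horizontally
    if starty is None:
        starty = (screen_h - height) // 2  # center vertically
    return width, height, startx, starty

# Assumed screen of 1600 x 1000 pixels, arguments as in the example above.
geometry = resolve_window(.75, 0.5, None, None, 1600, 1000)
```

On the assumed 1600 x 1000 screen this yields a 1200 x 500 window whose
upper left corner is at (200, 250).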
title(titlestring)~
:param titlestring: a string that is shown in the titlebar of the turtle
graphics window
Set title of turtle window to {titlestring}.
.. doctest:: >
>>> screen.title("Welcome to the turtle zoo!")
<
The public classes of the module turtle (|py2stdlib-turtle|)
RawTurtle(canvas)~
RawPen(canvas)
:param canvas: a Tkinter.Canvas, a ScrolledCanvas or a
TurtleScreen
Create a turtle. The turtle has all methods described above as "methods of
Turtle/RawTurtle".
Turtle()~
Subclass of RawTurtle, has the same interface but draws on a default
Screen object created automatically when needed for the first time.
TurtleScreen(cv)~
:param cv: a Tkinter.Canvas
Provides screen oriented methods like bgcolor etc. that are described
above.
Screen()~
Subclass of TurtleScreen, with four methods added (bye, exitonclick, setup
and title, described above under the methods specific to Screen).
ScrolledCanvas(master)~
:param master: some Tkinter widget to contain the ScrolledCanvas, i.e.
a Tkinter-canvas with scrollbars added
Used by class Screen, which thus automatically provides a ScrolledCanvas as
playground for the turtles.
Shape(type_, data)~
:param type_: one of the strings "polygon", "image", "compound"
Data structure modeling shapes. The pair ``(type_, data)`` must follow this
specification:
=========== ===========
{type_}     {data}
=========== ===========
"polygon"   a polygon-tuple, i.e. a tuple of pairs of coordinates
"image"     an image (in this form only used internally!)
"compound"  ``None`` (a compound shape has to be constructed using the
            addcomponent method)
=========== ===========
addcomponent(poly, fill, outline=None)~
:param poly: a polygon, i.e. a tuple of pairs of numbers
:param fill: a color the {poly} will be filled with
:param outline: a color for the poly's outline (if given)
Example:
.. doctest:: >
>>> poly = ((0,0),(10,-5),(0,10),(-10,-5))
>>> s = Shape("compound")
>>> s.addcomponent(poly, "red", "blue")
>>> # ... add more components and then use register_shape()
<
See the excursus about the use of compound shapes above.
Vec2D(x, y)~
A two-dimensional vector class, used as a helper class for implementing
turtle graphics. May be useful for turtle graphics programs too. Derived
from tuple, so a vector is a tuple!
Provides (for {a}, {b} vectors, {k} number):
* ``a + b`` vector addition
* ``a - b`` vector subtraction
* ``a * b`` inner product
* ``k * a`` and ``a * k`` multiplication with scalar
* ``abs(a)`` absolute value of a
* ``a.rotate(angle)`` rotation
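The listed operations can be illustrated with a small tuple subclass. ``Vec``
below is a hypothetical stand-in written for this sketch, not the module's
Vec2D class; it assumes that ``rotate`` takes degrees and turns
counterclockwise.

```python
import math

class Vec(tuple):
    """Minimal two-dimensional vector derived from tuple."""

    def __new__(cls, x, y):
        return tuple.__new__(cls, (x, y))

    def __add__(self, other):
        return Vec(self[0] + other[0], self[1] + other[1])

    def __sub__(self, other):
        return Vec(self[0] - other[0], self[1] - other[1])

    def __mul__(self, other):
        if isinstance(other, Vec):
            return self[0] * other[0] + self[1] * other[1]  # inner product
        return Vec(self[0] * other, self[1] * other)        # a * k

    def __rmul__(self, k):
        return Vec(self[0] * k, self[1] * k)                # k * a

    def __abs__(self):
        return math.hypot(self[0], self[1])

    def rotate(self, angle):
        rad = math.radians(angle)
        c, s = math.cos(rad), math.sin(rad)
        return Vec(self[0] * c - self[1] * s, self[0] * s + self[1] * c)

a, b = Vec(1, 2), Vec(3, 4)
```

Because ``Vec`` derives from tuple, results compare equal to plain tuples:
``a + b`` equals ``(4, 6)``, ``a * b`` is 11 and ``abs(b)`` is 5.0.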
Help and configuration
======================
How to use help
---------------
The public methods of the Screen and Turtle classes are documented extensively
via docstrings. So these can be used as online-help via the Python help
facilities:
- When using IDLE, tooltips show the signatures and first lines of the
docstrings of typed in function-/method calls.
- Calling help on methods or functions displays the docstrings:: >
>>> help(Screen.bgcolor)
Help on method bgcolor in module turtle:
bgcolor(self, *args) unbound turtle.Screen method
    Set or return backgroundcolor of the TurtleScreen.
    Arguments (if given): a color string or three numbers
    in the range 0..colormode or a 3-tuple of such numbers.
      >>> screen.bgcolor("orange")
      >>> screen.bgcolor()
      "orange"
      >>> screen.bgcolor(0.5,0,0.5)
      >>> screen.bgcolor()
      "#800080"
>>> help(Turtle.penup)
Help on method penup in module turtle:
penup(self) unbound turtle.Turtle method
    Pull the pen up -- no drawing when moving.
    Aliases: penup | pu | up
    No argument
    >>> turtle.penup()
<
- The docstrings of the functions which are derived from methods have a modified
form:: >
>>> help(bgcolor)
Help on function bgcolor in module turtle:
bgcolor(*args)
    Set or return backgroundcolor of the TurtleScreen.
    Arguments (if given): a color string or three numbers
    in the range 0..colormode or a 3-tuple of such numbers.
    Example::
      >>> bgcolor("orange")
      >>> bgcolor()
      "orange"
      >>> bgcolor(0.5,0,0.5)
      >>> bgcolor()
      "#800080"
>>> help(penup)
Help on function penup in module turtle:
penup()
    Pull the pen up -- no drawing when moving.
    Aliases: penup | pu | up
    No argument
    Example:
    >>> penup()
<
These modified docstrings are created automatically together with the function
definitions that are derived from the methods at import time.
Translation of docstrings into different languages
--------------------------------------------------
There is a utility to create a dictionary whose keys are the method names and
whose values are the docstrings of the public methods of the classes Screen
and Turtle.
write_docstringdict(filename="turtle_docstringdict")~
:param filename: a string, used as filename
Create and write docstring-dictionary to a Python script with the given
filename. This function has to be called explicitly (it is not used by the
turtle graphics classes). The docstring dictionary will be written to the
Python script {filename}.py. It is intended to serve as a template
for translation of the docstrings into different languages.
If you (or your students) want to use turtle (|py2stdlib-turtle|) with online help in your
native language, you have to translate the docstrings and save the resulting
file as e.g. turtle_docstringdict_german.py.
If you have an appropriate entry in your turtle.cfg file this dictionary
will be read in at import time and will replace the original English docstrings.
At the time of this writing there are docstring dictionaries in German and in
Italian. (Requests please to glingl@aon.at.)
How to configure Screen and Turtles
-----------------------------------
The built-in default configuration mimics the appearance and behaviour of the
old turtle module in order to retain best possible compatibility with it.
If you want to use a different configuration which better reflects the features
of this module or which better fits your needs, e.g. for use in a classroom,
you can prepare a configuration file ``turtle.cfg`` which will be read at import
time and modify the configuration according to its settings.
The built-in configuration would correspond to the following turtle.cfg:: >
width = 0.5
height = 0.75
leftright = None
topbottom = None
canvwidth = 400
canvheight = 300
mode = standard
colormode = 1.0
delay = 10
undobuffersize = 1000
shape = classic
pencolor = black
fillcolor = black
resizemode = noresize
visible = True
language = english
exampleturtle = turtle
examplescreen = screen
title = Python Turtle Graphics
using_IDLE = False
<
Short explanation of selected entries:
- The first four lines correspond to the arguments of the Screen.setup
method.
- Lines 5 and 6 correspond to the arguments of the method
Screen.screensize.
- {shape} can be any of the built-in shapes, e.g: arrow, turtle, etc. For more
info try ``help(shape)``.
- If you want to use no fillcolor (i.e. make the turtle transparent), you have
to write ``fillcolor = ""`` (but all nonempty strings must not have quotes in
the cfg-file).
- If you want the turtle to reflect its state, you have to use ``resizemode =
auto``.
- If you set e.g. ``language = italian`` the docstringdict
turtle_docstringdict_italian.py will be loaded at import time (if
present on the import path, e.g. in the same directory as turtle (|py2stdlib-turtle|)).
- The entries {exampleturtle} and {examplescreen} define the names of these
objects as they occur in the docstrings. The transformation of
method-docstrings to function-docstrings will delete these names from the
docstrings.
- {using_IDLE}: Set this to ``True`` if you regularly work with IDLE and its -n
switch ("no subprocess"). This will prevent exitonclick from entering the
mainloop.
There can be a turtle.cfg file in the directory where turtle (|py2stdlib-turtle|) is
stored and an additional one in the current working directory. The latter will
override the settings of the first one.
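The override rule for the two files can be pictured as a dictionary update.
The sketch below is not the parser used by the module: ``parse_cfg`` is a
hypothetical helper, the file contents are made up for illustration, all values
are kept as raw strings, and skipping blank lines and ``#`` lines is an
assumption of this sketch.

```python
def parse_cfg(text):
    """Parse 'key = value' lines into a dict of raw strings (illustrative only)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # assumption of this sketch: ignore blank lines and '#' lines
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical contents of the two files.
library_cfg = "shape = classic\ndelay = 10\nlanguage = english\n"
cwd_cfg = "shape = turtle\nlanguage = italian\n"

config = parse_cfg(library_cfg)
config.update(parse_cfg(cwd_cfg))   # the file in the working directory wins
```

Entries that appear only in the first file (here ``delay``) keep their value.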
The Demo/turtle directory contains a turtle.cfg file. You can
study it as an example and see its effects when running the demos (preferably
not from within the demo-viewer).
Demo scripts
============
There is a set of demo scripts in the turtledemo directory located in the
Demo/turtle directory in the source distribution.
It contains:
- a set of 15 demo scripts demonstrating different features of the new module
turtle (|py2stdlib-turtle|)
- a demo viewer turtleDemo.py which can be used to view the sourcecode
of the scripts and run them at the same time. 14 of the examples can be
accessed via the Examples menu; all of them can also be run standalone.
- The example turtledemo_two_canvases.py demonstrates the simultaneous
use of two canvases with the turtle module. Therefore it only can be run
standalone.
- There is a turtle.cfg file in this directory, which also serves as an
example for how to write and use such files.
The demoscripts are:
+----------------+------------------------------+-----------------------+
| Name           | Description                  | Features              |
+----------------+------------------------------+-----------------------+
| bytedesign     | complex classical            | tracer, delay,        |
|                | turtlegraphics pattern       | update                |
+----------------+------------------------------+-----------------------+
| chaos          | graphs Verhulst dynamics,    | world coordinates     |
|                | proves that you must not     |                       |
|                | trust computers' computations|                       |
+----------------+------------------------------+-----------------------+
| clock          | analog clock showing time    | turtles as clock's    |
|                | of your computer             | hands, ontimer        |
+----------------+------------------------------+-----------------------+
| colormixer     | experiment with r, g, b      | ondrag                |
+----------------+------------------------------+-----------------------+
| fractalcurves  | Hilbert & Koch curves        | recursion             |
+----------------+------------------------------+-----------------------+
| lindenmayer    | ethnomathematics             | L-System              |
|                | (Indian kolams)              |                       |
+----------------+------------------------------+-----------------------+
| minimal_hanoi  | Towers of Hanoi              | Rectangular Turtles   |
|                |                              | as Hanoi discs        |
|                |                              | (shape, shapesize)    |
+----------------+------------------------------+-----------------------+
| paint          | super minimalistic           | onclick               |
|                | drawing program              |                       |
+----------------+------------------------------+-----------------------+
| peace          | elementary                   | turtle: appearance    |
|                |                              | and animation         |
+----------------+------------------------------+-----------------------+
| penrose        | aperiodic tiling with        | stamp                 |
|                | kites and darts              |                       |
+----------------+------------------------------+-----------------------+
| planet_and_moon| simulation of                | compound shapes,      |
|                | gravitational system         | Vec2D                 |
+----------------+------------------------------+-----------------------+
| tree           | a (graphical) breadth        | clone                 |
|                | first tree (using generators)|                       |
+----------------+------------------------------+-----------------------+
| wikipedia      | a pattern from the Wikipedia | clone,                |
|                | article on turtle graphics   | undo                  |
+----------------+------------------------------+-----------------------+
| yingyang       | another elementary example   | circle                |
+----------------+------------------------------+-----------------------+
Have fun!
==============================================================================
*py2stdlib-types*
types~
:synopsis: Names for built-in types.
This module defines names for some object types that are used by the standard
Python interpreter, but not for the types defined by various extension modules.
Also, it does not include some of the types that arise during processing such as
the ``listiterator`` type. It is safe to use ``from types import *`` --- the
module does not export any names besides the ones listed here. New names
exported by future versions of this module will all end in ``Type``.
Typical use is for functions that do different things depending on their
argument types, like the following:: >
    from types import *

    def delete(mylist, item):
        if type(item) is IntType:
            del mylist[item]
        else:
            mylist.remove(item)
<
Starting in Python 2.2, built-in factory functions such as int and
str are also names for the corresponding types. This is now the
preferred way to access the type instead of using the types (|py2stdlib-types|) module.
Accordingly, the example above should be written as follows:: >
    def delete(mylist, item):
        if isinstance(item, int):
            del mylist[item]
        else:
            mylist.remove(item)
<
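Some of the names listed below have no built-in alias, so the module remains
the usual place to get them. The following sketch (the helper ``describe`` and
the generator ``countdown`` are made up for illustration) dispatches on a few
such types with isinstance:

```python
import types


def describe(obj):
    """Return a short label for a few kinds of objects (illustrative helper)."""
    if isinstance(obj, types.FunctionType):
        return "function"
    if isinstance(obj, types.GeneratorType):
        return "generator"
    if isinstance(obj, types.ModuleType):
        return "module"
    if isinstance(obj, int):
        return "integer"
    return "other"


def countdown(n):
    # a generator function; calling it produces a generator-iterator
    while n > 0:
        yield n
        n -= 1


labels = [describe(describe), describe(lambda: None),
          describe(countdown(3)), describe(types), describe(7)]
```

Note that a lambda expression produces an ordinary function object, which is
why ``LambdaType`` and ``FunctionType`` name the same type.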
The module defines the following names:
NoneType~
The type of ``None``.
TypeType~
The type of type objects (such as returned by type); alias of the
built-in type.
BooleanType~
The type of the bool values ``True`` and ``False``; alias of the
built-in bool.
.. versionadded:: 2.3
IntType~
The type of integers (e.g. ``1``); alias of the built-in int.
LongType~
The type of long integers (e.g. ``1L``); alias of the built-in long.
FloatType~
The type of floating point numbers (e.g. ``1.0``); alias of the built-in
float.
ComplexType~
The type of complex numbers (e.g. ``1.0j``). This is not defined if Python was
built without complex number support.
StringType~
The type of character strings (e.g. ``'Spam'``); alias of the built-in
str.
UnicodeType~
The type of Unicode character strings (e.g. ``u'Spam'``). This is not defined
if Python was built without Unicode support. It's an alias of the built-in
unicode.
TupleType~
The type of tuples (e.g. ``(1, 2, 3, 'Spam')``); alias of the built-in
tuple.
ListType~
The type of lists (e.g. ``[0, 1, 2, 3]``); alias of the built-in
list.
DictType~
The type of dictionaries (e.g. ``{'Bacon': 1, 'Ham': 0}``); alias of the
built-in dict.
DictionaryType~
An alternate name for ``DictType``.
FunctionType~
LambdaType
The type of user-defined functions and functions created by lambda
expressions.
GeneratorType~
The type of generator-iterator objects, produced by calling a
generator function.
.. versionadded:: 2.2
CodeType~
The type for code objects such as returned by compile.
ClassType~
The type of user-defined old-style classes.
InstanceType~
The type of instances of user-defined classes.
MethodType~
The type of methods of user-defined class instances.
UnboundMethodType~
An alternate name for ``MethodType``.
BuiltinFunctionType~
BuiltinMethodType
The type of built-in functions like len or sys.exit, and
methods of built-in classes. (Here, the term "built-in" means "written in
C".)
ModuleType~
The type of modules.
FileType~
The type of open file objects such as ``sys.stdout``; alias of the built-in
file.
XRangeType~
The type of range objects returned by xrange; alias of the built-in
xrange.
SliceType~
The type of objects returned by slice; alias of the built-in
slice.
EllipsisType~
The type of ``Ellipsis``.
TracebackType~
The type of traceback objects such as found in ``sys.exc_traceback``.
FrameType~
The type of frame objects such as found in ``tb.tb_frame`` if ``tb`` is a
traceback object.
BufferType~
The type of buffer objects created by the buffer function.
DictProxyType~
The type of dict proxies, such as ``TypeType.__dict__``.
NotImplementedType~
The type of ``NotImplemented``.
GetSetDescriptorType~
The type of objects defined in extension modules with ``PyGetSetDef``, such
as ``FrameType.f_locals`` or ``array.array.typecode``. This type is used as
descriptor for object attributes; it has the same purpose as the
property type, but for classes defined in extension modules.
.. versionadded:: 2.5
MemberDescriptorType~
The type of objects defined in extension modules with ``PyMemberDef``, such
as ``datetime.timedelta.days``. This type is used as descriptor for simple C
data members which use standard conversion functions; it has the same purpose
as the property type, but for classes defined in extension modules.
.. impl-detail:: >
In other implementations of Python, this type may be identical to
``GetSetDescriptorType``.
<
.. versionadded:: 2.5
StringTypes~
A sequence containing ``StringType`` and ``UnicodeType`` used to facilitate
easier checking for any string object. Using this is more portable than using a
sequence of the two string types constructed elsewhere since it only contains
``UnicodeType`` if it has been built in the running version of Python. For
example: ``isinstance(s, types.StringTypes)``.
.. versionadded:: 2.2
==============================================================================
*py2stdlib-unicodedata*
unicodedata~
:synopsis: Access the Unicode Database.
This module provides access to the Unicode Character Database which defines
character properties for all Unicode characters. The data in this database is
based on the UnicodeData.txt file version 5.2.0 which is publicly
available from ftp://ftp.unicode.org/.
The module uses the same names and symbols as defined by the UnicodeData File
Format 5.2.0 (see http://www.unicode.org/reports/tr44/tr44-4.html).
It defines the following functions:
lookup(name)~
Look up character by name. If a character with the given name is found, return
the corresponding Unicode character. If not found, KeyError is raised.
name(unichr[, default])~
Returns the name assigned to the Unicode character {unichr} as a string. If no
name is defined, {default} is returned, or, if not given, ValueError is
raised.
decimal(unichr[, default])~
Returns the decimal value assigned to the Unicode character {unichr} as integer.
If no such value is defined, {default} is returned, or, if not given,
ValueError is raised.
digit(unichr[, default])~
Returns the digit value assigned to the Unicode character {unichr} as integer.
If no such value is defined, {default} is returned, or, if not given,
ValueError is raised.
numeric(unichr[, default])~
Returns the numeric value assigned to the Unicode character {unichr} as float.
If no such value is defined, {default} is returned, or, if not given,
ValueError is raised.
category(unichr)~
Returns the general category assigned to the Unicode character {unichr} as
string.
bidirectional(unichr)~
Returns the bidirectional category assigned to the Unicode character {unichr} as
string. If no such value is defined, an empty string is returned.
combining(unichr)~
Returns the canonical combining class assigned to the Unicode character {unichr}
as integer. Returns ``0`` if no combining class is defined.
east_asian_width(unichr)~
Returns the east asian width assigned to the Unicode character {unichr} as
string.
.. versionadded:: 2.4
mirrored(unichr)~
Returns the mirrored property assigned to the Unicode character {unichr} as
integer. Returns ``1`` if the character has been identified as a "mirrored"
character in bidirectional text, ``0`` otherwise.
decomposition(unichr)~
Returns the character decomposition mapping assigned to the Unicode character
{unichr} as string. An empty string is returned in case no such mapping is
defined.
normalize(form, unistr)~
Return the normal form {form} for the Unicode string {unistr}. Valid values for
{form} are 'NFC', 'NFKC', 'NFD', and 'NFKD'.
The Unicode standard defines various normalization forms of a Unicode string,
based on the definition of canonical equivalence and compatibility equivalence.
In Unicode, several characters can be expressed in various ways. For example, the
character U+00C7 (LATIN CAPITAL LETTER C WITH CEDILLA) can also be expressed as
the sequence U+0043 (LATIN CAPITAL LETTER C) U+0327 (COMBINING CEDILLA).
For each character, there are two normal forms: normal form C and normal form D.
Normal form D (NFD) is also known as canonical decomposition, and translates
each character into its decomposed form. Normal form C (NFC) first applies a
canonical decomposition, then composes pre-combined characters again.
In addition to these two forms, there are two additional normal forms based on
compatibility equivalence. In Unicode, certain characters are supported which
normally would be unified with other characters. For example, U+2160 (ROMAN
NUMERAL ONE) is really the same thing as U+0049 (LATIN CAPITAL LETTER I).
However, it is supported in Unicode for compatibility with existing character
sets (e.g. gb2312).
The normal form KD (NFKD) will apply the compatibility decomposition, i.e.
replace all compatibility characters with their equivalents. The normal form KC
(NFKC) first applies the compatibility decomposition, followed by the canonical
composition.
Even if two Unicode strings are normalized and look the same to
a human reader, they may not compare equal if one has combining
characters and the other doesn't.
.. versionadded:: 2.3
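The effect of the normal forms described above can be checked directly. This
short sketch uses the C WITH CEDILLA and ROMAN NUMERAL ONE characters from the
text; the variable names are chosen only for illustration.

```python
import unicodedata

composed = u'\u00c7'        # LATIN CAPITAL LETTER C WITH CEDILLA
decomposed = u'C\u0327'     # LATIN CAPITAL LETTER C + COMBINING CEDILLA

# The two spellings look alike but are different strings.
same_before = (composed == decomposed)

# NFD decomposes, NFC composes again.
nfd = unicodedata.normalize('NFD', composed)
nfc = unicodedata.normalize('NFC', decomposed)

# NFKD additionally replaces compatibility characters.
roman_one = unicodedata.normalize('NFKD', u'\u2160')
```

Normalizing both operands to the same form before comparing them is the usual
way to make such strings compare equal.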
In addition, the module exposes the following constants:
unidata_version~
The version of the Unicode database used in this module.
.. versionadded:: 2.3
ucd_3_2_0~
This is an object that has the same methods as the entire module, but uses the
Unicode database version 3.2 instead, for applications that require this
specific version of the Unicode database (such as IDNA).
.. versionadded:: 2.5
Examples: >
    >>> import unicodedata
    >>> unicodedata.lookup('LEFT CURLY BRACKET')
    u'{'
    >>> unicodedata.name(u'/')
    'SOLIDUS'
    >>> unicodedata.decimal(u'9')
    9
    >>> unicodedata.decimal(u'a')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: not a decimal
    >>> unicodedata.category(u'A')  # 'L'etter, 'u'ppercase
    'Lu'
    >>> unicodedata.bidirectional(u'\u0660') # 'A'rabic, 'N'umber
    'AN'
<
==============================================================================
*py2stdlib-unittest*
unittest~
:synopsis: Unit testing framework for Python.
.. versionadded:: 2.1
The Python unit testing framework, sometimes referred to as "PyUnit," is a
Python language version of JUnit, by Kent Beck and Erich Gamma. JUnit is, in
turn, a Java version of Kent's Smalltalk testing framework. Each is the de
facto standard unit testing framework for its respective language.
unittest (|py2stdlib-unittest|) supports test automation, sharing of setup and shutdown code for
tests, aggregation of tests into collections, and independence of the tests from
the reporting framework. The unittest (|py2stdlib-unittest|) module provides classes that make
it easy to support these qualities for a set of tests.
To achieve this, unittest (|py2stdlib-unittest|) supports some important concepts:
test fixture
A test fixture represents the preparation needed to perform one or more
tests, and any associated cleanup actions. This may involve, for example,
creating temporary or proxy databases, directories, or starting a server
process.
test case
A test case is the smallest unit of testing. It checks for a specific
response to a particular set of inputs. unittest (|py2stdlib-unittest|) provides a base class,
TestCase, which may be used to create new test cases.
test suite
A test suite is a collection of test cases, test suites, or both. It is
used to aggregate tests that should be executed together.
test runner
A test runner is a component which orchestrates the execution of tests
and provides the outcome to the user. The runner may use a graphical interface,
a textual interface, or return a special value to indicate the results of
executing the tests.
The test case and test fixture concepts are supported through the
TestCase and FunctionTestCase classes; the former should be
used when creating new tests, and the latter can be used when integrating
existing test code with a unittest (|py2stdlib-unittest|)-driven framework. When building test
fixtures using TestCase, the TestCase.setUp and
TestCase.tearDown methods can be overridden to provide initialization
and cleanup for the fixture. With FunctionTestCase, existing functions
can be passed to the constructor for these purposes. When the test is run, the
fixture initialization is run first; if it succeeds, the cleanup method is run
after the test has been executed, regardless of the outcome of the test. Each
instance of the TestCase will only be used to run a single test method,
so a new fixture is created for each test.
Test suites are implemented by the TestSuite class. This class allows
individual tests and test suites to be aggregated; when the suite is executed,
all tests added directly to the suite and in "child" test suites are run.
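As a small sketch of this aggregation (the test classes are made up for
illustration), a suite can contain both individual test cases and another
suite, and running the outer suite runs everything:

```python
import unittest


class ArithmeticTest(unittest.TestCase):
    def test_add(self):
        self.assertEqual(1 + 1, 2)

    def test_mul(self):
        self.assertEqual(2 * 3, 6)


class StringTest(unittest.TestCase):
    def test_upper(self):
        self.assertEqual('spam'.upper(), 'SPAM')


# A "child" suite holding two tests ...
inner = unittest.TestSuite([ArithmeticTest('test_add'),
                            ArithmeticTest('test_mul')])

# ... aggregated with one directly added test.
outer = unittest.TestSuite()
outer.addTest(inner)
outer.addTest(StringTest('test_upper'))

total = outer.countTestCases()   # counts tests in child suites as well

result = unittest.TestResult()
outer.run(result)
```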
A test runner is an object that provides a single method,
TestRunner.run, which accepts a TestCase or TestSuite
object as a parameter, and returns a result object. The class
TestResult is provided for use as the result object. unittest (|py2stdlib-unittest|)
provides the TextTestRunner as an example test runner which reports
test results on the standard error stream by default. Alternate runners can be
implemented for other environments (such as graphical environments) without any
need to derive from a specific class.
.. seealso::
Module doctest (|py2stdlib-doctest|)
Another test-support module with a very different flavor.
`unittest2: A backport of new unittest features for Python 2.4-2.6 <http://pypi.python.org/pypi/unittest2>`_
Many new features were added to unittest in Python 2.7, including test
discovery. unittest2 allows you to use these features with earlier
versions of Python.
`Simple Smalltalk Testing: With Patterns <http://www.XProgramming.com/testfram.htm>`_
Kent Beck's original paper on testing frameworks using the pattern shared
by unittest (|py2stdlib-unittest|).
`Nose <http://code.google.com/p/python-nose/>`_ and `py.test <http://pytest.org>`_
Third-party unittest frameworks with a lighter-weight syntax for writing
tests. For example, ``assert func(10) == 42``.
`The Python Testing Tools Taxonomy <http://pycheesecake.org/wiki/PythonTestingToolsTaxonomy>`_
An extensive list of Python testing tools including functional testing
frameworks and mock object libraries.
`Testing in Python Mailing List <http://lists.idyll.org/listinfo/testing-in-python>`_
A special-interest-group for discussion of testing, and testing tools,
in Python.
Basic example
-------------
The unittest (|py2stdlib-unittest|) module provides a rich set of tools for constructing and
running tests. This section demonstrates that a small subset of the tools
suffices to meet the needs of most users.
Here is a short script to test three functions from the random (|py2stdlib-random|) module:: >
    import random
    import unittest

    class TestSequenceFunctions(unittest.TestCase):

        def setUp(self):
            self.seq = range(10)

        def test_shuffle(self):
            # make sure the shuffled sequence does not lose any elements
            random.shuffle(self.seq)
            self.seq.sort()
            self.assertEqual(self.seq, range(10))

            # should raise an exception for an immutable sequence
            self.assertRaises(TypeError, random.shuffle, (1,2,3))

        def test_choice(self):
            element = random.choice(self.seq)
            self.assertTrue(element in self.seq)

        def test_sample(self):
            with self.assertRaises(ValueError):
                random.sample(self.seq, 20)
            for element in random.sample(self.seq, 5):
                self.assertTrue(element in self.seq)

    if __name__ == '__main__':
        unittest.main()
<
A testcase is created by subclassing unittest.TestCase. The three
individual tests are defined with methods whose names start with the letters
``test``. This naming convention informs the test runner about which methods
represent tests.
The crux of each test is a call to TestCase.assertEqual to check for an
expected result; TestCase.assertTrue to verify a condition; or
TestCase.assertRaises to verify that an expected exception gets raised.
These methods are used instead of the assert statement so the test
runner can accumulate all test results and produce a report.
When a TestCase.setUp method is defined, the test runner will run that
method prior to each test. Likewise, if a TestCase.tearDown method is
defined, the test runner will invoke that method after each test. In the
example, TestCase.setUp was used to create a fresh sequence for each
test.
The final block shows a simple way to run the tests. unittest.main
provides a command line interface to the test script. When run from the command
line, the above script produces an output that looks like this:: >
...
Ran 3 tests in 0.000s
OK
<
Instead of unittest.main, there are other ways to run the tests with a
finer level of control, less terse output, and no requirement to be run from the
command line. For example, the last two lines may be replaced with:: >
suite = unittest.TestLoader().loadTestsFromTestCase(TestSequenceFunctions)
unittest.TextTestRunner(verbosity=2).run(suite)
<
Running the revised script from the interpreter or another script produces the
following output:: >
test_choice (__main__.TestSequenceFunctions) ... ok
test_sample (__main__.TestSequenceFunctions) ... ok
test_shuffle (__main__.TestSequenceFunctions) ... ok
Ran 3 tests in 0.110s
OK
<
The above examples show the most commonly used unittest (|py2stdlib-unittest|) features which
are sufficient to meet many everyday testing needs. The remainder of the
documentation explores the full feature set from first principles.
Command Line Interface
----------------------
The unittest module can be used from the command line to run tests from
modules, classes or even individual test methods:: >
python -m unittest test_module1 test_module2
python -m unittest test_module.TestClass
python -m unittest test_module.TestClass.test_method
<
You can pass in a list with any combination of module names, and fully
qualified class or method names.
You can run tests with more detail (higher verbosity) by passing in the -v flag:: >
python -m unittest -v test_module
<
For a list of all the command line options:: >
    python -m unittest -h
<
.. versionchanged:: 2.7
In earlier versions it was only possible to run individual test methods and
not modules or classes.
failfast, catch and buffer command line options
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
unittest supports three command line options:
* -b / --buffer
The standard output and standard error streams are buffered during the test
run. Output during a passing test is discarded. Output is echoed normally
on test fail or error and is added to the failure messages.
* -c / --catch
Control-C during the test run waits for the current test to end and then
reports all the results so far. A second control-C raises the normal
KeyboardInterrupt exception.
See `Signal Handling`_ for the functions that provide this functionality.
* -f / --failfast
Stop the test run on the first error or failure.
.. versionadded:: 2.7
The command line options ``-c``, ``-b`` and ``-f`` were added.
The command line can also be used for test discovery, for running all of the
tests in a project or just a subset.
Test Discovery
--------------
.. versionadded:: 2.7
Unittest supports simple test discovery. For a project's tests to be
compatible with test discovery they must all be importable from the top level
directory of the project (in other words, they must all be in Python packages).
Test discovery is implemented in TestLoader.discover, but can also be
used from the command line. The basic command line usage is:: >
cd project_directory
python -m unittest discover
<
The ``discover`` sub-command has the following options:
    -v, --verbose    Verbose output
    -s directory     Directory to start discovery ('.' default)
    -p pattern       Pattern to match test files ('test*.py' default)
    -t directory     Top level directory of project (default to
                     start directory)
The -s, -p, and -t options can be passed in
as positional arguments in that order. The following two command lines
are equivalent:: >
python -m unittest discover -s project_directory -p '*_test.py'
python -m unittest discover project_directory '*_test.py'
<
Instead of a path, you can pass a package name, for example
``myproject.subpackage.test``, as the start directory. The package name you
supply will then be imported and its location on the filesystem will be used
as the start directory.
.. caution::
Test discovery loads tests by importing them. Once test discovery has
found all the test files from the start directory you specify it turns the
paths into package names to import. For example `foo/bar/baz.py` will be
imported as ``foo.bar.baz``.
If you have a package installed globally and attempt test discovery on
a different copy of the package then the import {could} happen from the
wrong place. If this happens test discovery will warn you and exit.
If you supply the start directory as a package name rather than a
path to a directory then discover assumes that whichever location it
imports from is the location you intended, so you will not get the
warning.
Test modules and packages can customize test loading and discovery through
the `load_tests protocol`_.
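Discovery can also be driven from code through TestLoader.discover. The
following sketch builds a throwaway project directory so that it is
self-contained; the directory and the module name ``test_discovery_sketch.py``
are made up for illustration.

```python
import os
import shutil
import tempfile
import unittest

project_dir = tempfile.mkdtemp()
try:
    # Hypothetical test module, written only so discovery has something to find.
    with open(os.path.join(project_dir, 'test_discovery_sketch.py'), 'w') as f:
        f.write(
            "import unittest\n"
            "class SketchTest(unittest.TestCase):\n"
            "    def test_truth(self):\n"
            "        self.assertTrue(True)\n"
        )

    # Equivalent in spirit to: python -m unittest discover -s project_dir
    suite = unittest.TestLoader().discover(project_dir, pattern='test*.py')
    found = suite.countTestCases()

    result = unittest.TestResult()
    suite.run(result)
finally:
    shutil.rmtree(project_dir)
```

Because the test module is imported during discovery, the start directory ends
up on ``sys.path`` for the rest of the process.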
Organizing test code
--------------------
The basic building blocks of unit testing are test cases --- single
scenarios that must be set up and checked for correctness. In unittest (|py2stdlib-unittest|),
test cases are represented by instances of unittest (|py2stdlib-unittest|)'s TestCase
class. To make your own test cases you must write subclasses of
TestCase, or use FunctionTestCase.
An instance of a TestCase-derived class is an object that can
completely run a single test method, together with optional set-up and tidy-up
code.
The testing code of a TestCase instance should be entirely self
contained, such that it can be run either in isolation or in arbitrary
combination with any number of other test cases.
The simplest TestCase subclass will simply override the
TestCase.runTest method in order to perform specific testing code:: >
    import unittest

    class DefaultWidgetSizeTestCase(unittest.TestCase):
        def runTest(self):
            widget = Widget('The widget')
            self.assertEqual(widget.size(), (50, 50), 'incorrect default size')
<
Note that in order to test something, we use one of the assert*
methods provided by the TestCase base class. If the test fails, an
exception will be raised, and unittest (|py2stdlib-unittest|) will identify the test case as a
failure. Any other exceptions will be treated as errors. This
helps you identify where the problem is: failures are caused by incorrect
results - a 5 where you expected a 6. Errors are caused by incorrect
code - e.g., a TypeError caused by an incorrect function call.
The way to run a test case will be described later. For now, note that to
construct an instance of such a test case, we call its constructor without
arguments:: >
testCase = DefaultWidgetSizeTestCase()
<
Now, such test cases can be numerous, and their set-up can be repetitive. In
the above case, constructing a Widget in each of 100 Widget test case
subclasses would mean unsightly duplication.
Luckily, we can factor out such set-up code by implementing a method called
TestCase.setUp, which the testing framework will automatically call for
us when we run the test:: >
    import unittest

    class SimpleWidgetTestCase(unittest.TestCase):
        def setUp(self):
            self.widget = Widget('The widget')

    class DefaultWidgetSizeTestCase(SimpleWidgetTestCase):
        def runTest(self):
            self.assertEqual(self.widget.size(), (50,50),
                             'incorrect default size')

    class WidgetResizeTestCase(SimpleWidgetTestCase):
        def runTest(self):
            self.widget.resize(100,150)
            self.assertEqual(self.widget.size(), (100,150),
                             'wrong size after resize')
<
If the TestCase.setUp method raises an exception while the test is
running, the framework will consider the test to have suffered an error, and the
TestCase.runTest method will not be executed.
Similarly, we can provide a TestCase.tearDown method that tidies up
after the TestCase.runTest method has been run:: >
    import unittest

    class SimpleWidgetTestCase(unittest.TestCase):
        def setUp(self):
            self.widget = Widget('The widget')

        def tearDown(self):
            self.widget.dispose()
            self.widget = None
<
If TestCase.setUp succeeded, the TestCase.tearDown method will
be run whether TestCase.runTest succeeded or not.
Such a working environment for the testing code is called a fixture.
Often, many small test cases will use the same fixture. In this case, we would
end up subclassing SimpleWidgetTestCase into many small one-method
classes such as DefaultWidgetSizeTestCase. This is time-consuming and
discouraging, so in the same vein as JUnit, unittest (|py2stdlib-unittest|) provides a simpler
mechanism:: >
import unittest
class WidgetTestCase(unittest.TestCase):
    def setUp(self):
        self.widget = Widget('The widget')
    def tearDown(self):
        self.widget.dispose()
        self.widget = None
    def test_default_size(self):
        self.assertEqual(self.widget.size(), (50, 50),
                         'incorrect default size')
    def test_resize(self):
        self.widget.resize(100, 150)
        self.assertEqual(self.widget.size(), (100, 150),
                         'wrong size after resize')
<
Here we have not provided a TestCase.runTest method, but have instead
provided two different test methods. Class instances will now each run one of
the test_\* methods, with ``self.widget`` created and destroyed
separately for each instance. When creating an instance we must specify the
test method it is to run. We do this by passing the method name in the
constructor:: >
defaultSizeTestCase = WidgetTestCase('test_default_size')
resizeTestCase = WidgetTestCase('test_resize')
<
Test case instances are grouped together according to the features they test.
unittest (|py2stdlib-unittest|) provides a mechanism for this: the test suite,
represented by unittest (|py2stdlib-unittest|)'s TestSuite class:: >
widgetTestSuite = unittest.TestSuite()
widgetTestSuite.addTest(WidgetTestCase('test_default_size'))
widgetTestSuite.addTest(WidgetTestCase('test_resize'))
<
For the ease of running tests, as we will see later, it is a good idea to
provide in each test module a callable object that returns a pre-built test
suite:: >
def suite():
    suite = unittest.TestSuite()
    suite.addTest(WidgetTestCase('test_default_size'))
    suite.addTest(WidgetTestCase('test_resize'))
    return suite
<
or even:: >
def suite():
    tests = ['test_default_size', 'test_resize']
    return unittest.TestSuite(map(WidgetTestCase, tests))
<
Since it is a common pattern to create a TestCase subclass with many
similarly named test functions, unittest (|py2stdlib-unittest|) provides a TestLoader
class that can be used to automate the process of creating a test suite and
populating it with individual tests. For example, :: >
suite = unittest.TestLoader().loadTestsFromTestCase(WidgetTestCase)
<
will create a test suite that will run ``WidgetTestCase.test_default_size()`` and
``WidgetTestCase.test_resize()``. TestLoader uses the ``'test'`` method
name prefix to identify test methods automatically.
Note that the order in which the various test cases will be run is determined by
sorting the test function names with the built-in cmp function.
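As a small sketch of the prefix rule and the sorted ordering, the following class defines its methods out of alphabetical order. OrderingExample and its method names are hypothetical.

```python
import unittest

class OrderingExample(unittest.TestCase):
    # Defined out of alphabetical order on purpose.
    def test_b(self): pass
    def test_c(self): pass
    def test_a(self): pass
    def helper(self): pass   # no 'test' prefix, so the loader ignores it

loader = unittest.TestLoader()
names = list(loader.getTestCaseNames(OrderingExample))
suite = loader.loadTestsFromTestCase(OrderingExample)
# Iterating over the suite yields the tests in the order they will run.
run_order = [test.id().split('.')[-1] for test in suite]
```

The loader picks up only the three ``test``-prefixed methods and orders them by name rather than by definition order.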
Often it is desirable to group suites of test cases together, so as to run tests
for the whole system at once. This is easy, since TestSuite instances
can be added to a TestSuite just as TestCase instances can be
added to a TestSuite:: >
suite1 = module1.TheTestSuite()
suite2 = module2.TheTestSuite()
alltests = unittest.TestSuite([suite1, suite2])
<
You can place the definitions of test cases and test suites in the same modules
as the code they are to test (such as widget.py), but there are several
advantages to placing the test code in a separate module, such as
test_widget.py:
* The test module can be run standalone from the command line.
* The test code can more easily be separated from shipped code.
* There is less temptation to change test code to fit the code it tests without
a good reason.
* Test code should be modified much less frequently than the code it tests.
* Tested code can be refactored more easily.
* Tests for modules written in C must be in separate modules anyway, so why not
be consistent?
* If the testing strategy changes, there is no need to change the source code.
Re-using old test code
----------------------
Some users will find that they have existing test code that they would like to
run from unittest (|py2stdlib-unittest|), without converting every old test function to a
TestCase subclass.
For this reason, unittest (|py2stdlib-unittest|) provides a FunctionTestCase class.
This subclass of TestCase can be used to wrap an existing test
function. Set-up and tear-down functions can also be provided.
Given the following test function:: >
def testSomething():
    something = makeSomething()
    assert something.name is not None
    # ...
<
one can create an equivalent test case instance as follows:: >
testcase = unittest.FunctionTestCase(testSomething)
<
If there are additional set-up and tear-down methods that should be called as
part of the test case's operation, they can also be provided like so:: >
testcase = unittest.FunctionTestCase(testSomething,
                                     setUp=makeSomethingDB,
                                     tearDown=deleteSomethingDB)
<
To make migrating existing test suites easier, unittest (|py2stdlib-unittest|) supports tests
raising AssertionError to indicate test failure. However, it is
recommended that you use the explicit TestCase.fail\* and
TestCase.assert\* methods instead, as future versions of unittest (|py2stdlib-unittest|)
may treat AssertionError differently.
.. note::
Even though FunctionTestCase can be used to quickly convert an
existing test base over to a unittest (|py2stdlib-unittest|)-based system, this approach is
not recommended. Taking the time to set up proper TestCase
subclasses will make future test refactorings much easier.
In some cases, the existing tests may have been written using the doctest (|py2stdlib-doctest|)
module. If so, doctest (|py2stdlib-doctest|) provides a DocTestSuite function that can
automatically build unittest.TestSuite instances from the existing
doctest (|py2stdlib-doctest|)-based tests.
Skipping tests and expected failures
------------------------------------
.. versionadded:: 2.7
Unittest supports skipping individual test methods and even whole classes of
tests. In addition, it supports marking a test as an "expected failure," a test
that is broken and will fail, but shouldn't be counted as a failure on a
TestResult.
Skipping a test is simply a matter of using the skip decorator
or one of its conditional variants.
Basic skipping looks like this: :: >
class MyTestCase(unittest.TestCase):
    @unittest.skip("demonstrating skipping")
    def test_nothing(self):
        self.fail("shouldn't happen")
    @unittest.skipIf(mylib.__version__ < (1, 3),
                     "not supported in this library version")
    def test_format(self):
        # Tests that work for only a certain version of the library.
        pass
    @unittest.skipUnless(sys.platform.startswith("win"), "requires Windows")
    def test_windows_support(self):
        # windows specific testing code
        pass
<
This is the output of running the example above in verbose mode: :: >
test_format (__main__.MyTestCase) ... skipped 'not supported in this library version'
test_nothing (__main__.MyTestCase) ... skipped 'demonstrating skipping'
test_windows_support (__main__.MyTestCase) ... skipped 'requires Windows'
Ran 3 tests in 0.005s
OK (skipped=3)
<
Classes can be skipped just like methods: :: >
@unittest.skip("showing class skipping")
class MySkippedTestCase(unittest.TestCase):
    def test_not_run(self):
        pass
<
TestCase.setUp can also skip the test. This is useful when a resource
that needs to be set up is not available.
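A minimal sketch of skipping from setUp, using the skipTest method, follows. The helper resource_available is a hypothetical check invented here; it always reports the resource as missing.

```python
import unittest

def resource_available():
    # Hypothetical availability check; always reports "missing" here.
    return False

class NeedsResourceExample(unittest.TestCase):
    def setUp(self):
        if not resource_available():
            self.skipTest('external resource not available')

    def test_uses_resource(self):
        self.fail('not reached: setUp skipped the test')

result = unittest.TestResult()
NeedsResourceExample('test_uses_resource').run(result)
# Each entry in result.skipped is a (test, reason) pair.
skipped_reasons = [reason for _, reason in result.skipped]
```

The test body never runs, and the reason passed to skipTest is recorded on the result.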
Expected failures use the expectedFailure decorator. :: >
class ExpectedFailureTestCase(unittest.TestCase):
    @unittest.expectedFailure
    def test_fail(self):
        self.assertEqual(1, 0, "broken")
<
It's easy to roll your own skipping decorators by making a decorator that calls
skip on the test when it wants it to be skipped. This decorator skips
the test unless the passed object has a certain attribute: :: >
def skipUnlessHasattr(obj, attr):
    if hasattr(obj, attr):
        return lambda func: func
    return unittest.skip("{0!r} doesn't have {1!r}".format(obj, attr))
<
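One way the decorator above could be applied is sketched below. The decorator is repeated so that the example runs on its own; HasattrExample and the attribute name 'no_such_method' are made up for the illustration.

```python
import unittest

def skipUnlessHasattr(obj, attr):
    # Same decorator as above, repeated so the sketch is self-contained.
    if hasattr(obj, attr):
        return lambda func: func
    return unittest.skip("{0!r} doesn't have {1!r}".format(obj, attr))

class HasattrExample(unittest.TestCase):
    @skipUnlessHasattr(str, 'upper')
    def test_upper(self):
        self.assertEqual('abc'.upper(), 'ABC')

    @skipUnlessHasattr(str, 'no_such_method')   # made-up attribute name
    def test_missing(self):
        self.fail('should have been skipped')

result = unittest.TestResult()
unittest.TestLoader().loadTestsFromTestCase(HasattrExample).run(result)
skipped_names = [test.id().split('.')[-1] for test, _ in result.skipped]
```

Only the test whose required attribute is absent ends up in ``result.skipped``; the other runs normally.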
The following decorators implement test skipping and expected failures:
skip(reason)~
Unconditionally skip the decorated test. {reason} should describe why the
test is being skipped.
skipIf(condition, reason)~
Skip the decorated test if {condition} is true.
skipUnless(condition, reason)~
Skip the decorated test unless {condition} is true.
expectedFailure~
Mark the test as an expected failure. If the test fails when run, the test
is not counted as a failure.
Skipped tests will not have setUp or tearDown run around them.
Skipped classes will not have setUpClass or tearDownClass run.
Classes and functions
---------------------
This section describes in depth the API of unittest (|py2stdlib-unittest|).
Test cases
~~~~~~~~~~
TestCase([methodName])~
Instances of the TestCase class represent the smallest testable units
in the unittest (|py2stdlib-unittest|) universe. This class is intended to be used as a base
class, with specific tests being implemented by concrete subclasses. This class
implements the interface needed by the test runner to allow it to drive the
test, and methods that the test code can use to check for and report various
kinds of failure.
Each instance of TestCase will run a single test method: the method
named {methodName}. If you remember, we had an earlier example that went
something like this:: >
def suite():
    suite = unittest.TestSuite()
    suite.addTest(WidgetTestCase('test_default_size'))
    suite.addTest(WidgetTestCase('test_resize'))
    return suite
<
Here, we create two instances of WidgetTestCase, each of which runs a
single test.
{methodName} defaults to runTest.
TestCase instances provide three groups of methods: one group used
to run the test, another used by the test implementation to check conditions
and report failures, and some inquiry methods allowing information about the
test itself to be gathered.
Methods in the first group (running the test) are:
setUp()~
Method called to prepare the test fixture. This is called immediately
before calling the test method; any exception raised by this method will
be considered an error rather than a test failure. The default
implementation does nothing.
tearDown()~
Method called immediately after the test method has been called and the
result recorded. This is called even if the test method raised an
exception, so the implementation in subclasses may need to be particularly
careful about checking internal state. Any exception raised by this
method will be considered an error rather than a test failure. This
method will only be called if the setUp succeeds, regardless of
the outcome of the test method. The default implementation does nothing.
setUpClass()~
A class method called before tests in an individual class run.
``setUpClass`` is called with the class as the only argument
and must be decorated as a classmethod:: >
@classmethod
def setUpClass(cls):
    ...
<
See "Class and Module Fixtures" for more details.
.. versionadded:: 2.7
tearDownClass()~
A class method called after tests in an individual class have run.
``tearDownClass`` is called with the class as the only argument
and must be decorated as a classmethod:: >
@classmethod
def tearDownClass(cls):
    ...
<
See "Class and Module Fixtures" for more details.
.. versionadded:: 2.7
run([result])~
Run the test, collecting the result into the test result object passed as
{result}. If {result} is omitted or None, a temporary result
object is created (by calling the defaultTestResult method) and
used. The result object is not returned to run's caller.
The same effect may be had by simply calling the TestCase
instance.
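As a brief sketch, a single test case instance can be run against an explicit result object either through run or by calling the instance. RunExample is a hypothetical class used only here.

```python
import unittest

class RunExample(unittest.TestCase):
    def test_addition(self):
        self.assertEqual(1 + 1, 2)

test = RunExample('test_addition')

# Explicit result object: outcomes accumulate in it.
result = unittest.TestResult()
test.run(result)

# Calling the instance has the same effect as calling run().
test(result)
```

Both calls record their outcome in the same result object, so it reports two tests run.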
skipTest(reason)~
Calling this during a test method or setUp skips the current
test. See "Skipping tests and expected failures" for more information.
.. versionadded:: 2.7
debug()~
Run the test without collecting the result. This allows exceptions raised
by the test to be propagated to the caller, and can be used to support
running tests under a debugger.
The test code can use any of the following methods to check for and report
failures.
assertTrue(expr[, msg])~
assert_(expr[, msg])
failUnless(expr[, msg])
Signal a test failure if {expr} is false; the explanation for the failure
will be {msg} if given, otherwise it will be None.
.. deprecated:: 2.7
failUnless and assert_; use assertTrue.
assertEqual(first, second[, msg])~
failUnlessEqual(first, second[, msg])
Test that {first} and {second} are equal. If the values do not compare
equal, the test will fail with the explanation given by {msg}, or
None. Note that using assertEqual improves upon
doing the comparison as the first parameter to assertTrue: the
default value for {msg} includes representations of both {first} and
{second}.
In addition, if {first} and {second} are the exact same type and one of
list, tuple, dict, set, frozenset or unicode or any type that a subclass
registers with addTypeEqualityFunc the type specific equality
function will be called in order to generate a more useful default error
message.
.. versionchanged:: 2.7
Added the automatic calling of type specific equality function.
.. deprecated:: 2.7
failUnlessEqual; use assertEqual.
assertNotEqual(first, second[, msg])~
failIfEqual(first, second[, msg])
Test that {first} and {second} are not equal. If the values do compare
equal, the test will fail with the explanation given by {msg}, or
None. Note that using assertNotEqual improves upon doing
the comparison as the first parameter to assertTrue: the
default value for {msg} includes representations of both
{first} and {second}.
.. deprecated:: 2.7
failIfEqual; use assertNotEqual.
assertAlmostEqual(first, second[, places[, msg[, delta]]])~
failUnlessAlmostEqual(first, second[, places[, msg[, delta]]])
Test that {first} and {second} are approximately equal by computing the
difference, rounding to the given number of decimal {places} (default 7),
and comparing to zero.
Note that comparing a given number of decimal places is not the same as
comparing a given number of significant digits. If the values do not
compare equal, the test will fail with the explanation given by {msg}, or
None.
If {delta} is supplied instead of {places} then the difference
between {first} and {second} must be less than or equal to {delta}.
Supplying both {delta} and {places} raises a ``TypeError``.
.. versionchanged:: 2.7
Objects that compare equal are automatically almost equal.
Added the ``delta`` keyword argument.
.. deprecated:: 2.7
failUnlessAlmostEqual; use assertAlmostEqual.
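The effect of {places} and {delta} can be tried out directly on a bare test case instance, as in this sketch. The Checker class exists only so that the assert methods can be called outside a test run; the numeric values are arbitrary.

```python
import unittest

class Checker(unittest.TestCase):
    # Bare instance used only to call the assert* methods directly.
    def runTest(self):
        pass

check = Checker()

# The difference (about 1e-8) rounds to zero at the default 7 places.
check.assertAlmostEqual(1.00000001, 1.0)

# With 9 places, the same difference no longer rounds to zero.
try:
    check.assertAlmostEqual(1.00000001, 1.0, places=9)
    stricter_passed = True
except check.failureException:
    stricter_passed = False

# delta compares the absolute difference against a tolerance instead.
check.assertAlmostEqual(10.0, 10.4, delta=0.5)
check.assertNotAlmostEqual(10.0, 10.4, delta=0.1)

# Supplying both places and delta is rejected.
try:
    check.assertAlmostEqual(1.0, 1.0001, places=2, delta=0.1)
    both_rejected = False
except TypeError:
    both_rejected = True
```

Note that {places} counts decimal places of the difference, so it is an absolute tolerance, just expressed differently from {delta}.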
assertNotAlmostEqual(first, second[, places[, msg[, delta]]])~
failIfAlmostEqual(first, second[, places[, msg[, delta]]])
Test that {first} and {second} are not approximately equal by computing
the difference, rounding to the given number of decimal {places} (default
7), and comparing to zero.
Note that comparing a given number of decimal places is not the same as
comparing a given number of significant digits. If the values do compare
approximately equal, the test will fail with the explanation given by {msg}, or
None.
If {delta} is supplied instead of {places} then the difference
between {first} and {second} must be more than {delta}.
Supplying both {delta} and {places} raises a ``TypeError``.
.. versionchanged:: 2.7
Objects that compare equal automatically fail.
Added the ``delta`` keyword argument.
.. deprecated:: 2.7
failIfAlmostEqual; use assertNotAlmostEqual.
assertGreater(first, second, msg=None)~
assertGreaterEqual(first, second, msg=None)
assertLess(first, second, msg=None)
assertLessEqual(first, second, msg=None)
Test that {first} is respectively >, >=, < or <= than {second} depending
on the method name. If not, the test will fail with a default explanation
or with the explanation given by {msg}:: >
>>> self.assertGreaterEqual(3, 4)
AssertionError: "3" unexpectedly not greater than or equal to "4"
<
.. versionadded:: 2.7
assertMultiLineEqual(first, second, msg=None)~
Test that the multiline string {first} is equal to the string {second}.
When not equal a diff of the two strings highlighting the differences
will be included in the error message. This method is used by default
when comparing Unicode strings with assertEqual.
If specified, {msg} will be used as the error message on failure.
.. versionadded:: 2.7
assertRegexpMatches(text, regexp, msg=None)~
Verifies that a {regexp} search matches {text}. Fails with an error
message including the pattern and the {text}. {regexp} may be
a regular expression object or a string containing a regular expression
suitable for use by re.search.
.. versionadded:: 2.7
assertNotRegexpMatches(text, regexp, msg=None)~
Verifies that a {regexp} search does not match {text}. Fails with an error
message including the pattern and the part of {text} that matches. {regexp}
may be a regular expression object or a string containing a regular
expression suitable for use by re.search.
.. versionadded:: 2.7
assertIn(first, second, msg=None)~
assertNotIn(first, second, msg=None)
Tests that {first} is or is not in {second} with an explanatory error
message as appropriate.
If specified, {msg} will be used as the error message on failure.
.. versionadded:: 2.7
assertItemsEqual(actual, expected, msg=None)~
Test that sequence {expected} contains the same elements as {actual},
regardless of their order. When they don't, an error message listing the
differences between the sequences will be generated.
Duplicate elements are {not} ignored when comparing {actual} and
{expected}. It verifies if each element has the same count in both
sequences. It is the equivalent of ``assertEqual(sorted(expected),
sorted(actual))`` but it works with sequences of unhashable objects as
well.
If specified, {msg} will be used as the error message on failure.
.. versionadded:: 2.7
assertSetEqual(set1, set2, msg=None)~
Tests that two sets are equal. If not, an error message is constructed
that lists the differences between the sets. This method is used by
default when comparing sets or frozensets with assertEqual.
Fails if either of {set1} or {set2} does not have a set.difference
method.
If specified, {msg} will be used as the error message on failure.
.. versionadded:: 2.7
assertDictEqual(expected, actual, msg=None)~
Test that two dictionaries are equal. If not, an error message is
constructed that shows the differences in the dictionaries. This
method will be used by default to compare dictionaries in
calls to assertEqual.
If specified, {msg} will be used as the error message on failure.
.. versionadded:: 2.7
assertDictContainsSubset(expected, actual, msg=None)~
Tests whether the key/value pairs in dictionary {actual} are a
superset of those in {expected}. If not, an error message listing
the missing keys and mismatched values is generated.
If specified, {msg} will be used as the error message on failure.
.. versionadded:: 2.7
assertListEqual(list1, list2, msg=None)~
assertTupleEqual(tuple1, tuple2, msg=None)
Tests that two lists or tuples are equal. If not, an error message is
constructed that shows only the differences between the two. An error
is also raised if either of the parameters is of the wrong type.
These methods are used by default when comparing lists or tuples with
assertEqual.
If specified, {msg} will be used as the error message on failure.
.. versionadded:: 2.7
assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)~
Tests that two sequences are equal. If a {seq_type} is supplied, both
{seq1} and {seq2} must be instances of {seq_type} or a failure will
be raised. If the sequences are different an error message is
constructed that shows the difference between the two.
If specified, {msg} will be used as the error message on failure.
This method is used to implement assertListEqual and
assertTupleEqual.
.. versionadded:: 2.7
assertRaises(exception[, callable, ...])~
failUnlessRaises(exception[, callable, ...])
Test that an exception is raised when {callable} is called with any
positional or keyword arguments that are also passed to
assertRaises. The test passes if {exception} is raised, is an
error if another exception is raised, or fails if no exception is raised.
To catch any of a group of exceptions, a tuple containing the exception
classes may be passed as {exception}.
If {callable} is omitted or None, returns a context manager so that the
code under test can be written inline rather than as a function:: >
with self.assertRaises(SomeException):
    do_something()
<
The context manager will store the caught exception object in its
exception attribute. This can be useful if the intention
is to perform additional checks on the exception raised:: >
with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
<
.. versionchanged:: 2.7
Added the ability to use assertRaises as a context manager.
.. deprecated:: 2.7
failUnlessRaises; use assertRaises.
assertRaisesRegexp(exception, regexp[, callable, ...])~
Like assertRaises but also tests that {regexp} matches
on the string representation of the raised exception. {regexp} may be
a regular expression object or a string containing a regular expression
suitable for use by re.search. Examples:: >
self.assertRaisesRegexp(ValueError, 'invalid literal for.*XYZ$',
                        int, 'XYZ')
<
or:: >
with self.assertRaisesRegexp(ValueError, 'literal'):
    int('XYZ')
<
.. versionadded:: 2.7
assertIsNone(expr[, msg])~
This signals a test failure if {expr} is not None.
.. versionadded:: 2.7
assertIsNotNone(expr[, msg])~
The inverse of the assertIsNone method.
This signals a test failure if {expr} is None.
.. versionadded:: 2.7
assertIs(expr1, expr2[, msg])~
This signals a test failure if {expr1} and {expr2} don't evaluate to the same
object.
.. versionadded:: 2.7
assertIsNot(expr1, expr2[, msg])~
The inverse of the assertIs method.
This signals a test failure if {expr1} and {expr2} evaluate to the same
object.
.. versionadded:: 2.7
assertIsInstance(obj, cls[, msg])~
This signals a test failure if {obj} is not an instance of {cls} (which
can be a class or a tuple of classes, as supported by isinstance).
.. versionadded:: 2.7
assertNotIsInstance(obj, cls[, msg])~
The inverse of the assertIsInstance method. This signals a test
failure if {obj} is an instance of {cls}.
.. versionadded:: 2.7
assertFalse(expr[, msg])~
failIf(expr[, msg])
The inverse of the assertTrue method is the assertFalse method.
This signals a test failure if {expr} is true, with {msg} or None
for the error message.
.. deprecated:: 2.7
failIf; use assertFalse.
fail([msg])~
Signals a test failure unconditionally, with {msg} or None for
the error message.
failureException~
This class attribute gives the exception raised by the test method. If a
test framework needs to use a specialized exception, possibly to carry
additional information, it must subclass this exception in order to "play
fair" with the framework. The initial value of this attribute is
AssertionError.
longMessage~
If set to True then any explicit failure message you pass in to the
assert methods will be appended to the end of the normal failure message.
The normal messages contain useful information about the objects involved,
for example the message from assertEqual shows you the repr of the two
unequal objects. Setting this attribute to True allows you to have a
custom error message in addition to the normal one.
This attribute defaults to False, meaning that a custom message passed
to an assert method will silence the normal message.
The class setting can be overridden in individual tests by assigning an
instance attribute to True or False before calling the assert methods.
.. versionadded:: 2.7
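To see both behaviours side by side, the sketch below sets longMessage explicitly on an instance and captures the resulting failure text. Checker and failure_text are helper names assumed for this example.

```python
import unittest

class Checker(unittest.TestCase):
    # Bare instance used only to call the assert* methods directly.
    def runTest(self):
        pass

def failure_text(long_message):
    check = Checker()
    check.longMessage = long_message   # instance attribute overrides the class
    try:
        check.assertEqual(1, 2, 'custom note')
    except check.failureException as exc:
        return str(exc)

appended = failure_text(True)    # normal message followed by the custom one
replaced = failure_text(False)   # custom message only
```

With longMessage set to True the text contains both ``1 != 2`` and the custom note; with False only the custom note remains.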
maxDiff~
This attribute controls the maximum length of diffs output by assert
methods that report diffs on failure. It defaults to 80*8 characters.
Assert methods affected by this attribute are
assertSequenceEqual (including all the sequence comparison
methods that delegate to it), assertDictEqual and
assertMultiLineEqual.
Setting ``maxDiff`` to None means that there is no maximum length of
diffs.
.. versionadded:: 2.7
Testing frameworks can use the following methods to collect information on
the test:
countTestCases()~
Return the number of tests represented by this test object. For
TestCase instances, this will always be ``1``.
defaultTestResult()~
Return an instance of the test result class that should be used for this
test case class (if no other result instance is provided to the
run method).
For TestCase instances, this will always be an instance of
TestResult; subclasses of TestCase should override this
as necessary.
id()~
Return a string identifying the specific test case. This is usually the
full name of the test method, including the module and class name.
shortDescription()~
Returns a description of the test, or None if no description
has been provided. The default implementation of this method
returns the first line of the test method's docstring, if available,
or None.
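The inquiry methods above can be exercised without running any test, as in this sketch. InquiryExample and its docstring are invented for the illustration.

```python
import unittest

class InquiryExample(unittest.TestCase):
    def test_documented(self):
        """Widgets start out empty.

        Only the first line above is used as the short description.
        """

    def test_undocumented(self):
        pass

documented = InquiryExample('test_documented')
undocumented = InquiryExample('test_undocumented')

count = documented.countTestCases()
identifier = documented.id()                   # module.class.method
description = documented.shortDescription()
no_description = undocumented.shortDescription()
```

The exact value of ``identifier`` depends on the module in which the class is defined, so only its class and method part is predictable.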
addTypeEqualityFunc(typeobj, function)~
Registers a type specific assertEqual equality checking
function to be called by assertEqual when both objects it has
been asked to compare are exactly {typeobj} (not subclasses).
{function} must take two positional arguments and a third msg=None
keyword argument just as assertEqual does. It must raise
``self.failureException`` when inequality between the first two
parameters is detected.
One good use of custom equality checking functions for a type
is to raise ``self.failureException`` with an error message useful
for debugging the problem by explaining the inequalities in detail.
.. versionadded:: 2.7
addCleanup(function[, *args[, **kwargs]])~
Add a function to be called after tearDown to clean up resources
used during the test. Functions will be called in reverse order to the
order they are added (LIFO). They are called with any arguments and
keyword arguments passed into addCleanup when they are
added.
If setUp fails, meaning that tearDown is not called,
then any cleanup functions added will still be called.
.. versionadded:: 2.7
doCleanups()~
This method is called unconditionally after tearDown, or
after setUp if setUp raises an exception.
It is responsible for calling all the cleanup functions added by
addCleanup. If you need cleanup functions to be called
{prior} to tearDown then you can call doCleanups
yourself.
doCleanups pops methods off the stack of cleanup
functions one at a time, so it can be called at any time.
.. versionadded:: 2.7
FunctionTestCase(testFunc[, setUp[, tearDown[, description]]])~
This class implements the portion of the TestCase interface which
allows the test runner to drive the test, but does not provide the methods
which test code can use to check and report errors. This is used to create
test cases using legacy test code, allowing it to be integrated into a
unittest (|py2stdlib-unittest|)-based test framework.
Grouping tests
~~~~~~~~~~~~~~
TestSuite([tests])~
This class represents an aggregation of individual test cases and test suites.
The class presents the interface needed by the test runner to allow it to be run
as any other test case. Running a TestSuite instance is the same as
iterating over the suite, running each test individually.
If {tests} is given, it must be an iterable of individual test cases or other
test suites that will be used to build the suite initially. Additional methods
are provided to add test cases and suites to the collection later on.
TestSuite objects behave much like TestCase objects, except
they do not actually implement a test. Instead, they are used to aggregate
tests into groups of tests that should be run together. Some additional
methods are available to add tests to TestSuite instances:
TestSuite.addTest(test)~
Add a TestCase or TestSuite to the suite.
TestSuite.addTests(tests)~
Add all the tests from an iterable of TestCase and TestSuite
instances to this test suite.
This is equivalent to iterating over {tests}, calling addTest for
each element.
TestSuite shares the following methods with TestCase:
run(result)~
Run the tests associated with this suite, collecting the result into the
test result object passed as {result}. Note that unlike
TestCase.run, TestSuite.run requires the result object to
be passed in.
debug()~
Run the tests associated with this suite without collecting the
result. This allows exceptions raised by the test to be propagated to the
caller and can be used to support running tests under a debugger.
countTestCases()~
Return the number of tests represented by this test object, including all
individual tests and sub-suites.
__iter__()~
Tests grouped by a TestSuite are always accessed by iteration.
Subclasses can lazily provide tests by overriding __iter__. Note
that this method may be called several times on a single suite
(for example when counting tests or comparing for equality)
so the tests returned must be the same for repeated iterations.
.. versionchanged:: 2.7
In earlier versions the TestSuite accessed tests directly rather
than through iteration, so overriding __iter__ wasn't sufficient
for providing tests.
In the typical usage of a TestSuite object, the run method
is invoked by a TestRunner rather than by the end-user test harness.
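The suite methods described in this section can be combined as in the sketch below, which nests one suite inside another. SuiteExample is a hypothetical test case class with three trivial tests.

```python
import unittest

class SuiteExample(unittest.TestCase):
    def test_one(self): pass
    def test_two(self): pass
    def test_three(self): pass

inner = unittest.TestSuite([SuiteExample('test_one'),
                            SuiteExample('test_two')])
outer = unittest.TestSuite()
outer.addTest(inner)                        # a suite inside a suite
outer.addTests([SuiteExample('test_three')])

total = outer.countTestCases()   # counts through nested suites
top_level = len(list(outer))     # iteration yields direct members only

result = unittest.TestResult()
outer.run(result)                # the result object is required here
```

countTestCases looks through the nested suite, whereas iterating over ``outer`` yields only its two direct members.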
Loading and running tests
~~~~~~~~~~~~~~~~~~~~~~~~~
TestLoader()~
The TestLoader class is used to create test suites from classes and
modules. Normally, there is no need to create an instance of this class; the
unittest (|py2stdlib-unittest|) module provides an instance that can be shared as
``unittest.defaultTestLoader``. Using a subclass or instance, however, allows
customization of some configurable properties.
TestLoader objects have the following methods:
loadTestsFromTestCase(testCaseClass)~
Return a suite of all test cases contained in the TestCase-derived
{testCaseClass}.
loadTestsFromModule(module)~
Return a suite of all test cases contained in the given module. This
method searches {module} for classes derived from TestCase and
creates an instance of the class for each test method defined for the
class.
.. note::
While using a hierarchy of TestCase-derived classes can be
convenient in sharing fixtures and helper functions, defining test
methods on base classes that are not intended to be instantiated
directly does not play well with this method. Doing so, however, can
be useful when the fixtures are different and defined in subclasses.
If a module provides a ``load_tests`` function it will be called to
load the tests. This allows modules to customize test loading.
This is the load_tests protocol.
.. versionchanged:: 2.7
Support for ``load_tests`` added.
loadTestsFromName(name[, module])~
Return a suite of all test cases given a string specifier.
The specifier {name} is a "dotted name" that may resolve either to a
module, a test case class, a test method within a test case class, a
TestSuite instance, or a callable object which returns a
TestCase or TestSuite instance. These checks are
applied in the order listed here; that is, a method on a possible test
case class will be picked up as "a test method within a test case class",
rather than "a callable object".
For example, if you have a module SampleTests containing a
TestCase-derived class SampleTestCase with three test
methods (test_one, test_two, and test_three), the
specifier ``'SampleTests.SampleTestCase'`` would cause this method to
return a suite which will run all three test methods. Using the specifier
``'SampleTests.SampleTestCase.test_two'`` would cause it to return a test
suite which will run only the test_two test method. The specifier
can refer to modules and packages which have not been imported; they will
be imported as a side-effect.
The method optionally resolves {name} relative to the given {module}.
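The SampleTests example can be approximated in a single file by resolving names relative to a module object passed as {module}. In this sketch the module is built with types.ModuleType as a stand-in for a real SampleTests module on disk, which is an assumption made purely to keep the example self-contained.

```python
import types
import unittest

class SampleTestCase(unittest.TestCase):
    def test_one(self): pass
    def test_two(self): pass
    def test_three(self): pass

# Stand-in for a real SampleTests module on disk.
SampleTests = types.ModuleType('SampleTests')
SampleTests.SampleTestCase = SampleTestCase

loader = unittest.TestLoader()
# Names are resolved relative to the module given as second argument.
whole_class = loader.loadTestsFromName('SampleTestCase', SampleTests)
one_method = loader.loadTestsFromName('SampleTestCase.test_two', SampleTests)

method_ids = [test.id().split('.')[-1] for test in one_method]
```

The class specifier yields a suite of all three tests, while the method specifier yields a suite containing only test_two.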
loadTestsFromNames(names[, module])~
Similar to loadTestsFromName, but takes a sequence of names rather
than a single name. The return value is a test suite which supports all
the tests defined for each name.
getTestCaseNames(testCaseClass)~
Return a sorted sequence of method names found within {testCaseClass};
this should be a subclass of TestCase.
discover(start_dir, pattern='test*.py', top_level_dir=None)~
Find and return all test modules from the specified start directory,
recursing into subdirectories to find them. Only test files that match
{pattern} will be loaded. (Using shell style pattern matching.) Only
module names that are importable (i.e. are valid Python identifiers) will
be loaded.
All test modules must be importable from the top level of the project. If
the start directory is not the top level directory then the top level
directory must be specified separately.
If importing a module fails, for example due to a syntax error, then this
will be recorded as a single error and discovery will continue.
If a test package name (directory with __init__.py) matches the
pattern then the package will be checked for a ``load_tests``
function. If this exists then it will be called with {loader}, {tests},
{pattern}.
If load_tests exists then discovery does {not} recurse into the package,
``load_tests`` is responsible for loading all tests in the package.
The pattern is deliberately not stored as a loader attribute so that
packages can continue discovery themselves. {top_level_dir} is stored so
``load_tests`` does not need to pass this argument in to
``loader.discover()``.
{start_dir} can be a dotted module name as well as a directory.
.. versionadded:: 2.7
The following attributes of a TestLoader can be configured either by
subclassing or assignment on an instance:
testMethodPrefix~
String giving the prefix of method names which will be interpreted as test
methods. The default value is ``'test'``.
This affects getTestCaseNames and all the loadTestsFrom*
methods.
sortTestMethodsUsing~
Function to be used to compare method names when sorting them in
getTestCaseNames and all the loadTestsFrom* methods. The
default value is the built-in cmp function; the attribute can also
be set to None to disable the sort.
suiteClass~
Callable object that constructs a test suite from a list of tests. No
methods on the resulting object are needed. The default value is the
TestSuite class.
This affects all the loadTestsFrom* methods.
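As a rough illustration of configuring these attributes by assignment, the
sketch below changes the method prefix and then installs a reversed
comparison function. The class ``CheckStyle`` and its method names are
invented for the example.

```python
import unittest


class CheckStyle(unittest.TestCase):
    def check_beta(self):
        pass

    def check_alpha(self):
        pass

    def test_ignored(self):
        # Not collected once the prefix is changed to 'check'.
        pass


loader = unittest.TestLoader()
loader.testMethodPrefix = 'check'

# Default ordering: sorted by name.
names = list(loader.getTestCaseNames(CheckStyle))

# A cmp-style function returning a positive number when a < b
# reverses the ordering.
loader.sortTestMethodsUsing = lambda a, b: (a < b) - (a > b)
reversed_names = list(loader.getTestCaseNames(CheckStyle))

suite = loader.loadTestsFromTestCase(CheckStyle)
```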
TestResult~
This class is used to compile information about which tests have succeeded
and which have failed.
A TestResult object stores the results of a set of tests. The
TestCase and TestSuite classes ensure that results are
properly recorded; test authors do not need to worry about recording the
outcome of tests.
Testing frameworks built on top of unittest (|py2stdlib-unittest|) may want access to the
TestResult object generated by running a set of tests for reporting
purposes; a TestResult instance is returned by the
TestRunner.run method for this purpose.
TestResult instances have the following attributes that will be of
interest when inspecting the results of running a set of tests:
errors~
A list containing 2-tuples of TestCase instances and strings
holding formatted tracebacks. Each tuple represents a test which raised an
unexpected exception.
.. versionchanged:: 2.2
Contains formatted tracebacks instead of sys.exc_info results.
failures~
A list containing 2-tuples of TestCase instances and strings
holding formatted tracebacks. Each tuple represents a test where a failure
was explicitly signalled using the TestCase.fail* or
TestCase.assert* methods.
.. versionchanged:: 2.2
Contains formatted tracebacks instead of sys.exc_info results.
skipped~
A list containing 2-tuples of TestCase instances and strings
holding the reason for skipping the test.
.. versionadded:: 2.7
expectedFailures~
A list containing 2-tuples of TestCase instances and strings
holding formatted tracebacks. Each tuple represents an expected failure
of the test case.
unexpectedSuccesses~
A list containing TestCase instances that were marked as expected
failures, but succeeded.
shouldStop~
Set to ``True`` by stop when the execution of tests should stop.
testsRun~
The total number of tests run so far.
buffer~
If set to true, ``sys.stdout`` and ``sys.stderr`` will be buffered in between
startTest and stopTest being called. Collected output will
only be echoed onto the real ``sys.stdout`` and ``sys.stderr`` if the test
fails or errors. Any output is also attached to the failure / error message.
.. versionadded:: 2.7
failfast~
If set to true stop will be called on the first failure or error,
halting the test run.
.. versionadded:: 2.7
wasSuccessful()~
Return True if all tests run so far have passed, otherwise returns
False.
stop()~
This method can be called to signal that the set of tests being run should
be aborted by setting the shouldStop attribute to True.
TestRunner objects should respect this flag and return without
running any additional tests.
For example, this feature is used by the TextTestRunner class to
stop the test framework when the user signals an interrupt from the
keyboard. Interactive tools which provide TestRunner
implementations can use this in a similar manner.
The following methods of the TestResult class are used to maintain
the internal data structures, and may be extended in subclasses to support
additional reporting requirements. This is particularly useful in building
tools which support interactive reporting while tests are being run.
startTest(test)~
Called when the test case {test} is about to be run.
stopTest(test)~
Called after the test case {test} has been executed, regardless of the
outcome.
startTestRun()~
Called once before any tests are executed.
.. versionadded:: 2.7
stopTestRun()~
Called once after all tests are executed.
.. versionadded:: 2.7
addError(test, err)~
Called when the test case {test} raises an unexpected exception. {err} is a
tuple of the form returned by sys.exc_info: ``(type, value,
traceback)``.
The default implementation appends a tuple ``(test, formatted_err)`` to
the instance's errors attribute, where {formatted_err} is a
formatted traceback derived from {err}.
addFailure(test, err)~
Called when the test case {test} signals a failure. {err} is a tuple of
the form returned by sys.exc_info: ``(type, value, traceback)``.
The default implementation appends a tuple ``(test, formatted_err)`` to
the instance's failures attribute, where {formatted_err} is a
formatted traceback derived from {err}.
addSuccess(test)~
Called when the test case {test} succeeds.
The default implementation does nothing.
addSkip(test, reason)~
Called when the test case {test} is skipped. {reason} is the reason the
test gave for skipping.
The default implementation appends a tuple ``(test, reason)`` to the
instance's skipped attribute.
addExpectedFailure(test, err)~
Called when the test case {test} fails, but was marked with the
expectedFailure decorator.
The default implementation appends a tuple ``(test, formatted_err)`` to
the instance's expectedFailures attribute, where {formatted_err}
is a formatted traceback derived from {err}.
addUnexpectedSuccess(test)~
Called when the test case {test} was marked with the
expectedFailure decorator, but succeeded.
The default implementation appends the test to the instance's
unexpectedSuccesses attribute.
TextTestResult(stream, descriptions, verbosity)~
A concrete implementation of TestResult used by the
TextTestRunner.
.. versionadded:: 2.7
This class was previously named ``_TextTestResult``. The old name still
exists as an alias but is deprecated.
defaultTestLoader~
Instance of the TestLoader class intended to be shared. If no
customization of the TestLoader is needed, this instance can be used
instead of repeatedly creating new instances.
TextTestRunner([stream[, descriptions[, verbosity], [resultclass]]])~
A basic test runner implementation which prints results on standard error. It
has a few configurable parameters, but is essentially very simple. Graphical
applications which run test suites should provide alternate implementations.
_makeResult()~
This method returns the instance of ``TestResult`` used by run.
It is not intended to be called directly, but can be overridden in
subclasses to provide a custom ``TestResult``.
``_makeResult()`` instantiates the class or callable passed in the
``TextTestRunner`` constructor as the ``resultclass`` argument. It
defaults to TextTestResult if no ``resultclass`` is provided.
The result class is instantiated with the following arguments:: >
stream, descriptions, verbosity
<
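One way to use the ``resultclass`` argument is sketched here. The subclass
``CountingTextResult`` and its ``successes`` counter are invented for the
example, and output is sent to an in-memory stream so nothing is printed.

```python
import unittest

try:
    from StringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO         # fallback so the sketch also runs on Python 3


class CountingTextResult(unittest.TextTestResult):
    """TextTestResult that additionally counts passing tests."""

    def __init__(self, *args, **kwargs):
        # The runner passes stream, descriptions and verbosity.
        super(CountingTextResult, self).__init__(*args, **kwargs)
        self.successes = 0

    def addSuccess(self, test):
        super(CountingTextResult, self).addSuccess(test)
        self.successes += 1


class Demo(unittest.TestCase):
    def test_a(self):
        pass

    def test_b(self):
        pass


runner = unittest.TextTestRunner(stream=StringIO(), verbosity=0,
                                 resultclass=CountingTextResult)
result = runner.run(unittest.TestLoader().loadTestsFromTestCase(Demo))
```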
main([module[, defaultTest[, argv[, testRunner[, testLoader[, exit[, verbosity[, failfast[, catchbreak[, buffer]]]]]]]]]])~
A command-line program that runs a set of tests; this is primarily for making
test modules conveniently executable. The simplest use for this function is to
include the following line at the end of a test script:: >
if __name__ == '__main__':
    unittest.main()
<
You can run tests with more detailed information by passing in the verbosity
argument:: >
if __name__ == '__main__':
    unittest.main(verbosity=2)
<
The {testRunner} argument can either be a test runner class or an already
created instance of it. By default ``main`` calls sys.exit with
an exit code indicating success or failure of the tests run.
``main`` supports being used from the interactive interpreter by passing in the
argument ``exit=False``. This displays the result on standard output without
calling sys.exit:: >
>>> from unittest import main
>>> main(module='test_module', exit=False)
<
The ``failfast``, ``catchbreak`` and ``buffer`` parameters have the same
effect as the failfast, catch and buffer command line options.
Calling ``main`` actually returns an instance of the ``TestProgram`` class.
This stores the result of the tests run as the ``result`` attribute.
.. versionchanged:: 2.7
The ``exit``, ``verbosity``, ``failfast``, ``catchbreak`` and ``buffer``
parameters were added.
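The ``exit=False`` form and the ``result`` attribute can be combined as in
this sketch. The module object ``demo_module`` is assembled by hand so the
example does not depend on a file on disk (an assumption for illustration),
and an explicit {argv} is passed so the real command line is not parsed.

```python
import types
import unittest

try:
    from StringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO         # fallback so the sketch also runs on Python 3


class Arithmetic(unittest.TestCase):
    def test_add(self):
        self.assertEqual(1 + 2, 3)


# Hypothetical stand-in for an importable test module.
demo_module = types.ModuleType('demo_tests')
demo_module.Arithmetic = Arithmetic

program = unittest.main(
    module=demo_module,
    argv=['demo'],          # program name only; no options or test names
    testRunner=unittest.TextTestRunner(stream=StringIO(), verbosity=0),
    exit=False,             # return instead of calling sys.exit
)
outcome = program.result    # the TestResult of the run
```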
load_tests Protocol
###################
.. versionadded:: 2.7
Modules or packages can customize how tests are loaded from them during normal
test runs or test discovery by implementing a function called ``load_tests``.
If a test module defines ``load_tests`` it will be called by
TestLoader.loadTestsFromModule with the following arguments:: >
load_tests(loader, standard_tests, None)
<
It should return a TestSuite.
{loader} is the instance of TestLoader doing the loading.
{standard_tests} are the tests that would be loaded by default from the
module. It is common for test modules to only want to add or remove tests
from the standard set of tests.
The third argument is used when loading packages as part of test discovery.
A typical ``load_tests`` function that loads tests from a specific set of
TestCase classes may look like:: >
import unittest
test_cases = (TestCase1, TestCase2, TestCase3)
def load_tests(loader, tests, pattern):
    # Build a fresh suite from the listed classes only.
    suite = unittest.TestSuite()
    for test_class in test_cases:
        tests = loader.loadTestsFromTestCase(test_class)
        suite.addTests(tests)
    return suite
<
If discovery is started, either from the command line or by calling
TestLoader.discover, with a pattern that matches a package
name then the package __init__.py will be checked for ``load_tests``.
.. note::
The default pattern is 'test*.py'. This matches all Python files
that start with 'test' but {won't} match any test directories.
A pattern like 'test*' will match test packages as well as
modules.
If the package __init__.py defines ``load_tests`` then it will be
called and discovery not continued into the package. ``load_tests``
is called with the following arguments:: >
load_tests(loader, standard_tests, pattern)
<
This should return a TestSuite representing all the tests
from the package. (``standard_tests`` will only contain tests
collected from __init__.py.)
Because the pattern is passed into ``load_tests`` the package is free to
continue (and potentially modify) test discovery. A 'do nothing'
``load_tests`` function for a test package would look like:: >
import os
def load_tests(loader, standard_tests, pattern):
    # top level directory cached on loader instance
    this_dir = os.path.dirname(__file__)
    package_tests = loader.discover(start_dir=this_dir, pattern=pattern)
    standard_tests.addTests(package_tests)
    return standard_tests
<
Class and Module Fixtures
-------------------------
Class and module level fixtures are implemented in TestSuite. When
the test suite encounters a test from a new class then tearDownClass
from the previous class (if there is one) is called, followed by
setUpClass from the new class.
Similarly if a test is from a different module from the previous test then
``tearDownModule`` from the previous module is run, followed by
``setUpModule`` from the new module.
After all the tests have run the final ``tearDownClass`` and
``tearDownModule`` are run.
Note that shared fixtures do not play well with [potential] features like test
parallelization and they break test isolation. They should be used with care.
The default ordering of tests created by the unittest test loaders is to group
all tests from the same modules and classes together. This will lead to
``setUpClass`` / ``setUpModule`` (etc) being called exactly once per class and
module. If you randomize the order, so that tests from different modules and
classes are adjacent to each other, then these shared fixture functions may be
called multiple times in a single test run.
Shared fixtures are not intended to work with suites with non-standard
ordering. A ``BaseTestSuite`` still exists for frameworks that don't want to
support shared fixtures.
If there are any exceptions raised during one of the shared fixture functions
the test is reported as an error. Because there is no corresponding test
instance an ``_ErrorHolder`` object (that has the same interface as a
TestCase) is created to represent the error. If you are just using
the standard unittest test runner then this detail doesn't matter, but if you
are a framework author it may be relevant.
setUpClass and tearDownClass
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
These must be implemented as class methods:: >
import unittest
class Test(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        cls._connection = createExpensiveConnectionObject()
    @classmethod
    def tearDownClass(cls):
        cls._connection.destroy()
<
If you want the ``setUpClass`` and ``tearDownClass`` on base classes called
then you must call up to them yourself. The implementations in
TestCase are empty.
If an exception is raised during a ``setUpClass`` then the tests in the class
are not run and the ``tearDownClass`` is not run. Skipped classes will not
have ``setUpClass`` or ``tearDownClass`` run. If the exception is a
``SkipTest`` exception then the class will be reported as having been skipped
instead of as an error.
setUpModule and tearDownModule
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
These should be implemented as functions:: >
def setUpModule():
    createConnection()
def tearDownModule():
    closeConnection()
<
If an exception is raised in a ``setUpModule`` then none of the tests in the
module will be run and the ``tearDownModule`` will not be run. If the exception is a
``SkipTest`` exception then the module will be reported as having been skipped
instead of as an error.
Signal Handling
---------------
The -c/--catch command line option to unittest, along with the ``catchbreak``
parameter to unittest.main(), provide more friendly handling of
control-C during a test run. With catch break behavior enabled control-C will
allow the currently running test to complete, and the test run will then end
and report all the results so far. A second control-c will raise a
KeyboardInterrupt in the usual way.
The control-c signal handler attempts to remain compatible with code or
tests that install their own signal.SIGINT handler. If the ``unittest``
handler is called but {isn't} the installed signal.SIGINT handler,
i.e. it has been replaced by the system under test and delegated to, then it
calls the default handler. This will normally be the expected behavior by code
that replaces an installed handler and delegates to it. For individual tests
that need ``unittest`` control-c handling disabled the removeHandler
decorator can be used.
There are a few utility functions for framework authors to enable control-c
handling functionality within test frameworks.
installHandler()~
Install the control-c handler. When a signal.SIGINT is received
(usually in response to the user pressing control-c) all registered results
have TestResult.stop called.
.. versionadded:: 2.7
registerResult(result)~
Register a TestResult object for control-c handling. Registering a
result stores a weak reference to it, so it doesn't prevent the result from
being garbage collected.
Registering a TestResult object has no side-effects if control-c
handling is not enabled, so test frameworks can unconditionally register
all results they create independently of whether or not handling is enabled.
.. versionadded:: 2.7
removeResult(result)~
Remove a registered result. Once a result has been removed then
TestResult.stop will no longer be called on that result object in
response to a control-c.
.. versionadded:: 2.7
removeHandler(function=None)~
When called without arguments this function removes the control-c handler
if it has been installed. This function can also be used as a test decorator
to temporarily remove the handler whilst the test is being executed:: >
@unittest.removeHandler
def test_signal_handling(self):
    ...
<
.. versionadded:: 2.7
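A framework might combine these utility functions roughly as follows. The
helper name ``run_with_interrupt_support`` is hypothetical, and the sketch
assumes it is called from the main thread, since signal handlers can only be
installed there.

```python
import unittest


def run_with_interrupt_support(suite):
    """Run suite so that the first control-c stops it gracefully."""
    result = unittest.TestResult()
    unittest.installHandler()         # install the control-c handler
    unittest.registerResult(result)   # result.stop() is called on SIGINT
    try:
        suite.run(result)
    finally:
        unittest.removeResult(result)
        unittest.removeHandler()      # restore the previous SIGINT handler
    return result


class Demo(unittest.TestCase):
    def test_ok(self):
        pass


demo_result = run_with_interrupt_support(
    unittest.TestLoader().loadTestsFromTestCase(Demo))
```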
==============================================================================
*py2stdlib-urllib*
urllib~
:synopsis: Open an arbitrary network resource by URL (requires sockets).
.. note::
The urllib (|py2stdlib-urllib|) module has been split into parts and renamed in
Python 3.0 to urllib.request, urllib.parse,
and urllib.error. The 2to3 tool will automatically adapt
imports when converting your sources to 3.0.
Also note that the urllib.urlopen function has been removed in
Python 3.0 in favor of urllib2.urlopen.
This module provides a high-level interface for fetching data across the World
Wide Web. In particular, the urlopen function is similar to the
built-in function open, but accepts Universal Resource Locators (URLs)
instead of filenames. Some restrictions apply --- it can only open URLs for
reading, and no seek operations are available.
High-level interface
--------------------
urlopen(url[, data[, proxies]])~
Open a network object denoted by a URL for reading. If the URL does not have a
scheme identifier, or if it has file: as its scheme identifier, this
opens a local file (without universal newlines); otherwise it opens a socket to
a server somewhere on the network. If the connection cannot be made the
IOError exception is raised. If all went well, a file-like object is
returned. This supports the following methods: read, readline,
readlines, fileno, close, info, getcode and
geturl. It also has proper support for the iterator protocol. One
caveat: the read method, if the size argument is omitted or negative,
may not read until the end of the data stream; there is no good way to determine
that the entire stream from a socket has been read in the general case.
Except for the info, getcode and geturl methods,
these methods have the same interface as for file objects --- see section
bltin-file-objects in this manual. (It is not a built-in file object,
however, so it can't be used at those few places where a true built-in file
object is required.)
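The file-like interface can be exercised without network access by opening a
local file through a file: URL, as in this sketch. The helper name
``read_local`` is made up, and the import fallback to urllib.request is an
assumption added only so the example also runs on Python 3.

```python
import os

try:
    from urllib import urlopen, pathname2url            # Python 2 (this module)
except ImportError:
    from urllib.request import urlopen, pathname2url    # Python 3 location


def read_local(path):
    """Return (body, content_length_header) for a local file."""
    url = 'file:' + pathname2url(os.path.abspath(path))
    response = urlopen(url)
    try:
        body = response.read()
        # info() returns the headers described in the following paragraphs.
        length = response.info()['Content-Length']
    finally:
        response.close()
    return body, length
```

For example, calling ``read_local`` on an existing file returns its contents
together with the Content-Length header that urlopen synthesizes for local
files.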
The info method returns an instance of the class
mimetools.Message containing meta-information associated with the
URL. When the method is HTTP, these headers are those returned by the server
at the head of the retrieved HTML page (including Content-Length and
Content-Type). When the method is FTP, a Content-Length header will be
present if (as is now usual) the server passed back a file length in response
to the FTP retrieval request. A Content-Type header will be present if the
MIME type can be guessed. When the method is local-file, returned headers
will include a Date representing the file's last-modified time, a
Content-Length giving file size, and a Content-Type containing a guess at the
file's type. See also the description of the mimetools (|py2stdlib-mimetools|) module.
The geturl method returns the real URL of the page. In some cases, the
HTTP server redirects a client to another URL. The urlopen function
handles this transparently, but in some cases the caller needs to know which URL
the client was redirected to. The geturl method can be used to get at
this redirected URL.
The getcode method returns the HTTP status code that was sent with the
response, or ``None`` if the URL is not an HTTP URL.
If the {url} uses the http: scheme identifier, the optional {data}
argument may be given to specify a ``POST`` request (normally the request type
is ``GET``). The {data} argument must be in standard
application/x-www-form-urlencoded format; see the urlencode
function below.
The urlopen function works transparently with proxies which do not
require authentication. In a Unix or Windows environment, set the
http_proxy or ftp_proxy environment variables to a URL that
identifies the proxy server before starting the Python interpreter. For example
(the ``'%'`` is the command prompt):: >
% http_proxy="http://www.someproxy.com:3128"
% export http_proxy
% python
...
<
The no_proxy environment variable can be used to specify hosts which
shouldn't be reached via proxy; if set, it should be a comma-separated list
of hostname suffixes, optionally with ``:port`` appended, for example
``cern.ch,ncsa.uiuc.edu,some.host:8080``.
In a Windows environment, if no proxy environment variables are set, proxy
settings are obtained from the registry's Internet Settings section.
In a Mac OS X environment, urlopen will retrieve proxy information
from the OS X System Configuration Framework, which can be managed with
the Network System Preferences panel.
Alternatively, the optional {proxies} argument may be used to explicitly specify
proxies. It must be a dictionary mapping scheme names to proxy URLs, where an
empty dictionary causes no proxies to be used, and ``None`` (the default value)
causes environmental proxy settings to be used as discussed above. For
example:: >
# Use http://www.someproxy.com:3128 for http proxying
proxies = {'http': 'http://www.someproxy.com:3128'}
filehandle = urllib.urlopen(some_url, proxies=proxies)
# Don't use any proxies
filehandle = urllib.urlopen(some_url, proxies={})
# Use proxies from environment - both versions are equivalent
filehandle = urllib.urlopen(some_url, proxies=None)
filehandle = urllib.urlopen(some_url)
<
Proxies which require authentication for use are not currently supported; this
is considered an implementation limitation.
.. versionchanged:: 2.3
Added the {proxies} support.
.. versionchanged:: 2.6
Added getcode to returned object and support for the
no_proxy environment variable.
.. deprecated:: 2.6
The urlopen function has been removed in Python 3.0 in favor
of urllib2.urlopen.
urlretrieve(url[, filename[, reporthook[, data]]])~
Copy a network object denoted by a URL to a local file, if necessary. If the URL
points to a local file, or a valid cached copy of the object exists, the object
is not copied. Return a tuple ``(filename, headers)`` where {filename} is the
local file name under which the object can be found, and {headers} is whatever
the info method of the object returned by urlopen returned (for
a remote object, possibly cached). Exceptions are the same as for
urlopen.
The second argument, if present, specifies the file location to copy to (if
absent, the location will be a tempfile with a generated name). The third
argument, if present, is a hook function that will be called once on
establishment of the network connection and once after each block read
thereafter. The hook will be passed three arguments; a count of blocks
transferred so far, a block size in bytes, and the total size of the file. The
third argument may be ``-1`` on older FTP servers which do not return a file
size in response to a retrieval request.
If the {url} uses the http: scheme identifier, the optional {data}
argument may be given to specify a ``POST`` request (normally the request type
is ``GET``). The {data} argument must be in standard
application/x-www-form-urlencoded format; see the urlencode
function below.
.. versionchanged:: 2.5
urlretrieve will raise ContentTooShortError when it detects that
the amount of data available was less than the expected amount (which is the
size reported by a {Content-Length} header). This can occur, for example, when
the download is interrupted.
The {Content-Length} is treated as a lower bound: if there's more data to read,
urlretrieve reads more data, but if less data is available, it raises the
exception.
You can still retrieve the downloaded data in this case, it is stored in the
content attribute of the exception instance.
If no {Content-Length} header was supplied, urlretrieve can not check the size
of the data it has downloaded, and just returns it. In this case you just have
to assume that the download was successful.
_urlopener~
The public functions urlopen and urlretrieve create an instance
of the FancyURLopener class and use it to perform their requested
actions. To override this functionality, programmers can create a subclass of
URLopener or FancyURLopener, then assign an instance of that
class to the ``urllib._urlopener`` variable before calling the desired function.
For example, applications may want to specify a different
User-Agent header than URLopener defines. This can be
accomplished with the following code:: >
import urllib
class AppURLopener(urllib.FancyURLopener):
    version = "App/1.7"
urllib._urlopener = AppURLopener()
<
urlcleanup()~
Clear the cache that may have been built up by previous calls to
urlretrieve.
Utility functions
-----------------
quote(string[, safe])~
Replace special characters in {string} using the ``%xx`` escape. Letters,
digits, and the characters ``'_.-'`` are never quoted. By default, this
function is intended for quoting the path section of the URL. The optional
{safe} parameter specifies additional characters that should not be quoted
--- its default value is ``'/'``.
Example: ``quote('/~connolly/')`` yields ``'/%7Econnolly/'``.
quote_plus(string[, safe])~
Like quote, but also replaces spaces by plus signs, as required for
quoting HTML form values when building up a query string to go into a URL.
Plus signs in the original string are escaped unless they are included in
{safe}. It also does not have {safe} default to ``'/'``.
unquote(string)~
Replace ``%xx`` escapes by their single-character equivalent.
Example: ``unquote('/%7Econnolly/')`` yields ``'/~connolly/'``.
unquote_plus(string)~
Like unquote, but also replaces plus signs by spaces, as required for
unquoting HTML form values.
urlencode(query[, doseq])~
Convert a mapping object or a sequence of two-element tuples to a
"url-encoded" string, suitable to pass to urlopen above as the
optional {data} argument. This is useful to pass a dictionary of form
fields to a ``POST`` request. The resulting string is a series of
``key=value`` pairs separated by ``'&'`` characters, where both {key} and
{value} are quoted using quote_plus above. When a sequence of
two-element tuples is used as the {query} argument, the first element of
each tuple is a key and the second is a value. The value element in itself
can be a sequence and in that case, if the optional parameter {doseq}
evaluates to true, individual ``key=value`` pairs separated by ``'&'`` are
generated for each element of the value sequence for the key. The order of
parameters in the encoded string will match the order of parameter tuples in
the sequence. The urlparse (|py2stdlib-urlparse|) module provides the functions
parse_qs and parse_qsl which are used to parse query strings
into Python data structures.
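The quoting functions and urlencode can be compared side by side in a short
sketch. The sample strings are arbitrary, and the import fallback to
urllib.parse is an assumption added only so the example also runs on
Python 3.

```python
try:
    from urllib import (quote, quote_plus, unquote,
                        unquote_plus, urlencode)          # Python 2 (this module)
except ImportError:
    from urllib.parse import (quote, quote_plus, unquote,
                              unquote_plus, urlencode)    # Python 3 location

# quote leaves '/' alone by default; quote_plus does not.
path = quote('/docs/a b.txt')
no_safe = quote('a/b', safe='')
form_value = quote_plus('a b&c=d')

# The unquote functions reverse the escapes.
plain = unquote('/%7Econnolly/')
form_plain = unquote_plus(form_value)

# A sequence of pairs keeps its order in the encoded string.
query = urlencode([('q', 'a b'), ('page', 2)])

# With doseq true, each element of a sequence value gets its own pair.
multi = urlencode([('tag', ['x', 'y'])], doseq=True)
```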
pathname2url(path)~
Convert the pathname {path} from the local syntax for a path to the form used in
the path component of a URL. This does not produce a complete URL. The return
value will already be quoted using the quote function.
url2pathname(path)~
Convert the path component {path} from an encoded URL to the local syntax for a
path. This does not accept a complete URL. This function uses unquote
to decode {path}.
getproxies()~
This helper function returns a dictionary of scheme to proxy server URL
mappings. It scans the environment for variables named ``<scheme>_proxy``
for all operating systems first, and when it cannot find it, looks for proxy
information from Mac OS X System Configuration for Mac OS X and Windows
Systems Registry for Windows.
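The environment lookup can be observed with a small sketch that temporarily
sets a ``<scheme>_proxy`` variable. The helper name ``proxies_with_env`` is
hypothetical, the proxy URL is the placeholder used earlier in this section,
and the import fallback is an assumption so the example also runs on
Python 3.

```python
import os

try:
    from urllib import getproxies            # Python 2 (this module)
except ImportError:
    from urllib.request import getproxies    # Python 3 location


def proxies_with_env(name, value):
    """Return getproxies() as seen with os.environ[name] set to value."""
    saved = os.environ.get(name)
    os.environ[name] = value
    try:
        return getproxies()
    finally:
        # Put the environment back the way it was.
        if saved is None:
            del os.environ[name]
        else:
            os.environ[name] = saved


found = proxies_with_env('http_proxy', 'http://www.someproxy.com:3128')
```

The returned dictionary maps the scheme name (``'http'`` here) to the proxy
URL taken from the environment.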
URL Opener objects
------------------
URLopener([proxies[, **x509]])~
Base class for opening and reading URLs. Unless you need to support opening
objects using schemes other than http:, ftp:, or file:,
you probably want to use FancyURLopener.
By default, the URLopener class sends a User-Agent header
of ``urllib/VVV``, where {VVV} is the urllib (|py2stdlib-urllib|) version number.
Applications can define their own User-Agent header by subclassing
URLopener or FancyURLopener and setting the class attribute
version to an appropriate string value in the subclass definition.
The optional {proxies} parameter should be a dictionary mapping scheme names to
proxy URLs, where an empty dictionary turns proxies off completely. Its default
value is ``None``, in which case environmental proxy settings will be used if
present, as discussed in the definition of urlopen, above.
Additional keyword parameters, collected in {x509}, may be used for
authentication of the client when using the https: scheme. The keywords
{key_file} and {cert_file} are supported to provide an SSL key and certificate;
both are needed to support client authentication.
URLopener objects will raise an IOError exception if the server
returns an error code.
open(fullurl[, data])~
Open {fullurl} using the appropriate protocol. This method sets up cache and
proxy information, then calls the appropriate open method with its input
arguments. If the scheme is not recognized, open_unknown is called.
The {data} argument has the same meaning as the {data} argument of
urlopen.
open_unknown(fullurl[, data])~
Overridable interface to open unknown URL types.
retrieve(url[, filename[, reporthook[, data]]])~
Retrieves the contents of {url} and places it in {filename}. The return value
is a tuple consisting of a local filename and either a
mimetools.Message object containing the response headers (for remote
URLs) or ``None`` (for local URLs). The caller must then open and read the
contents of {filename}. If {filename} is not given and the URL refers to a
local file, the input filename is returned. If the URL is non-local and
{filename} is not given, the filename is the output of tempfile.mktemp
with a suffix that matches the suffix of the last path component of the input
URL. If {reporthook} is given, it must be a function accepting three numeric
parameters. It will be called after each chunk of data is read from the
network. {reporthook} is ignored for local URLs.
If the {url} uses the http: scheme identifier, the optional {data}
argument may be given to specify a ``POST`` request (normally the request type
is ``GET``). The {data} argument must be in standard
application/x-www-form-urlencoded format; see the urlencode
function above.
version~
Variable that specifies the user agent of the opener object. To get
urllib (|py2stdlib-urllib|) to tell servers that it is a particular user agent, set this in a
subclass as a class variable or in the constructor before calling the base
constructor.
FancyURLopener(...)~
FancyURLopener subclasses URLopener providing default handling
for the following HTTP response codes: 301, 302, 303, 307 and 401. For the 30x
response codes listed above, the Location header is used to fetch
the actual URL. For 401 response codes (authentication required), basic HTTP
authentication is performed. For the 30x response codes, recursion is bounded
by the value of the {maxtries} attribute, which defaults to 10.
For all other response codes, the method http_error_default is called
which you can override in subclasses to handle the error appropriately.
.. note:: >
According to the letter of RFC 2616, 301 and 302 responses to POST requests
must not be automatically redirected without confirmation by the user. In
reality, browsers do allow automatic redirection of these responses, changing
the POST to a GET, and urllib (|py2stdlib-urllib|) reproduces this behaviour.
<
The parameters to the constructor are the same as those for URLopener.
.. note:: >
When performing basic authentication, a FancyURLopener instance calls
its prompt_user_passwd method. The default implementation asks the
user for the required information on the controlling terminal. A subclass may
override this method to support more appropriate behavior if needed.
The FancyURLopener class offers one additional method that should be
overloaded to provide the appropriate behavior:
<
prompt_user_passwd(host, realm)~
Return information needed to authenticate the user at the given host in the
specified security realm. The return value should be a tuple, ``(user,
password)``, which can be used for basic authentication.
The implementation prompts for this information on the terminal; an application
should override this method to use an appropriate interaction model in the local
environment.
ContentTooShortError(msg[, content])~
This exception is raised when the urlretrieve function detects that the
amount of the downloaded data is less than the expected amount (given by the
{Content-Length} header). The content attribute stores the downloaded
(and supposedly truncated) data.
.. versionadded:: 2.5
urllib (|py2stdlib-urllib|) Restrictions
--------------------------
* Currently, only the following protocols are supported: HTTP (versions 0.9 and
1.0), FTP, and local files.
* The caching feature of urlretrieve has been disabled until proper
processing of expiration time headers is implemented.
* There should be a function to query whether a particular URL is in the cache.
* For backward compatibility, if a URL appears to point to a local file but the
file can't be opened, the URL is re-interpreted using the FTP protocol. This
can sometimes cause confusing error messages.
* The urlopen and urlretrieve functions can cause arbitrarily
long delays while waiting for a network connection to be set up. This means
that it is difficult to build an interactive Web client using these functions
without using threads.
* The data returned by urlopen or urlretrieve is the raw data
returned by the server. This may be binary data (such as an image), plain text
or (for example) HTML. The HTTP protocol provides type information in the reply
header, which can be inspected by looking at the Content-Type
header. If the returned data is HTML, you can use the module htmllib (|py2stdlib-htmllib|) to
parse it.
* The code handling the FTP protocol cannot differentiate between a file and a
directory. This can lead to unexpected behavior when attempting to read a URL
that points to a file that is not accessible. If the URL ends in a ``/``, it is
assumed to refer to a directory and will be handled accordingly. But if an
attempt to read a file leads to a 550 error (meaning the URL cannot be found or
is not accessible, often for permission reasons), then the path is treated as a
directory in order to handle the case when a directory is specified by a URL but
the trailing ``/`` has been left off. This can cause misleading results when
you try to fetch a file whose read permissions make it inaccessible; the FTP
code will try to read it, fail with a 550 error, and then perform a directory
listing for the unreadable file. If fine-grained control is needed, consider
using the ftplib (|py2stdlib-ftplib|) module, subclassing FancyURLopener, or changing
{_urlopener} to meet your needs.
* This module does not support the use of proxies which require authentication.
This may be implemented in the future.
* Although the urllib (|py2stdlib-urllib|) module contains (undocumented) routines to parse
and unparse URL strings, the recommended interface for URL manipulation is in
module urlparse (|py2stdlib-urlparse|).
Examples
--------
Here is an example session that uses the ``GET`` method to retrieve a URL
containing parameters:: >
>>> import urllib
>>> params = urllib.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
>>> f = urllib.urlopen("http://www.musi-cal.com/cgi-bin/query?%s" % params)
>>> print f.read()
<
The following example uses the ``POST`` method instead:: >
>>> import urllib
>>> params = urllib.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
>>> f = urllib.urlopen("http://www.musi-cal.com/cgi-bin/query", params)
>>> print f.read()
<
The following example uses an explicitly specified HTTP proxy, overriding
environment settings:: >
>>> import urllib
>>> proxies = {'http': 'http://proxy.example.com:8080/'}
>>> opener = urllib.FancyURLopener(proxies)
>>> f = opener.open("http://www.python.org")
>>> f.read()
<
The following example uses no proxies at all, overriding environment settings:: >
>>> import urllib
>>> opener = urllib.FancyURLopener({})
>>> f = opener.open("http://www.python.org/")
>>> f.read()
<
==============================================================================
*py2stdlib-urllib2*
urllib2~
:synopsis: Next generation URL opening library.
.. note::
The urllib2 (|py2stdlib-urllib2|) module has been split across several modules in
Python 3.0 named urllib.request and urllib.error.
The 2to3 tool will automatically adapt imports when converting
your sources to 3.0.
The urllib2 (|py2stdlib-urllib2|) module defines functions and classes which help in opening
URLs (mostly HTTP) in a complex world --- basic and digest authentication,
redirections, cookies and more.
The urllib2 (|py2stdlib-urllib2|) module defines the following functions:
urlopen(url[, data][, timeout])~
Open the URL {url}, which can be either a string or a Request object.
{data} may be a string specifying additional data to send to the server, or
``None`` if no such data is needed. Currently HTTP requests are the only ones
that use {data}; the HTTP request will be a POST instead of a GET when the
{data} parameter is provided. {data} should be a buffer in the standard
application/x-www-form-urlencoded format. The
urllib.urlencode function takes a mapping or sequence of 2-tuples and
returns a string in this format.
The optional {timeout} parameter specifies a timeout in seconds for blocking
operations like the connection attempt (if not specified, the global default
timeout setting will be used). This actually only works for HTTP, HTTPS,
FTP and FTPS connections.
This function returns a file-like object with two additional methods:
* geturl --- return the URL of the resource retrieved, commonly used to
determine if a redirect was followed
* info --- return the meta-information of the page, such as headers,
in the form of a mimetools.Message instance
(see `Quick Reference to HTTP Headers <http://www.cs.tut.fi/~jkorpela/http.html>`_)
Raises URLError on errors.
Note that ``None`` may be returned if no handler handles the request (though the
default installed global OpenerDirector uses UnknownHandler to
ensure this never happens).
In addition, the default installed ProxyHandler makes sure that requests
are sent through a proxy when one is configured.
.. versionchanged:: 2.6
{timeout} was added.
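The {data} format mentioned above can be produced with urlencode. The
following is a minimal sketch with made-up parameter names; the ``except``
branch is an assumption added so that the snippet also runs on Python 3, where
the function lives in urllib.parse.

```python
try:
    from urllib import urlencode        # Python 2, as documented here
except ImportError:
    from urllib.parse import urlencode  # assumption: Python 3 fallback

# A sequence of 2-tuples keeps the parameter order predictable.
body = urlencode([('spam', 1), ('eggs', 2), ('note', 'a b&c')])
print(body)   # spam=1&eggs=2&note=a+b%26c
```

The resulting string could then be passed as {data}, for example
``urllib2.urlopen(url, body, timeout=10)``, which would issue a POST request.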
install_opener(opener)~
Install an OpenerDirector instance as the default global opener.
Installing an opener is only necessary if you want urlopen to use that opener;
otherwise, simply call OpenerDirector.open instead of urlopen.
The code does not check for a real OpenerDirector, and any class with
the appropriate interface will work.
build_opener([handler, ...])~
Return an OpenerDirector instance, which chains the handlers in the
order given. {handler}s can be either instances of BaseHandler, or
subclasses of BaseHandler (in which case it must be possible to call
the constructor without any parameters). Instances of the following classes
will be in front of the {handler}s, unless the {handler}s contain them,
instances of them or subclasses of them: ProxyHandler,
UnknownHandler, HTTPHandler, HTTPDefaultErrorHandler,
HTTPRedirectHandler, FTPHandler, FileHandler,
HTTPErrorProcessor.
If the Python installation has SSL support (i.e., if the ssl (|py2stdlib-ssl|) module can be imported),
HTTPSHandler will also be added.
Beginning in Python 2.3, a BaseHandler subclass may also change its
handler_order member variable to modify its position in the handlers
list.
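The replacement rule described above can be observed without any network
access. This sketch is only illustrative: QuietRedirectHandler is a
hypothetical subclass, the ``handlers`` attribute of the opener is an
implementation detail rather than documented API, and the ``except`` branch is
an assumption that lets the snippet run on Python 3.

```python
try:
    from urllib2 import (build_opener, HTTPCookieProcessor,
                         HTTPHandler, HTTPRedirectHandler)      # Python 2
except ImportError:
    from urllib.request import (build_opener, HTTPCookieProcessor,
                                HTTPHandler, HTTPRedirectHandler)  # Python 3 fallback


class QuietRedirectHandler(HTTPRedirectHandler):
    """Hypothetical subclass; it inherits all behaviour unchanged."""


# A class is instantiated by build_opener; an instance is used as given.
opener = build_opener(QuietRedirectHandler, HTTPCookieProcessor())

# The subclass takes the place of the default HTTPRedirectHandler.
redirectors = [h for h in opener.handlers if isinstance(h, HTTPRedirectHandler)]
# Defaults that were not overridden are still present.
has_default_http = any(isinstance(h, HTTPHandler) for h in opener.handlers)
has_cookies = any(isinstance(h, HTTPCookieProcessor) for h in opener.handlers)
print(len(redirectors))   # 1
```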
The following exceptions are raised as appropriate:
URLError~
The handlers raise this exception (or derived exceptions) when they run into a
problem. It is a subclass of IOError.
reason~
The reason for this error. It can be a message string or another exception
instance (socket.error for remote URLs, OSError for local
URLs).
HTTPError~
Though being an exception (a subclass of URLError), an HTTPError
can also function as a non-exceptional file-like return value (the same thing
that urlopen returns). This is useful when handling exotic HTTP
errors, such as requests for authentication.
code~
An HTTP status code as defined in `RFC 2616 <http://www.faqs.org/rfcs/rfc2616.html>`_.
This numeric value corresponds to a value found in the dictionary of
codes as found in BaseHTTPServer.BaseHTTPRequestHandler.responses.
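Because HTTPError is a subclass of URLError, code that distinguishes the two
has to test for HTTPError first. The sketch below shows one possible way to do
so; describe_failure is a hypothetical helper, the exceptions are constructed
by hand so that no network access is needed, and the ``except`` branch of the
import is an assumption for running on Python 3.

```python
import io

try:
    from urllib2 import HTTPError, URLError       # Python 2
except ImportError:
    from urllib.error import HTTPError, URLError  # Python 3 fallback


def describe_failure(exc):
    """Summarise an exception raised while opening a URL."""
    # Test for HTTPError first: it is a subclass of URLError.
    if isinstance(exc, HTTPError):
        return "HTTP status %d" % exc.code
    if isinstance(exc, URLError):
        return "failed to reach server: %s" % (exc.reason,)
    return "unexpected error: %r" % (exc,)


# Exceptions built by hand; in real use they would come from urlopen.
not_found = HTTPError('http://www.example.com/missing', 404, 'Not Found',
                      {}, io.BytesIO(b''))
unreachable = URLError('connection refused')
print(describe_failure(not_found))     # HTTP status 404
print(describe_failure(unreachable))   # failed to reach server: connection refused
```

In practice the helper would typically be called from an
``except URLError as exc:`` clause wrapped around urlopen.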
The following classes are provided:
Request(url[, data][, headers][, origin_req_host][, unverifiable])~
This class is an abstraction of a URL request.
{url} should be a string containing a valid URL.
{data} may be a string specifying additional data to send to the server, or
``None`` if no such data is needed. Currently HTTP requests are the only ones
that use {data}; the HTTP request will be a POST instead of a GET when the
{data} parameter is provided. {data} should be a buffer in the standard
application/x-www-form-urlencoded format. The
urllib.urlencode function takes a mapping or sequence of 2-tuples and
returns a string in this format.
{headers} should be a dictionary, and will be treated as if add_header
was called with each key and value as arguments. This is often used to "spoof"
the ``User-Agent`` header, which is used by a browser to identify itself --
some HTTP servers only allow requests coming from common browsers as opposed
to scripts. For example, Mozilla Firefox may identify itself as ``"Mozilla/5.0
(X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11"``, while urllib2 (|py2stdlib-urllib2|)'s
default user agent string is ``"Python-urllib/2.6"`` (on Python 2.6).
The final two arguments are only of interest for correct handling of third-party
HTTP cookies:
{origin_req_host} should be the request-host of the origin transaction, as
defined by RFC 2965. It defaults to ``cookielib.request_host(self)``. This
is the host name or IP address of the original request that was initiated by the
user. For example, if the request is for an image in an HTML document, this
should be the request-host of the request for the page containing the image.
{unverifiable} should indicate whether the request is unverifiable, as defined
by RFC 2965. It defaults to False. An unverifiable request is one whose URL
the user did not have the option to approve. For example, if the request is for
an image in an HTML document, and the user had no option to approve the
automatic fetching of the image, this should be true.
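A small sketch of how {data} and {headers} affect a Request, using a made-up
URL and header value. Nothing is sent over the network. The observation that
header names are stored in capitalized form reflects the CPython
implementation rather than anything stated above, and the ``except`` branch is
an assumption for running on Python 3.

```python
try:
    from urllib2 import Request          # Python 2
except ImportError:
    from urllib.request import Request   # Python 3 fallback

# Hypothetical URL and header values, for illustration only.
plain = Request('http://www.example.com/cgi-bin/query')
posted = Request('http://www.example.com/cgi-bin/query',
                 data=b'spam=1&eggs=2',
                 headers={'User-Agent': 'Mozilla/5.0'})

print(plain.get_method())    # GET: no data was supplied
print(posted.get_method())   # POST: data was supplied
# Header names are stored in capitalized form ("User-agent").
print(posted.get_header('User-agent'))   # Mozilla/5.0
```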
OpenerDirector()~
The OpenerDirector class opens URLs via BaseHandler instances chained
together. It manages the chaining of handlers, and recovery from errors.
BaseHandler()~
This is the base class for all registered handlers --- and handles only the
simple mechanics of registration.
HTTPDefaultErrorHandler()~
A class which defines a default handler for HTTP error responses; all responses
are turned into HTTPError exceptions.
HTTPRedirectHandler()~
A class to handle redirections.
HTTPCookieProcessor([cookiejar])~
A class to handle HTTP Cookies.
ProxyHandler([proxies])~
Cause requests to go through a proxy. If {proxies} is given, it must be a
dictionary mapping protocol names to URLs of proxies. The default is to read
the list of proxies from the environment variables
<protocol>_proxy. If no proxy environment variables are set, in a
Windows environment, proxy settings are obtained from the registry's
Internet Settings section and in a Mac OS X environment, proxy information
is retrieved from the OS X System Configuration Framework.
To disable autodetected proxies, pass an empty dictionary.
HTTPPasswordMgr()~
Keep a database of ``(realm, uri) -> (user, password)`` mappings.
HTTPPasswordMgrWithDefaultRealm()~
Keep a database of ``(realm, uri) -> (user, password)`` mappings. A realm of
``None`` is considered a catch-all realm, which is searched if no other realm
fits.
AbstractBasicAuthHandler([password_mgr])~
This is a mixin class that helps with HTTP authentication, both to the remote
host and to a proxy. {password_mgr}, if given, should be something that is
compatible with HTTPPasswordMgr; refer to section
http-password-mgr for information on the interface that must be
supported.
HTTPBasicAuthHandler([password_mgr])~
Handle authentication with the remote host. {password_mgr}, if given, should be
something that is compatible with HTTPPasswordMgr; refer to section
http-password-mgr for information on the interface that must be
supported.
ProxyBasicAuthHandler([password_mgr])~
Handle authentication with the proxy. {password_mgr}, if given, should be
something that is compatible with HTTPPasswordMgr; refer to section
http-password-mgr for information on the interface that must be
supported.
AbstractDigestAuthHandler([password_mgr])~
This is a mixin class that helps with HTTP authentication, both to the remote
host and to a proxy. {password_mgr}, if given, should be something that is
compatible with HTTPPasswordMgr; refer to section
http-password-mgr for information on the interface that must be
supported.
HTTPDigestAuthHandler([password_mgr])~
Handle authentication with the remote host. {password_mgr}, if given, should be
something that is compatible with HTTPPasswordMgr; refer to section
http-password-mgr for information on the interface that must be
supported.
ProxyDigestAuthHandler([password_mgr])~
Handle authentication with the proxy. {password_mgr}, if given, should be
something that is compatible with HTTPPasswordMgr; refer to section
http-password-mgr for information on the interface that must be
supported.
HTTPHandler()~
A class to handle opening of HTTP URLs.
HTTPSHandler()~
A class to handle opening of HTTPS URLs.
FileHandler()~
Open local files.
FTPHandler()~
Open FTP URLs.
CacheFTPHandler()~
Open FTP URLs, keeping a cache of open FTP connections to minimize delays.
UnknownHandler()~
A catch-all class to handle unknown URLs.
Request Objects
---------------
The following methods describe all of Request's public interface, and
so all must be overridden in subclasses.
Request.add_data(data)~
Set the Request data to {data}. This is ignored by all handlers except
HTTP handlers --- and there it should be a byte string, and will change the
request to be ``POST`` rather than ``GET``.
Request.get_method()~
Return a string indicating the HTTP request method. This is only meaningful for
HTTP requests, and currently always returns ``'GET'`` or ``'POST'``.
Request.has_data()~
Return whether the instance has non-``None`` data.
Request.get_data()~
Return the instance's data.
Request.add_header(key, val)~
Add another header to the request. Headers are currently ignored by all
handlers except HTTP handlers, where they are added to the list of headers sent
to the server. Note that there cannot be more than one header with the same
name, and later calls will overwrite previous calls in case the {key} collides.
Currently, this is no loss of HTTP functionality, since all headers which have
meaning when used more than once have a (header-specific) way of gaining the
same functionality using only one header.
Request.add_unredirected_header(key, header)~
Add a header that will not be added to a redirected request.
.. versionadded:: 2.4
Request.has_header(header)~
Return whether the instance has the named header (checks both regular and
unredirected).
.. versionadded:: 2.4
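The header methods above can be tried without opening a connection. In this
sketch the URLs and the Authorization value are placeholders, get_header is
used only to read a value back, and the ``except`` branch is an assumption for
running on Python 3.

```python
try:
    from urllib2 import Request          # Python 2
except ImportError:
    from urllib.request import Request   # Python 3 fallback

req = Request('http://www.example.com/')
req.add_header('Referer', 'http://www.python.org/')
req.add_header('Referer', 'http://docs.python.org/')   # replaces the first value
req.add_unredirected_header('Authorization', 'Basic ZXhhbXBsZQ==')

print(req.get_header('Referer'))         # http://docs.python.org/
print(req.has_header('Authorization'))   # True: unredirected headers count too
print(req.has_header('Cookie'))          # False: never added
```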
Request.get_full_url()~
Return the URL given in the constructor.
Request.get_type()~
Return the type of the URL --- also known as the scheme.
Request.get_host()~
Return the host to which a connection will be made.
Request.get_selector()~
Return the selector --- the part of the URL that is sent to the server.
Request.set_proxy(host, type)~
Prepare the request by connecting to a proxy server. The {host} and {type} will
replace those of the instance, and the instance's selector will be the original
URL given in the constructor.
Request.get_origin_req_host()~
Return the request-host of the origin transaction, as defined by RFC 2965.
See the documentation for the Request constructor.
Request.is_unverifiable()~
Return whether the request is unverifiable, as defined by RFC 2965. See the
documentation for the Request constructor.
OpenerDirector Objects
----------------------
OpenerDirector instances have the following methods:
OpenerDirector.add_handler(handler)~
{handler} should be an instance of BaseHandler. The following
methods are searched, and added to the possible chains (note that HTTP errors
are a special case).
* {protocol}_open --- signal that the handler knows how to open
{protocol} URLs.
* http_error_{type} --- signal that the handler knows how to handle
HTTP errors with HTTP error code {type}.
* {protocol}_error --- signal that the handler knows how to handle
errors from (non-``http``) {protocol}.
* {protocol}_request --- signal that the handler knows how to
pre-process {protocol} requests.
* {protocol}_response --- signal that the handler knows how to
post-process {protocol} responses.
OpenerDirector.open(url[, data][, timeout])~
Open the given {url} (which can be a request object or a string), optionally
passing the given {data}. Arguments, return values and exceptions raised are
the same as those of urlopen (which simply calls the open
method on the currently installed global OpenerDirector). The
optional {timeout} parameter specifies a timeout in seconds for blocking
operations like the connection attempt (if not specified, the global default
timeout setting will be used). The timeout feature actually works only for
HTTP, HTTPS, FTP and FTPS connections.
.. versionchanged:: 2.6
{timeout} was added.
OpenerDirector.error(proto[, arg[, ...]])~
Handle an error of the given protocol. This will call the registered error
handlers for the given protocol with the given arguments (which are protocol
specific). The HTTP protocol is a special case which uses the HTTP response
code to determine the specific error handler; refer to the http_error_*
methods of the handler classes.
Return values and exceptions raised are the same as those of urlopen.
OpenerDirector objects open URLs in three stages:
The order in which these methods are called within each stage is determined by
sorting the handler instances.
1. Every handler with a method named like {protocol}_request has that
method called to pre-process the request.
2. Handlers with a method named like {protocol}_open are called to handle
the request. This stage ends when a handler either returns a non-``None``
value (i.e. a response), or raises an exception (usually URLError).
Exceptions are allowed to propagate.
In fact, the above algorithm is first tried for methods named
default_open. If all such methods return ``None``, the
algorithm is repeated for methods named like {protocol}_open. If all
such methods return ``None``, the algorithm is repeated for methods
named unknown_open.
Note that the implementation of these methods may involve calls of the parent
OpenerDirector instance's open and error methods.
3. Every handler with a method named like {protocol}_response has that
method called to post-process the response.
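The three stages can be made visible with a pair of toy handlers. This is a
sketch under stated assumptions: the "mem" scheme, MemProcessor and MemHandler
are invented for the example, the response is served from memory so no network
access takes place, and the ``except`` branch is an assumption for running on
Python 3.

```python
import io

try:
    from urllib2 import BaseHandler, build_opener          # Python 2
    from urllib import addinfourl
except ImportError:
    from urllib.request import BaseHandler, build_opener   # Python 3 fallback
    from urllib.response import addinfourl

calls = []   # records the order in which the stages run


class MemProcessor(BaseHandler):
    """Pre- and post-processes requests for the made-up "mem" scheme."""

    def mem_request(self, req):
        calls.append('request')
        return req

    def mem_response(self, req, response):
        calls.append('response')
        return response


class MemHandler(BaseHandler):
    """Answers "mem:" URLs from memory instead of contacting a server."""

    def mem_open(self, req):
        calls.append('open')
        return addinfourl(io.BytesIO(b'hello from memory'), {},
                          req.get_full_url())


opener = build_opener(MemProcessor, MemHandler)
body = opener.open('mem://greeting').read()
print(calls)   # ['request', 'open', 'response']
```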
BaseHandler Objects
-------------------
BaseHandler objects provide a couple of methods that are directly
useful, and others that are meant to be used by derived classes. These are
intended for direct use:
BaseHandler.add_parent(director)~
Add a director as parent.
BaseHandler.close()~
Remove any parents.
The following members and methods should only be used by classes derived from
BaseHandler.
.. note::
The convention has been adopted that subclasses defining
protocol_request or protocol_response methods are named
``*Processor``; all others are named ``*Handler``.
BaseHandler.parent~
A valid OpenerDirector, which can be used to open using a different
protocol, or handle errors.
BaseHandler.default_open(req)~
This method is {not} defined in BaseHandler, but subclasses should
define it if they want to catch all URLs.
This method, if implemented, will be called by the parent
OpenerDirector. It should return a file-like object as described in
the return value of the open method of OpenerDirector, or ``None``.
It should raise URLError, unless a truly exceptional thing happens (for
example, MemoryError should not be mapped to URLError).
This method will be called before any protocol-specific open method.
BaseHandler.protocol_open(req)~
("protocol" is to be replaced by the protocol name.)
This method is {not} defined in BaseHandler, but subclasses should
define it if they want to handle URLs with the given {protocol}.
This method, if defined, will be called by the parent OpenerDirector.
Return values should be the same as for default_open.
BaseHandler.unknown_open(req)~
This method is {not} defined in BaseHandler, but subclasses should
define it if they want to catch all URLs with no specific registered handler to
open it.
This method, if implemented, will be called by the parent
OpenerDirector. Return values should be the same as for
default_open.
BaseHandler.http_error_default(req, fp, code, msg, hdrs)~
This method is {not} defined in BaseHandler, but subclasses should
override it if they intend to provide a catch-all for otherwise unhandled HTTP
errors. It will be called automatically by the OpenerDirector getting
the error, and should not normally be called in other circumstances.
{req} will be a Request object, {fp} will be a file-like object with
the HTTP error body, {code} will be the three-digit code of the error, {msg}
will be the user-visible explanation of the code and {hdrs} will be a mapping
object with the headers of the error.
Return values and exceptions raised should be the same as those of
urlopen.
BaseHandler.http_error_nnn(req, fp, code, msg, hdrs)~
{nnn} should be a three-digit HTTP error code. This method is also not defined
in BaseHandler, but will be called, if it exists, on an instance of a
subclass, when an HTTP error with code {nnn} occurs.
Subclasses should override this method to handle specific HTTP errors.
Arguments, return values and exceptions raised should be the same as for
http_error_default.
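The dispatch to http_error_nnn can be exercised directly through
OpenerDirector.error. The sketch below is illustrative only:
NotFoundFallbackHandler is a hypothetical handler, the 404 and 500 responses
are simulated rather than received from a server, and the ``except`` branch is
an assumption for running on Python 3.

```python
import io

try:
    from urllib2 import BaseHandler, OpenerDirector, Request          # Python 2
    from urllib import addinfourl
except ImportError:
    from urllib.request import BaseHandler, OpenerDirector, Request   # Python 3 fallback
    from urllib.response import addinfourl


class NotFoundFallbackHandler(BaseHandler):
    """Hypothetical handler that substitutes a canned page for 404 errors."""

    def http_error_404(self, req, fp, code, msg, hdrs):
        return addinfourl(io.BytesIO(b'fallback page'), hdrs,
                          req.get_full_url())


director = OpenerDirector()
director.add_handler(NotFoundFallbackHandler())

req = Request('http://www.example.com/missing')
# Simulate the dispatch that follows a 404 response; no request is sent.
handled = director.error('http', req, io.BytesIO(b''), 404, 'Not Found', {})
# No handler is registered for 500 in this bare director, so the result is None.
unhandled = director.error('http', req, io.BytesIO(b''), 500,
                           'Server Error', {})
page = handled.read()
print(page)
```

An opener created with build_opener would additionally contain
HTTPDefaultErrorHandler, so an unhandled code would raise HTTPError there
instead of yielding ``None``.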
BaseHandler.protocol_request(req)~
("protocol" is to be replaced by the protocol name.)
This method is {not} defined in BaseHandler, but subclasses should
define it if they want to pre-process requests of the given {protocol}.
This method, if defined, will be called by the parent OpenerDirector.
{req} will be a Request object. The return value should be a
Request object.
BaseHandler.protocol_response(req, response)~
("protocol" is to be replaced by the protocol name.)
This method is {not} defined in BaseHandler, but subclasses should
define it if they want to post-process responses of the given {protocol}.
This method, if defined, will be called by the parent OpenerDirector.
{req} will be a Request object. {response} will be an object
implementing the same interface as the return value of urlopen. The
return value should implement the same interface as the return value of
urlopen.
HTTPRedirectHandler Objects
---------------------------
.. note::
Some HTTP redirections require action from this module's client code. If this
is the case, HTTPError is raised. See RFC 2616 for details of the
precise meanings of the various redirection codes.
HTTPRedirectHandler.redirect_request(req, fp, code, msg, hdrs, newurl)~
Return a Request or ``None`` in response to a redirect. This is called
by the default implementations of the http_error_30* methods when a
redirection is received from the server. If a redirection should take place,
return a new Request to allow http_error_30* to perform the
redirect to {newurl}. Otherwise, raise HTTPError if no other handler
should try to handle this URL, or return ``None`` if you can't but another
handler might.
.. note:: >
The default implementation of this method does not strictly follow RFC 2616,
which says that 301 and 302 responses to ``POST`` requests must not be
automatically redirected without confirmation by the user. In reality, browsers
do allow automatic redirection of these responses, changing the POST to a
``GET``, and the default implementation reproduces this behavior.
<
HTTPRedirectHandler.http_error_301(req, fp, code, msg, hdrs)~
Redirect to the ``Location:`` or ``URI:`` URL. This method is called by the
parent OpenerDirector when getting an HTTP 'moved permanently' response.
HTTPRedirectHandler.http_error_302(req, fp, code, msg, hdrs)~
The same as http_error_301, but called for the 'found' response.
HTTPRedirectHandler.http_error_303(req, fp, code, msg, hdrs)~
The same as http_error_301, but called for the 'see other' response.
HTTPRedirectHandler.http_error_307(req, fp, code, msg, hdrs)~
The same as http_error_301, but called for the 'temporary redirect'
response.
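One way to customise redirection is to override redirect_request. The sketch
below follows redirects only when the target stays on the original host;
SameHostRedirectHandler and its policy are hypothetical, the method is called
directly so that nothing is sent over the network, and the ``except`` branch
is an assumption for running on Python 3.

```python
try:
    from urllib2 import HTTPRedirectHandler, Request          # Python 2
    from urlparse import urlparse
except ImportError:
    from urllib.request import HTTPRedirectHandler, Request   # Python 3 fallback
    from urllib.parse import urlparse


class SameHostRedirectHandler(HTTPRedirectHandler):
    """Hypothetical policy: follow redirects only within the original host."""

    def redirect_request(self, req, fp, code, msg, hdrs, newurl):
        old_host = urlparse(req.get_full_url()).netloc
        if urlparse(newurl).netloc != old_host:
            # Decline; another handler might still deal with the response.
            return None
        return HTTPRedirectHandler.redirect_request(
            self, req, fp, code, msg, hdrs, newurl)


handler = SameHostRedirectHandler()
req = Request('http://www.example.com/old')
# Call the method directly to show its decisions; no request is sent.
same = handler.redirect_request(req, None, 302, 'Found', {},
                                'http://www.example.com/new')
other = handler.redirect_request(req, None, 302, 'Found', {},
                                 'http://other.example.org/new')
print(same.get_full_url())   # http://www.example.com/new
print(other)                 # None
```

Such a handler could be passed to build_opener, where it would replace the
default HTTPRedirectHandler.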
HTTPCookieProcessor Objects
---------------------------
.. versionadded:: 2.4
HTTPCookieProcessor instances have one attribute:
HTTPCookieProcessor.cookiejar~
The cookielib.CookieJar in which cookies are stored.
ProxyHandler Objects
--------------------
ProxyHandler.protocol_open(request)~
("protocol" is to be replaced by the protocol name.)
The ProxyHandler will have a method {protocol}_open for every
{protocol} which has a proxy in the {proxies} dictionary given in the
constructor. The method will modify requests to go through the proxy, by
calling ``request.set_proxy()``, and call the next handler in the chain to
actually execute the protocol.
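The per-protocol methods can be checked by inspecting a ProxyHandler
instance. The proxy address below is a placeholder, and the ``except`` branch
is an assumption for running on Python 3.

```python
try:
    from urllib2 import ProxyHandler          # Python 2
except ImportError:
    from urllib.request import ProxyHandler   # Python 3 fallback

# Hypothetical proxy address, for illustration only.
explicit = ProxyHandler({'http': 'http://proxy.example.com:8080/'})
disabled = ProxyHandler({})   # an empty dictionary disables autodetection

print(hasattr(explicit, 'http_open'))    # True: a proxy was given for http
print(hasattr(explicit, 'ftp_open'))     # False: no proxy was given for ftp
print(hasattr(disabled, 'http_open'))    # False
```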
HTTPPasswordMgr Objects
-----------------------
These methods are available on HTTPPasswordMgr and
HTTPPasswordMgrWithDefaultRealm objects.
HTTPPasswordMgr.add_password(realm, uri, user, passwd)~
{uri} can be either a single URI, or a sequence of URIs. {realm}, {user} and
{passwd} must be strings. This causes ``(user, passwd)`` to be used as
authentication tokens when authentication for {realm} and a super-URI of any of
the given URIs is given.
HTTPPasswordMgr.find_user_password(realm, authuri)~
Get user/password for given realm and URI, if any. This method will return
``(None, None)`` if there is no matching user/password.
For HTTPPasswordMgrWithDefaultRealm objects, the realm ``None`` will be
searched if the given {realm} has no matching user/password.
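A brief sketch of the lookup rules, using made-up realms, URIs and
credentials. It compares a plain HTTPPasswordMgr with the default-realm
variant; the ``except`` branch is an assumption for running on Python 3.

```python
try:
    from urllib2 import (HTTPPasswordMgr,
                         HTTPPasswordMgrWithDefaultRealm)          # Python 2
except ImportError:
    from urllib.request import (HTTPPasswordMgr,
                                HTTPPasswordMgrWithDefaultRealm)   # Python 3 fallback

page = 'http://www.example.com/staff/rota.html'

# Hypothetical realm, URI and credentials.
strict = HTTPPasswordMgr()
strict.add_password('Staff Area', 'http://www.example.com/staff/',
                    'klem', 'secret')
# A URI below the registered one matches; an unknown realm does not.
known = strict.find_user_password('Staff Area', page)
unknown = strict.find_user_password('Other Realm', page)

lenient = HTTPPasswordMgrWithDefaultRealm()
lenient.add_password(None, 'http://www.example.com/', 'klem', 'secret')
# The None realm acts as a fallback for any realm name.
fallback = lenient.find_user_password('Other Realm', page)

print(known)      # ('klem', 'secret')
print(unknown)    # (None, None)
print(fallback)   # ('klem', 'secret')
```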
AbstractBasicAuthHandler Objects
--------------------------------
AbstractBasicAuthHandler.http_error_auth_reqed(authreq, host, req, headers)~
Handle an authentication request by getting a user/password pair, and re-trying
the request. {authreq} should be the name of the header where the information
about the realm is included in the request, {host} specifies the URL and path to
authenticate for, {req} should be the (failed) Request object, and
{headers} should be the error headers.
{host} is either an authority (e.g. ``"python.org"``) or a URL containing an
authority component (e.g. ``"http://python.org/"``). In either case, the
authority must not contain a userinfo component (so, ``"python.org"`` and
``"python.org:80"`` are fine, ``"joe:password@python.org"`` is not).
HTTPBasicAuthHandler Objects
----------------------------
HTTPBasicAuthHandler.http_error_401(req, fp, code, msg, hdrs)~
Retry the request with authentication information, if available.
ProxyBasicAuthHandler Objects
-----------------------------
ProxyBasicAuthHandler.http_error_407(req, fp, code, msg, hdrs)~
Retry the request with authentication information, if available.
AbstractDigestAuthHandler Objects
---------------------------------
AbstractDigestAuthHandler.http_error_auth_reqed(authreq, host, req, headers)~
{authreq} should be the name of the header where the information about the realm
is included in the request, {host} should be the host to authenticate to, {req}
should be the (failed) Request object, and {headers} should be the
error headers.
HTTPDigestAuthHandler Objects
-----------------------------
HTTPDigestAuthHandler.http_error_401(req, fp, code, msg, hdrs)~
Retry the request with authentication information, if available.
ProxyDigestAuthHandler Objects
------------------------------
ProxyDigestAuthHandler.http_error_407(req, fp, code, msg, hdrs)~
Retry the request with authentication information, if available.
HTTPHandler Objects
-------------------
HTTPHandler.http_open(req)~
Send an HTTP request, which can be either GET or POST, depending on
``req.has_data()``.
HTTPSHandler Objects
--------------------
HTTPSHandler.https_open(req)~
Send an HTTPS request, which can be either GET or POST, depending on
``req.has_data()``.
FileHandler Objects
-------------------
FileHandler.file_open(req)~
Open the file locally, if there is no host name, or the host name is
``'localhost'``. Change the protocol to ``ftp`` otherwise, and retry opening it
using parent.
FTPHandler Objects
------------------
FTPHandler.ftp_open(req)~
Open the FTP file indicated by {req}. The login is always done with empty
username and password.
CacheFTPHandler Objects
-----------------------
CacheFTPHandler objects are FTPHandler objects with the
following additional methods:
CacheFTPHandler.setTimeout(t)~
Set timeout of connections to {t} seconds.
CacheFTPHandler.setMaxConns(m)~
Set maximum number of cached connections to {m}.
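A minimal sketch of configuring a CacheFTPHandler and handing it to
build_opener. The numbers are arbitrary, the ``handlers`` attribute is an
implementation detail used only to inspect the result, and the ``except``
branch is an assumption for running on Python 3. No FTP connection is opened.

```python
try:
    from urllib2 import build_opener, CacheFTPHandler, FTPHandler          # Python 2
except ImportError:
    from urllib.request import build_opener, CacheFTPHandler, FTPHandler   # Python 3 fallback

ftp_cache = CacheFTPHandler()
ftp_cache.setTimeout(30)    # seconds
ftp_cache.setMaxConns(4)    # cached connections

# As a subclass of FTPHandler, the instance replaces the default FTP handler.
opener = build_opener(ftp_cache)
ftp_handlers = [h for h in opener.handlers if isinstance(h, FTPHandler)]
print(len(ftp_handlers))   # 1
```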
UnknownHandler Objects
----------------------
UnknownHandler.unknown_open()~
Raise a URLError exception.
HTTPErrorProcessor Objects
--------------------------
.. versionadded:: 2.4
HTTPErrorProcessor.http_response()~
Process HTTP error responses.
For 200 error codes, the response object is returned immediately.
For non-200 error codes, this simply passes the job on to the
{protocol}_error_code handler methods, via
OpenerDirector.error. Eventually,
urllib2.HTTPDefaultErrorHandler will raise an HTTPError if no
other handler handles the error.
Examples
--------
This example gets the python.org main page and displays the first 100 bytes of
it:: >
>>> import urllib2
>>> f = urllib2.urlopen('http://www.python.org/')
>>> print f.read(100)
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<?xml-stylesheet href="./css/ht2html
<
Here we are sending a data-stream to the stdin of a CGI and reading the data it
returns to us. Note that this example will only work when the Python
installation supports SSL. :: >
>>> import urllib2
>>> req = urllib2.Request(url='https://localhost/cgi-bin/test.cgi',
... data='This data is passed to stdin of the CGI')
>>> f = urllib2.urlopen(req)
>>> print f.read()
Got Data: "This data is passed to stdin of the CGI"
<
The code for the sample CGI used in the above example is:: >
#!/usr/bin/env python
import sys
data = sys.stdin.read()
print 'Content-type: text/plain\n\nGot Data: "%s"' % data
<
Use of Basic HTTP Authentication:: >
import urllib2
# Create an OpenerDirector with support for Basic HTTP Authentication...
auth_handler = urllib2.HTTPBasicAuthHandler()
auth_handler.add_password(realm='PDQ Application',
uri='https://mahler:8092/site-updates.py',
user='klem',
passwd='kadidd!ehopper')
opener = urllib2.build_opener(auth_handler)
# ...and install it globally so it can be used with urlopen.
urllib2.install_opener(opener)
urllib2.urlopen('http://www.example.com/login.html')
<
build_opener provides many handlers by default, including a
ProxyHandler. By default, ProxyHandler uses the environment
variables named ``<scheme>_proxy``, where ``<scheme>`` is the URL scheme
involved. For example, the http_proxy environment variable is read to
obtain the HTTP proxy's URL.
This example replaces the default ProxyHandler with one that uses
programmatically-supplied proxy URLs, and adds proxy authorization support with
ProxyBasicAuthHandler. >
proxy_handler = urllib2.ProxyHandler({'http': 'http://www.example.com:3128/'})
proxy_auth_handler = urllib2.ProxyBasicAuthHandler()
proxy_auth_handler.add_password('realm', 'host', 'username', 'password')
opener = urllib2.build_opener(proxy_handler, proxy_auth_handler)
# This time, rather than install the OpenerDirector, we use it directly:
opener.open('http://www.example.com/login.html')
<
Adding HTTP headers:
Use the {headers} argument to the Request constructor, or: >
import urllib2
req = urllib2.Request('http://www.example.com/')
req.add_header('Referer', 'http://www.python.org/')
r = urllib2.urlopen(req)
<
OpenerDirector automatically adds a User-Agent header to
every Request. To change this: >
import urllib2
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
opener.open('http://www.example.com/')
<
Also, remember that a few standard headers (Content-Length,
Content-Type and Host) are added when the
Request is passed to urlopen (or OpenerDirector.open).
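Because headers are stored on the Request object itself, they can be checked before anything is sent. The following is a minimal sketch that performs no network access. The try/except import is an assumption added so that the snippet also runs where urllib2 is not available (it became urllib.request in Python 3), and the URLs are placeholders.

```python
try:
    from urllib2 import Request           # Python 2
except ImportError:
    from urllib.request import Request    # Python 3 name (portability assumption)

# Headers can be given to the constructor or added afterwards.
req = Request('http://www.example.com/',
              headers={'Referer': 'http://www.python.org/'})
req.add_header('User-agent', 'Mozilla/5.0')

# Inspect what has been set so far; nothing has been sent yet.
referer = req.get_header('Referer')
has_agent = req.has_header('User-agent')
```

Content-Length, Content-Type and Host do not show up at this point, since they are only added when the request is opened.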
==============================================================================
*py2stdlib-urlparse*
urlparse~
:synopsis: Parse URLs into or assemble them from components.
.. note::
The urlparse (|py2stdlib-urlparse|) module is renamed to urllib.parse in Python 3.0.
The 2to3 tool will automatically adapt imports when converting
your sources to 3.0.
This module defines a standard interface to break Uniform Resource Locator (URL)
strings up in components (addressing scheme, network location, path etc.), to
combine the components back into a URL string, and to convert a "relative URL"
to an absolute URL given a "base URL."
The module has been designed to match the Internet RFC on Relative Uniform
Resource Locators (and discovered a bug in an earlier draft!). It supports the
following URL schemes: ``file``, ``ftp``, ``gopher``, ``hdl``, ``http``,
``https``, ``imap``, ``mailto``, ``mms``, ``news``, ``nntp``, ``prospero``,
``rsync``, ``rtsp``, ``rtspu``, ``sftp``, ``shttp``, ``sip``, ``sips``,
``snews``, ``svn``, ``svn+ssh``, ``telnet``, ``wais``.
.. versionadded:: 2.5
Support for the ``sftp`` and ``sips`` schemes.
The urlparse (|py2stdlib-urlparse|) module defines the following functions:
urlparse(urlstring[, scheme[, allow_fragments]])~
Parse a URL into six components, returning a 6-tuple. This corresponds to the
general structure of a URL: ``scheme://netloc/path;parameters?query#fragment``.
Each tuple item is a string, possibly empty. The components are not broken up in
smaller parts (for example, the network location is a single string), and %
escapes are not expanded. The delimiters as shown above are not part of the
result, except for a leading slash in the {path} component, which is retained if
present. For example: >
>>> from urlparse import urlparse
>>> o = urlparse('http://www.cwi.nl:80/%7Eguido/Python.html')
>>> o
ParseResult(scheme='http', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
            params='', query='', fragment='')
>>> o.scheme
'http'
>>> o.port
80
>>> o.geturl()
'http://www.cwi.nl:80/%7Eguido/Python.html'
<
If the {scheme} argument is specified, it gives the default addressing
scheme, to be used only if the URL does not specify one. The default value for
this argument is the empty string.
If the {allow_fragments} argument is false, fragment identifiers are not
allowed, even if the URL's addressing scheme normally does support them. The
default value for this argument is True.
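The effect of the two optional arguments can be sketched as follows. The URLs are illustrative, and the import fallback is an assumption added so the snippet also runs under the Python 3 name of the module.

```python
try:
    from urlparse import urlparse         # Python 2
except ImportError:
    from urllib.parse import urlparse     # Python 3 name (portability assumption)

# The default scheme is used only when the URL does not specify one.
no_scheme = urlparse('//www.cwi.nl/%7Eguido/Python.html', 'http')
own_scheme = urlparse('ftp://www.cwi.nl/%7Eguido/Python.html', 'http')

# With allow_fragments false, the '#...' part is not split off and
# stays attached to the preceding component (here, the path).
kept = urlparse('http://www.cwi.nl/doc.html#intro', allow_fragments=False)
split = urlparse('http://www.cwi.nl/doc.html#intro')
```

Here `no_scheme.scheme` is `'http'` while `own_scheme.scheme` stays `'ftp'`; `kept.path` is `'/doc.html#intro'` with an empty fragment, whereas `split.fragment` is `'intro'`.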
The return value is actually an instance of a subclass of tuple. This
class has the following additional read-only convenience attributes:
+------------------+-------+--------------------------+----------------------+
| Attribute        | Index | Value                    | Value if not present |
+==================+=======+==========================+======================+
| scheme           | 0     | URL scheme specifier     | empty string         |
+------------------+-------+--------------------------+----------------------+
| netloc           | 1     | Network location part    | empty string         |
+------------------+-------+--------------------------+----------------------+
| path             | 2     | Hierarchical path        | empty string         |
+------------------+-------+--------------------------+----------------------+
| params           | 3     | Parameters for last path | empty string         |
|                  |       | element                  |                      |
+------------------+-------+--------------------------+----------------------+
| query            | 4     | Query component          | empty string         |
+------------------+-------+--------------------------+----------------------+
| fragment         | 5     | Fragment identifier      | empty string         |
+------------------+-------+--------------------------+----------------------+
| username         |       | User name                | None                 |
+------------------+-------+--------------------------+----------------------+
| password         |       | Password                 | None                 |
+------------------+-------+--------------------------+----------------------+
| hostname         |       | Host name (lower case)   | None                 |
+------------------+-------+--------------------------+----------------------+
| port             |       | Port number as integer,  | None                 |
|                  |       | if present               |                      |
+------------------+-------+--------------------------+----------------------+
See the section "Results of urlparse and urlsplit" for more information on the
result object.
.. versionchanged:: 2.5
Added attributes to return value.
.. versionchanged:: 2.7
Added IPv6 URL parsing capabilities.
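As a short illustration of the convenience attributes, the sketch below parses a URL whose network location carries credentials and a port. The URL and credentials are made up, and the import fallback is an assumption for running under the Python 3 module name.

```python
try:
    from urlparse import urlparse         # Python 2
except ImportError:
    from urllib.parse import urlparse     # Python 3 name (portability assumption)

o = urlparse('http://guido:secret@WWW.Example.COM:8080/index.html')

# The indexed items are plain strings; netloc is kept as one piece.
netloc = o[1]            # same value as o.netloc

# The extra attributes pick the network location apart.
user = o.username        # 'guido'
password = o.password    # 'secret'
host = o.hostname        # 'www.example.com' (lower-cased)
port = o.port            # 8080, as an integer

# Parts that are absent are reported as None.
bare = urlparse('http://www.example.com/')
```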
parse_qs(qs[, keep_blank_values[, strict_parsing]])~
Parse a query string given as a string argument (data of type
application/x-www-form-urlencoded). Data are returned as a
dictionary. The dictionary keys are the unique query variable names and the
values are lists of values for each name.
The optional argument {keep_blank_values} is a flag indicating whether blank
values in URL encoded queries should be treated as blank strings. A true value
indicates that blanks should be retained as blank strings. The default false
value indicates that blank values are to be ignored and treated as if they were
not included.
The optional argument {strict_parsing} is a flag indicating what to do with
parsing errors. If false (the default), errors are silently ignored. If true,
errors raise a ValueError exception.
Use the urllib.urlencode function to convert such dictionaries into
query strings.
.. versionadded:: 2.6
Copied from the cgi (|py2stdlib-cgi|) module.
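A minimal sketch of the two flags, using a made-up query string; the import fallback is an assumption for running under the Python 3 module name.

```python
try:
    from urlparse import parse_qs         # Python 2
except ImportError:
    from urllib.parse import parse_qs     # Python 3 name (portability assumption)

query = 'name=spam&name=eggs&empty=&size=2'

# By default the blank value for 'empty' is dropped.
default = parse_qs(query)

# keep_blank_values retains it as an empty string.
with_blanks = parse_qs(query, keep_blank_values=True)

# strict_parsing turns a malformed field (no '=') into a ValueError.
try:
    parse_qs('name', strict_parsing=True)
    strict_error = False
except ValueError:
    strict_error = True
```

Every value is a list, even for names that occur only once, because a name may be repeated in the query.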
parse_qsl(qs[, keep_blank_values[, strict_parsing]])~
Parse a query string given as a string argument (data of type
application/x-www-form-urlencoded). Data are returned as a list of
name, value pairs.
The optional argument {keep_blank_values} is a flag indicating whether blank
values in URL encoded queries should be treated as blank strings. A true value
indicates that blanks should be retained as blank strings. The default false
value indicates that blank values are to be ignored and treated as if they were
not included.
The optional argument {strict_parsing} is a flag indicating what to do with
parsing errors. If false (the default), errors are silently ignored. If true,
errors raise a ValueError exception.
Use the urllib.urlencode function to convert such lists of pairs into
query strings.
.. versionadded:: 2.6
Copied from the cgi (|py2stdlib-cgi|) module.
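Unlike parse_qs, the list form preserves the order of the fields and repeated names, so it round-trips through urlencode. A small sketch with a made-up query string; the import fallback is an assumption for running under the Python 3 module names.

```python
try:
    from urlparse import parse_qsl        # Python 2
    from urllib import urlencode
except ImportError:
    from urllib.parse import parse_qsl, urlencode   # Python 3 names

# Each field becomes one (name, value) pair, in order of appearance.
pairs = parse_qsl('name=spam&name=eggs&size=2')

# urlencode accepts such a list of pairs and rebuilds the query string.
rebuilt = urlencode(pairs)
```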
urlunparse(parts)~
Construct a URL from a tuple as returned by ``urlparse()``. The {parts} argument
can be any six-item iterable. This may result in a slightly different, but
equivalent URL, if the URL that was parsed originally had unnecessary delimiters
(for example, a ? with an empty query; the RFC states that these are
equivalent).
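The round trip described above can be sketched like this. The URLs are placeholders, and the import fallback is an assumption for running under the Python 3 module name.

```python
try:
    from urlparse import urlparse, urlunparse        # Python 2
except ImportError:
    from urllib.parse import urlparse, urlunparse    # Python 3 names

# A trailing '?' with an empty query is an unnecessary delimiter.
original = 'http://www.example.com/index.html?'
rebuilt = urlunparse(urlparse(original))

# Any six-item iterable works, not only a parse result.
from_list = urlunparse(['http', 'www.example.com', '/index.html',
                        '', 'q=1', 'top'])
```

`rebuilt` is `'http://www.example.com/index.html'`, without the trailing `?`, and `from_list` is `'http://www.example.com/index.html?q=1#top'`.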
urlsplit(urlstring[, scheme[, allow_fragments]])~
This is similar to urlparse (|py2stdlib-urlparse|), but does not split the params from the URL.
This should generally be used instead of urlparse (|py2stdlib-urlparse|) if the more recent URL
syntax allowing parameters to be applied to each segment of the {path} portion
of the URL (see RFC 2396) is wanted. A separate function is needed to
separate the path segments and parameters. This function returns a 5-tuple:
(addressing scheme, network location, path, query, fragment identifier).
The return value is actually an instance of a subclass of tuple. This
class has the following additional read-only convenience attributes:
+------------------+-------+-------------------------+----------------------+
| Attribute        | Index | Value                   | Value if not present |
+==================+=======+=========================+======================+
| scheme           | 0     | URL scheme specifier    | empty string         |
+------------------+-------+-------------------------+----------------------+
| netloc           | 1     | Network location part   | empty string         |
+------------------+-------+-------------------------+----------------------+
| path             | 2     | Hierarchical path       | empty string         |
+------------------+-------+-------------------------+----------------------+
| query            | 3     | Query component         | empty string         |
+------------------+-------+-------------------------+----------------------+
| fragment         | 4     | Fragment identifier     | empty string         |
+------------------+-------+-------------------------+----------------------+
| username         |       | User name               | None                 |
+------------------+-------+-------------------------+----------------------+
| password         |       | Password                | None                 |
+------------------+-------+-------------------------+----------------------+
| hostname         |       | Host name (lower case)  | None                 |
+------------------+-------+-------------------------+----------------------+
| port             |       | Port number as integer, | None                 |
|                  |       | if present              |                      |
+------------------+-------+-------------------------+----------------------+
See the section "Results of urlparse and urlsplit" for more information on the
result object.
.. versionadded:: 2.2
.. versionchanged:: 2.5
Added attributes to return value.
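The difference from urlparse shows up when more than one path segment carries parameters. In the sketch below the URL is made up, `split_segments` is a hypothetical helper (not part of the module) standing in for the "separate function" mentioned above, and the import fallback is an assumption for running under the Python 3 module name.

```python
try:
    from urlparse import urlparse, urlsplit        # Python 2
except ImportError:
    from urllib.parse import urlparse, urlsplit    # Python 3 names

url = 'http://www.example.com/a;x=1/b;y=2?q=1'

# urlparse only splits the parameters off the last path segment.
parsed = urlparse(url)

# urlsplit leaves the path untouched and returns five items.
split = urlsplit(url)


def split_segments(path):
    """Hypothetical helper: (segment, params) pairs for each path segment."""
    pairs = []
    for segment in path.split('/')[1:]:
        name, _, params = segment.partition(';')
        pairs.append((name, params))
    return pairs


segments = split_segments(split.path)
```

`parsed.path` is `'/a;x=1/b'` with `parsed.params` equal to `'y=2'`, while `split.path` keeps the whole `'/a;x=1/b;y=2'` and `segments` is `[('a', 'x=1'), ('b', 'y=2')]`.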
urlunsplit(parts)~
Combine the elements of a tuple as returned by urlsplit into a complete
URL as a string. The {parts} argument can be any five-item iterable. This may
result in a slightly different, but equivalent URL, if the URL that was parsed
originally had unnecessary delimiters (for example, a ? with an empty query; the
RFC states that these are equivalent).
.. versionadded:: 2.2
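One common use is to change a single component and reassemble the URL. A minimal sketch with placeholder URLs; the import fallback is an assumption for running under the Python 3 module name.

```python
try:
    from urlparse import urlsplit, urlunsplit        # Python 2
except ImportError:
    from urllib.parse import urlsplit, urlunsplit    # Python 3 names

parts = urlsplit('http://www.example.com/old/path?q=1#top')

# Build a new five-item tuple, replacing only the path.
moved = urlunsplit((parts.scheme, parts.netloc, '/new/path',
                    parts.query, parts.fragment))
```

`moved` is `'http://www.example.com/new/path?q=1#top'`.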
urljoin(base, url[, allow_fragments])~
Construct a full ("absolute") URL by combining a "base URL" ({base}) with
another URL ({url}). Informally, this uses components of the base URL, in
particular the addressing scheme, the network location and (part of) the path,
to provide missing components in the relative URL. For example: >
>>> from urlparse import urljoin
>>> urljoin('http://www.cwi.nl/%7Eguido/Python.html', 'FAQ.html')
'http://www.cwi.nl/%7Eguido/FAQ.html'
<
The {allow_fragments} argument has the same meaning and default as for
urlparse (|py2stdlib-urlparse|).
.. note::
If {url} is an absolute URL (that is, starting with ``//`` or ``scheme://``),
the {url}'s host name and/or scheme will be present in the result. For example: >
>>> urljoin('http://www.cwi.nl/%7Eguido/Python.html',
...         '//www.python.org/%7Eguido')
'http://www.python.org/%7Eguido'
<
If you do not want that behavior, preprocess the {url} with urlsplit and
urlunsplit, removing possible {scheme} and {netloc} parts.
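The preprocessing step could look roughly like the following. `join_on_base` is a hypothetical helper name, not something the module provides, and the import fallback is an assumption for running under the Python 3 module name.

```python
try:
    from urlparse import urljoin, urlsplit, urlunsplit        # Python 2
except ImportError:
    from urllib.parse import urljoin, urlsplit, urlunsplit    # Python 3 names


def join_on_base(base, url):
    """Hypothetical helper: join url to base, ignoring url's scheme and host."""
    parts = urlsplit(url)
    # Drop the scheme and netloc so that only path, query and fragment remain.
    relative = urlunsplit(('', '', parts.path, parts.query, parts.fragment))
    return urljoin(base, relative)


base = 'http://www.cwi.nl/%7Eguido/Python.html'
plain = urljoin(base, '//www.python.org/%7Eguido')
pinned = join_on_base(base, '//www.python.org/%7Eguido')
```

`plain` ends up on `www.python.org`, whereas `pinned` is `'http://www.cwi.nl/%7Eguido'`.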
urldefrag(url)~
If {url} contains a fragment identifier, returns a modified version of {url}
with no fragment identifier, and the fragment identifier as a separate string.
If there is no fragment identifier in {url}, returns {url} unmodified and an
empty string.
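For instance, with placeholder URLs (the import fallback is an assumption for running under the Python 3 module name):

```python
try:
    from urlparse import urldefrag        # Python 2
except ImportError:
    from urllib.parse import urldefrag    # Python 3 name

# The result unpacks into the URL without fragment and the fragment itself.
url, fragment = urldefrag('http://www.example.com/doc.html#section-2')

# Without a fragment, the URL comes back unchanged with an empty string.
same, empty = urldefrag('http://www.example.com/doc.html')
```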
.. seealso::
RFC 3986 - Uniform Resource Identifiers
This is the current standard (STD66). Any changes to the urlparse module
should conform to this. Certain deviations could be observed, which are
mostly for backward compatibility and for certain de facto
parsing requirements as commonly observed in major browsers.
RFC 2732 - Format for Literal IPv6 Addresses in URL's
This specifies the parsing requirements of IPv6 URLs.
RFC 2396 - Uniform Resource Identifiers (URI): Generic Syntax
Document describing the generic syntactic requirements for both Uniform Resource
Names (URNs) and Uniform Resource Locators (URLs).
RFC 2368 - The mailto URL scheme
Parsing requirements for mailto URL schemes.
RFC 1808 - Relative Uniform Resource Locators
This Request For Comments includes the rules for joining an absolute and a
relative URL, including a fair number of "Abnormal Examples" which govern the
treatment of border cases.
RFC 1738 - Uniform Resource Locators (URL)
This specifies the formal syntax and semantics of absolute URLs.
Results of urlparse (|py2stdlib-urlparse|) and urlsplit
------------------------------------------------
The result objects from the urlparse (|py2stdlib-urlparse|) and urlsplit functions are
subclasses of the tuple type. These subclasses add the attributes
described in those functions, as well as provide an additional method:
ParseResult.geturl()~
Return the re-combined version of the original URL as a string. This may differ
from the original URL in that the scheme will always be normalized to lower case
and empty components may be dropped. Specifically, empty parameters, queries,
and fragment identifiers will be removed.
The result of this method is a fixpoint if passed back through the original
parsing function: >
>>> import urlparse
>>> url = 'HTTP://www.Python.org/doc/#'
>>> r1 = urlparse.urlsplit(url)
>>> r1.geturl()
'http://www.Python.org/doc/'
>>> r2 = urlparse.urlsplit(r1.geturl())
>>> r2.geturl()
'http://www.Python.org/doc/'
<
.. versionadded:: 2.5
The following classes provide the implementations of the parse results:
BaseResult~
Base class for the concrete result classes. This provides most of the attribute
definitions. It does not provide a geturl method. It is derived from
tuple, but does not override the __init__ or __new__
methods.
ParseResult(scheme, netloc, path, params, query, fragment)~
Concrete class for urlparse (|py2stdlib-urlparse|) results. The __new__ method is
overridden to support checking that the right number of arguments are passed.
SplitResult(scheme, netloc, path, query, fragment)~
Concrete class for urlsplit results. The __new__ method is
overridden to support checking that the right number of arguments are passed.
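The concrete classes can also be instantiated directly, which is a convenient way to assemble a URL from known components. A minimal sketch with placeholder values; the import fallback is an assumption for running under the Python 3 module name.

```python
try:
    from urlparse import ParseResult, SplitResult        # Python 2
except ImportError:
    from urllib.parse import ParseResult, SplitResult    # Python 3 names

# Six components, in the same order as the urlparse result.
result = ParseResult('http', 'www.example.com', '/index.html', '', 'q=1', '')
url = result.geturl()

# Passing the wrong number of components is rejected.
try:
    SplitResult('http', 'www.example.com')
    rejected = False
except TypeError:
    rejected = True
```

`url` is `'http://www.example.com/index.html?q=1'`; the empty params and fragment are dropped by geturl.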
==============================================================================
*py2stdlib-user*
user~
:synopsis: A standard way to reference user-specific modules.
:deprecated:
.. deprecated:: 2.6
The user (|py2stdlib-user|) module has been removed in Python 3.0.
As a policy, Python doesn't run user-specified code on startup of Python
programs. (Only interactive sessions execute the script specified in the
PYTHONSTARTUP environment variable if it exists.)
However, some programs or sites may find it convenient to allow users to have a
standard customization file, which gets run when a program requests it. This
module implements such a mechanism. A program that wishes to use the mechanism
must execute the statement: >
import user
<
The user (|py2stdlib-user|) module looks for a file .pythonrc.py in the user's home
directory and if it can be opened, executes it (using execfile) in its
own (the module user (|py2stdlib-user|)'s) global namespace. Errors during this phase are
not caught; that's up to the program that imports the user (|py2stdlib-user|) module, if it
wishes. The home directory is assumed to be named by the HOME
environment variable; if this is not set, the current directory is used.
The user's .pythonrc.py could conceivably test for ``sys.version`` if it
wishes to do different things depending on the Python version.
A warning to users: be very conservative in what you place in your
.pythonrc.py file. Since you don't know which programs will use it,
changing the behavior of standard modules or functions is generally not a good
idea.
A suggestion for programmers who wish to use this mechanism: a simple way to let
users specify options for your package is to have them define variables in their
.pythonrc.py file that you test in your module. For example, a module
spam that has a verbosity level can look for a variable
``user.spam_verbose``, as follows: >
import user
verbose = bool(getattr(user, "spam_verbose", 0))
<
(The three-argument form of getattr is used in case the user has not
defined ``spam_verbose`` in their .pythonrc.py file.)
Programs with extensive customization needs are better off reading a
program-specific customization file.
Programs with security or privacy concerns should {not} import this module; a
user can easily break into a program by placing arbitrary code in the
.pythonrc.py file.
Modules for general use should {not} import this module; it may interfere with
the operation of the importing program.
.. seealso::
Module site (|py2stdlib-site|)
Site-wide customization mechanism.
==============================================================================
*py2stdlib-userdict*
UserDict~
:synopsis: Class wrapper for dictionary objects.
The module defines a mixin, DictMixin, defining all dictionary methods
for classes that already have a minimum mapping interface. This greatly
simplifies writing classes that need to be substitutable for dictionaries (such
as the shelve module).
This module also defines a class, UserDict (|py2stdlib-userdict|), that acts as a wrapper
around dictionary objects. The need for this class has been largely supplanted
by the ability to subclass directly from dict (a feature that became
available starting with Python version 2.2). Prior to the introduction of
dict, the UserDict (|py2stdlib-userdict|) class was used to create dictionary-like
sub-classes that obtained new behaviors by overriding existing methods or adding
new ones.
The UserDict (|py2stdlib-userdict|) module defines the UserDict (|py2stdlib-userdict|) class and
DictMixin:
UserDict([initialdata])~
Class that simulates a dictionary. The instance's contents are kept in a
regular dictionary, which is accessible via the data attribute of
UserDict (|py2stdlib-userdict|) instances. If {initialdata} is provided, data is
initialized with its contents; note that a reference to {initialdata} will not
be kept, allowing it to be used for other purposes.
.. note::
For backward compatibility, instances of UserDict (|py2stdlib-userdict|) are not iterable.
IterableUserDict([initialdata])~
Subclass of UserDict (|py2stdlib-userdict|) that supports direct iteration (e.g. ``for key in
myDict``).
In addition to supporting the methods and operations of mappings (see section
typesmapping), UserDict (|py2stdlib-userdict|) and IterableUserDict instances
provide the following attribute:
IterableUserDict.data~
A real dictionary used to store the contents of the UserDict (|py2stdlib-userdict|) class.
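A small sketch of both points, the copied initial data and the data attribute. `RecordingDict` is a hypothetical subclass written for this illustration, and the import fallback is an assumption so the snippet also runs where the class lives in the collections module (Python 3).

```python
try:
    from UserDict import UserDict         # Python 2
except ImportError:
    from collections import UserDict      # Python 3 location (assumption)

# The initial data is copied; the original dictionary is not referenced.
source = {'spam': 1}
d = UserDict(source)
d['eggs'] = 2


class RecordingDict(UserDict):
    """Hypothetical subclass that remembers the order in which keys were set."""

    def __init__(self):
        UserDict.__init__(self)
        self.history = []

    def __setitem__(self, key, value):
        self.history.append(key)
        UserDict.__setitem__(self, key, value)


r = RecordingDict()
r['b'] = 1
r['a'] = 2
```

After this, `source` still holds only `'spam'`, `d.data` is an ordinary dictionary with both keys, and `r.history` is `['b', 'a']`.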
DictMixin()~
Mixin defining all dictionary methods for classes that already have a minimum
dictionary interface including __getitem__, __setitem__,
__delitem__, and keys.
This mixin should be used as a superclass. Adding each of the above methods
adds progressively more functionality. For instance, defining all but
__delitem__ will preclude only pop and popitem from the
full interface.
In addition to the four base methods, progressively more efficiency comes with
defining __contains__, __iter__, and iteritems.
Since the mixin has no knowledge of the subclass constructor, it does not define
__init__ or copy (|py2stdlib-copy|).
Starting with Python version 2.6, it is recommended to use
collections.MutableMapping instead of DictMixin.
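The sketch below uses collections.MutableMapping, the recommended replacement, rather than DictMixin itself. Note that it asks for a slightly different set of base methods (__iter__ and __len__ instead of keys). `LowerKeyDict` is a hypothetical class, and the import fallback is an assumption for Python versions where the ABC lives in collections.abc.

```python
try:
    from collections.abc import MutableMapping    # Python 3.3+
except ImportError:
    from collections import MutableMapping        # Python 2.6 / 2.7


class LowerKeyDict(MutableMapping):
    """Hypothetical mapping that lower-cases every key."""

    def __init__(self):
        self._data = {}

    # The five methods below are the required minimum interface.
    def __getitem__(self, key):
        return self._data[key.lower()]

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __delitem__(self, key):
        del self._data[key.lower()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


headers = LowerKeyDict()
headers['Content-Type'] = 'text/plain'
headers.update({'Accept': '*/*'})      # update() comes from the mixin
content_type = headers.get('CONTENT-TYPE')
accept = headers.pop('Accept')         # so do get(), pop() and 'in'
```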
==============================================================================
*py2stdlib-userlist*
UserList~
:synopsis: Class wrapper for list objects.
.. note::
This module is available for backward compatibility only. If you are writing
code that does not need to work with versions of Python earlier than Python 2.2,
please consider subclassing directly from the built-in list type.
This module defines a class that acts as a wrapper around list objects. It is a
useful base class for your own list-like classes, which can inherit from it
and override existing methods or add new ones. In this way one can add new
behaviors to lists.
The UserList (|py2stdlib-userlist|) module defines the UserList (|py2stdlib-userlist|) class:
UserList([list])~
Class that simulates a list. The instance's contents are kept in a regular
list, which is accessible via the data attribute of UserList (|py2stdlib-userlist|)
instances. The instance's contents are initially set to a copy of {list},
defaulting to the empty list ``[]``. {list} can be any iterable, e.g. a
real Python list or a UserList (|py2stdlib-userlist|) object.
.. note::
The UserList (|py2stdlib-userlist|) class has been moved to the collections (|py2stdlib-collections|)
module in Python 3.0. The 2to3 tool will automatically adapt
imports when converting your sources to 3.0.
In addition to supporting the methods and operations of mutable sequences (see
section typesseq), UserList (|py2stdlib-userlist|) instances provide the following
attribute:
UserList.data~
A real Python list object used to store the contents of the UserList (|py2stdlib-userlist|)
class.
{Subclassing requirements:} Subclasses of UserList (|py2stdlib-userlist|) are expected to
offer a constructor which can be called with either no arguments or one
argument. List operations which return a new sequence attempt to create an
instance of the actual implementation class. To do so, it assumes that the
constructor can be called with a single parameter, which is a sequence object
used as a data source.
If a derived class does not wish to comply with this requirement, all of the
special methods supported by this class will need to be overridden; please
consult the sources for information about the methods which need to be provided
in that case.
.. versionchanged:: 2.0
Python versions 1.5.2 and 1.6 also required that the constructor be callable
with no parameters, and offer a mutable data attribute. Earlier
versions of Python did not attempt to create instances of the derived class.
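The constructor requirement matters because operations such as concatenation build their result through the subclass. In this sketch `Stack` is a hypothetical subclass that keeps the inherited constructor, and the import fallback is an assumption so the snippet also runs where the class lives in the collections module (Python 3).

```python
try:
    from UserList import UserList         # Python 2
except ImportError:
    from collections import UserList      # Python 3 location (assumption)


class Stack(UserList):
    """Hypothetical subclass; it keeps the zero- or one-argument constructor."""

    def push(self, item):
        self.data.append(item)

    def top(self):
        return self.data[-1]


s = Stack([1, 2])
s.push(3)

# Concatenation returns a new Stack, created by calling Stack(sequence).
combined = s + [4]
empty = Stack()
```

`combined` is again a `Stack`, so `combined.top()` works and returns `4`.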
==============================================================================
*py2stdlib-userstring*
UserString~
:synopsis: Class wrapper for string objects.
.. note::
The UserString (|py2stdlib-userstring|) class from this module is available for backward
compatibility only. If you are writing code that does not need to work with
versions of Python earlier than Python 2.2, please consider subclassing directly
from the built-in str type instead of using UserString (|py2stdlib-userstring|) (there
is no built-in equivalent to MutableString).
This module defines a class that acts as a wrapper around string objects. It is
a useful base class for your own string-like classes, which can inherit from
it and override existing methods or add new ones. In this way one can add new
behaviors to strings.
It should be noted that these classes are highly inefficient compared to real
string or Unicode objects; this is especially the case for
MutableString.
The UserString (|py2stdlib-userstring|) module defines the following classes:
UserString([sequence])~
Class that simulates a string or a Unicode string object. The instance's
content is kept in a regular string or Unicode string object, which is
accessible via the data attribute of UserString (|py2stdlib-userstring|) instances. The
instance's contents are initially set to a copy of {sequence}. {sequence} can
be either a regular Python string or Unicode string, an instance of
UserString (|py2stdlib-userstring|) (or a subclass) or an arbitrary sequence which can be
converted into a string using the built-in str function.
.. note::
The UserString (|py2stdlib-userstring|) class has been moved to the collections (|py2stdlib-collections|)
module in Python 3.0. The 2to3 tool will automatically adapt
imports when converting your sources to 3.0.
MutableString([sequence])~
This class is derived from the UserString (|py2stdlib-userstring|) above and redefines strings
to be {mutable}. Mutable strings can't be used as dictionary keys, because
dictionaries require {immutable} objects as keys. The main intention of this
class is to serve as an educational example for inheritance and the necessity to
remove (override) the __hash__ method in order to trap attempts to use a
mutable object as dictionary key, which would be otherwise very error prone and
hard to track down.
.. deprecated:: 2.6
The MutableString class has been removed in Python 3.0.
In addition to supporting the methods and operations of string and Unicode
objects (see section string-methods), UserString (|py2stdlib-userstring|) instances
provide the following attribute:
MutableString.data~
A real Python string or Unicode object used to store the content of the
UserString (|py2stdlib-userstring|) class.
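A brief sketch of a string-like subclass. `Shout` is a hypothetical class written for this illustration, and the import fallback is an assumption so the snippet also runs where the class lives in the collections module (Python 3).

```python
try:
    from UserString import UserString     # Python 2
except ImportError:
    from collections import UserString    # Python 3 location (assumption)


class Shout(UserString):
    """Hypothetical subclass adding one method to the string interface."""

    def shout(self):
        # String methods and '+' return instances of the subclass.
        return self.upper() + '!'


s = Shout('spam')
loud = s.shout()
```

`loud` is a `Shout` instance whose `data` attribute is the plain string `'SPAM!'`.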
==============================================================================
*py2stdlib-uu*
uu~
:synopsis: Encode and decode files in uuencode format.
This module encodes and decodes files in uuencode format, allowing arbitrary
binary data to be transferred over ASCII-only connections. Wherever a file
argument is expected, the methods accept a file-like object. For backwards
compatibility, a string containing a pathname is also accepted, and the
corresponding file will be opened for reading and writing; the pathname ``'-'``
is understood to mean the standard input or output. However, this interface is
deprecated; it's better for the caller to open the file itself, and be sure
that, when required, the mode is ``'rb'`` or ``'wb'`` on Windows.
This code was contributed by Lance Ellinghouse, and modified by Jack Jansen.
The uu (|py2stdlib-uu|) module defines the following functions:
encode(in_file, out_file[, name[, mode]])~
Uuencode file {in_file} into file {out_file}. The uuencoded file will have the
header specifying {name} and {mode} as the defaults for the results of decoding
the file. The default defaults are taken from {in_file}, or ``'-'`` and ``0666``
respectively.
decode(in_file[, out_file[, mode[, quiet]]])~
This call decodes uuencoded file {in_file} placing the result on file
{out_file}. If {out_file} is a pathname, {mode} is used to set the permission
bits if the file must be created. Defaults for {out_file} and {mode} are taken
from the uuencode header. However, if the file specified in the header already
exists, a uu.Error is raised.
decode may print a warning to standard error if the input was produced
by an incorrect uuencoder and Python could recover from that error. Setting
{quiet} to a true value silences this warning.
Error()~
Subclass of Exception, this can be raised by uu.decode under
various situations, such as described above, but also including a badly
formatted header, or truncated input file.
.. seealso::
Module binascii (|py2stdlib-binascii|)
Support module containing ASCII-to-binary and binary-to-ASCII conversions.
==============================================================================
*py2stdlib-uuid*
uuid~
:synopsis: UUID objects (universally unique identifiers) according to RFC 4122
.. versionadded:: 2.5
This module provides immutable UUID objects (the UUID class)
and the functions uuid1, uuid3, uuid4, uuid5 for
generating version 1, 3, 4, and 5 UUIDs as specified in RFC 4122.
If all you want is a unique ID, you should probably call uuid1 or
uuid4. Note that uuid1 may compromise privacy since it creates
a UUID containing the computer's network address. uuid4 creates a
random UUID.
UUID([hex[, bytes[, bytes_le[, fields[, int[, version]]]]]])~
Create a UUID from either a string of 32 hexadecimal digits, a string of 16
bytes as the {bytes} argument, a string of 16 bytes in little-endian order as
the {bytes_le} argument, a tuple of six integers (32-bit {time_low}, 16-bit
{time_mid}, 16-bit {time_hi_version}, 8-bit {clock_seq_hi_variant}, 8-bit
{clock_seq_low}, 48-bit {node}) as the {fields} argument, or a single 128-bit
integer as the {int} argument. When a string of hex digits is given, curly
braces, hyphens, and a URN prefix are all optional. For example, these
expressions all yield the same UUID: >
UUID('{12345678-1234-5678-1234-567812345678}')
UUID('12345678123456781234567812345678')
UUID('urn:uuid:12345678-1234-5678-1234-567812345678')
UUID(bytes='\x12\x34\x56\x78'*4)
UUID(bytes_le='\x78\x56\x34\x12\x34\x12\x78\x56' +
'\x12\x34\x56\x78\x12\x34\x56\x78')
UUID(fields=(0x12345678, 0x1234, 0x5678, 0x12, 0x34, 0x567812345678))
UUID(int=0x12345678123456781234567812345678)
<
Exactly one of {hex}, {bytes}, {bytes_le}, {fields}, or {int} must be given.
The {version} argument is optional; if given, the resulting UUID will have its
variant and version number set according to RFC 4122, overriding bits in the
given {hex}, {bytes}, {bytes_le}, {fields}, or {int}.
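The effect of the {version} argument is easiest to see on an all-zero value. This is only an illustrative sketch; the input value is arbitrary.

```python
import uuid

# Without version, the bits are taken exactly as given.
plain = uuid.UUID(int=0)

# With version, the variant and version bits are overwritten.
stamped = uuid.UUID(int=0, version=4)

# Giving none of hex, bytes, bytes_le, fields or int is an error.
try:
    uuid.UUID()
    rejected = False
except TypeError:
    rejected = True
```

`str(stamped)` is `'00000000-0000-4000-8000-000000000000'`: only the version digit and the variant bits differ from `plain`.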
UUID instances have these read-only attributes:
UUID.bytes~
The UUID as a 16-byte string (containing the six integer fields in big-endian
byte order).
UUID.bytes_le~
The UUID as a 16-byte string (with {time_low}, {time_mid}, and {time_hi_version}
in little-endian byte order).
UUID.fields~
A tuple of the six integer fields of the UUID, which are also available as six
individual attributes and two derived attributes:
+------------------------------+-------------------------------+
| Field                        | Meaning                       |
+==============================+===============================+
| time_low                     | the first 32 bits of the UUID |
+------------------------------+-------------------------------+
| time_mid                     | the next 16 bits of the UUID  |
+------------------------------+-------------------------------+
| time_hi_version              | the next 16 bits of the UUID  |
+------------------------------+-------------------------------+
| clock_seq_hi_variant         | the next 8 bits of the UUID   |
+------------------------------+-------------------------------+
| clock_seq_low                | the next 8 bits of the UUID   |
+------------------------------+-------------------------------+
| node                         | the last 48 bits of the UUID  |
+------------------------------+-------------------------------+
| time                         | the 60-bit timestamp          |
+------------------------------+-------------------------------+
| clock_seq                    | the 14-bit sequence number    |
+------------------------------+-------------------------------+
UUID.hex~
The UUID as a 32-character hexadecimal string.
UUID.int~
The UUID as a 128-bit integer.
UUID.urn~
The UUID as a URN as specified in RFC 4122.
UUID.variant~
The UUID variant, which determines the internal layout of the UUID. This will be
one of the constants RESERVED_NCS, RFC_4122,
RESERVED_MICROSOFT, or RESERVED_FUTURE.
UUID.version~
The UUID version number (1 through 5, meaningful only when the variant is
RFC_4122).
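The attributes can be read off the sample UUID used in the constructor examples above. A short sketch:

```python
import uuid

u = uuid.UUID('12345678-1234-5678-1234-567812345678')

fields = u.fields          # the six integer fields as a tuple
hex_form = u.hex           # 32 hexadecimal digits, no hyphens
urn_form = u.urn           # 'urn:uuid:' followed by the hyphenated form
as_int = u.int             # the same value as one 128-bit integer
clock_seq = u.clock_seq    # derived from the two clock_seq_* fields

# The variant bits of this sample value do not mark it as RFC 4122,
# so the version number is not meaningful and is reported as None.
variant = u.variant
version = u.version
```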
The uuid (|py2stdlib-uuid|) module defines the following functions:
getnode()~
Get the hardware address as a 48-bit positive integer. The first time this
runs, it may launch a separate program, which could be quite slow. If all
attempts to obtain the hardware address fail, we choose a random 48-bit number
with its eighth bit set to 1 as recommended in RFC 4122. "Hardware address"
means the MAC address of a network interface, and on a machine with multiple
network interfaces the MAC address of any one of them may be returned.
uuid1([node[, clock_seq]])~
Generate a UUID from a host ID, sequence number, and the current time. If {node}
is not given, getnode is used to obtain the hardware address. If
{clock_seq} is given, it is used as the sequence number; otherwise a random
14-bit sequence number is chosen.
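Supplying both arguments avoids embedding the real hardware address, which addresses the privacy concern mentioned earlier. A minimal sketch; the node and clock sequence values are arbitrary placeholders.

```python
import uuid

# The node is a 48-bit integer, the clock sequence a 14-bit integer.
u = uuid.uuid1(node=0x001122334455, clock_seq=0x1234)

# For comparison, this is the address uuid1() would use by default.
hardware = uuid.getnode()
```

The given values can be read back from `u.node` and `u.clock_seq`; only the timestamp part changes between calls.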
uuid3(namespace, name)~
Generate a UUID based on the MD5 hash of a namespace identifier (which is a
UUID) and a name (which is a string).
uuid4()~
Generate a random UUID.
uuid5(namespace, name)~
Generate a UUID based on the SHA-1 hash of a namespace identifier (which is a
UUID) and a name (which is a string).
The uuid (|py2stdlib-uuid|) module defines the following namespace identifiers for use with
uuid3 or uuid5.
NAMESPACE_DNS~
When this namespace is specified, the {name} string is a fully-qualified domain
name.
NAMESPACE_URL~
When this namespace is specified, the {name} string is a URL.
NAMESPACE_OID~
When this namespace is specified, the {name} string is an ISO OID.
NAMESPACE_X500~
When this namespace is specified, the {name} string is an X.500 DN in DER or a
text output format.
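Because uuid3 and uuid5 hash the namespace together with the name, the same
inputs always give the same UUID, while a different namespace gives a different
one. A short sketch of that property:

```python
import uuid

a = uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org')
b = uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org')   # same inputs as a
c = uuid.uuid5(uuid.NAMESPACE_URL, 'python.org')   # different namespace

same_inputs_match = (a == b)
namespaces_differ = (a != c)
md5_version = uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org').version
sha1_version = a.version
```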
The uuid (|py2stdlib-uuid|) module defines the following constants for the possible values
of the variant attribute:
RESERVED_NCS~
Reserved for NCS compatibility.
RFC_4122~
Specifies the UUID layout given in RFC 4122.
RESERVED_MICROSOFT~
Reserved for Microsoft compatibility.
RESERVED_FUTURE~
Reserved for future definition.
.. seealso::
RFC 4122 - A Universally Unique IDentifier (UUID) URN Namespace
This specification defines a Uniform Resource Name namespace for UUIDs, the
internal format of UUIDs, and methods of generating UUIDs.
Example
-------
Here are some examples of typical usage of the uuid (|py2stdlib-uuid|) module:: >
>>> import uuid
# make a UUID based on the host ID and current time
>>> uuid.uuid1()
UUID('a8098c1a-f86e-11da-bd1a-00112444be1e')
# make a UUID using an MD5 hash of a namespace UUID and a name
>>> uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org')
UUID('6fa459ea-ee8a-3ca4-894e-db77e160355e')
# make a random UUID
>>> uuid.uuid4()
UUID('16fd2706-8baf-433b-82eb-8c7fada847da')
# make a UUID using a SHA-1 hash of a namespace UUID and a name
>>> uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org')
UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d')
# make a UUID from a string of hex digits (braces and hyphens ignored)
>>> x = uuid.UUID('{00010203-0405-0607-0809-0a0b0c0d0e0f}')
# convert a UUID to a string of hex digits in standard form
>>> str(x)
'00010203-0405-0607-0809-0a0b0c0d0e0f'
# get the raw 16 bytes of the UUID
>>> x.bytes
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f'
# make a UUID from a 16-byte string
>>> uuid.UUID(bytes=x.bytes)
UUID('00010203-0405-0607-0809-0a0b0c0d0e0f')
==============================================================================
*py2stdlib-videoreader*
videoreader~
:platform: Mac
:synopsis: Read QuickTime movies frame by frame for further processing.
:deprecated:
videoreader (|py2stdlib-videoreader|) reads and decodes QuickTime movies and passes a stream of
images to your program. It also provides some support for audio tracks.
.. deprecated:: 2.6
==============================================================================
*py2stdlib-w*
W~
:platform: Mac
:synopsis: Widgets for the Mac, built on top of FrameWork.
:deprecated:
The W (|py2stdlib-w|) widgets are used extensively in the IDE.
.. deprecated:: 2.6
Obsolete
========
These modules are not normally available for import; additional work must be
done to make them available.
These extension modules written in C are not built by default. Under Unix, these
must be enabled by uncommenting the appropriate lines in Modules/Setup
in the build tree and either rebuilding Python if the modules are statically
linked, or building and installing the shared object if using dynamically-loaded
extensions.
.. (lib-old is empty as of Python 2.5)
Those which are written in Python will be installed into the directory
lib-old/ installed as part of the standard library. To use
these, the directory must be added to sys.path, possibly using
PYTHONPATH.
timing --- Measure time intervals to high resolution (use time.clock
instead). Removed in Python 3.x.
SGI-specific Extension modules
==============================
The following are SGI specific, and may be out of touch with the current version
of reality.
cl --- Interface to the SGI compression library.
sv --- Interface to the "simple video" board on SGI Indigo (obsolete hardware).
Removed in Python 3.x.
==============================================================================
*py2stdlib-warnings*
warnings~
:synopsis: Issue warning messages and control their disposition.
.. versionadded:: 2.1
Warning messages are typically issued in situations where it is useful to alert
the user of some condition in a program, where that condition (normally) doesn't
warrant raising an exception and terminating the program. For example, one
might want to issue a warning when a program uses an obsolete module.
Python programmers issue warnings by calling the warn function defined
in this module. (C programmers use PyErr_WarnEx; see the exception handling
section of the Python/C API reference for details.)
Warning messages are normally written to ``sys.stderr``, but their disposition
can be changed flexibly, from ignoring all warnings to turning them into
exceptions. The disposition of warnings can vary based on the warning category
(see below), the text of the warning message, and the source location where it
is issued. Repetitions of a particular warning for the same source location are
typically suppressed.
There are two stages in warning control: first, each time a warning is issued, a
determination is made whether a message should be issued or not; next, if a
message is to be issued, it is formatted and printed using a user-settable hook.
The determination whether to issue a warning message is controlled by the
warning filter, which is a sequence of matching rules and actions. Rules can be
added to the filter by calling filterwarnings and reset to its default
state by calling resetwarnings.
The printing of warning messages is done by calling showwarning, which
may be overridden; the default implementation of this function formats the
message by calling formatwarning, which is also available for use by
custom implementations.
Warning Categories
------------------
There are a number of built-in exceptions that represent warning categories.
This categorization is useful to be able to filter out groups of warnings. The
following warnings category classes are currently defined:
+----------------------------------+-----------------------------------------------+
| Class | Description |
+==================================+===============================================+
| Warning | This is the base class of all warning |
| | category classes. It is a subclass of |
| | Exception. |
+----------------------------------+-----------------------------------------------+
| UserWarning | The default category for warn. |
+----------------------------------+-----------------------------------------------+
| DeprecationWarning | Base category for warnings about deprecated |
| | features (ignored by default). |
+----------------------------------+-----------------------------------------------+
| SyntaxWarning | Base category for warnings about dubious |
| | syntactic features. |
+----------------------------------+-----------------------------------------------+
| RuntimeWarning | Base category for warnings about dubious |
| | runtime features. |
+----------------------------------+-----------------------------------------------+
| FutureWarning | Base category for warnings about constructs |
| | that will change semantically in the future. |
+----------------------------------+-----------------------------------------------+
| PendingDeprecationWarning | Base category for warnings about features |
| | that will be deprecated in the future |
| | (ignored by default). |
+----------------------------------+-----------------------------------------------+
| ImportWarning | Base category for warnings triggered during |
| | the process of importing a module (ignored by |
| | default). |
+----------------------------------+-----------------------------------------------+
| UnicodeWarning | Base category for warnings related to |
| | Unicode. |
+----------------------------------+-----------------------------------------------+
While these are technically built-in exceptions, they are documented here,
because conceptually they belong to the warnings mechanism.
User code can define additional warning categories by subclassing one of the
standard warning categories. A warning category must always be a subclass of
the Warning class.
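As a minimal sketch of a user-defined category, the class below subclasses
UserWarning. The name ConfigWarning and the message text are hypothetical and
are used only for illustration.

```python
import warnings

class ConfigWarning(UserWarning):
    """Hypothetical category for configuration problems."""

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")     # make sure nothing is suppressed
    warnings.warn("missing option, using default", ConfigWarning)

category = caught[0].category
```

Since ConfigWarning derives from UserWarning, any filter that matches
UserWarning or Warning also matches it.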
.. versionchanged:: 2.7
DeprecationWarning is ignored by default.
The Warnings Filter
-------------------
The warnings filter controls whether warnings are ignored, displayed, or turned
into errors (raising an exception).
Conceptually, the warnings filter maintains an ordered list of filter
specifications; any specific warning is matched against each filter
specification in the list in turn until a match is found; the match determines
the disposition of the warning. Each entry is a tuple of the form ({action},
{message}, {category}, {module}, {lineno}), where:
{action} is one of the following strings:
+---------------+----------------------------------------------+
| Value | Disposition |
+===============+==============================================+
| ``"error"`` | turn matching warnings into exceptions |
+---------------+----------------------------------------------+
| ``"ignore"`` | never print matching warnings |
+---------------+----------------------------------------------+
| ``"always"`` | always print matching warnings |
+---------------+----------------------------------------------+
| ``"default"`` | print the first occurrence of matching |
| | warnings for each location where the warning |
| | is issued |
+---------------+----------------------------------------------+
| ``"module"`` | print the first occurrence of matching |
| | warnings for each module where the warning |
| | is issued |
+---------------+----------------------------------------------+
| ``"once"`` | print only the first occurrence of matching |
| | warnings, regardless of location |
+---------------+----------------------------------------------+
{message} is a string containing a regular expression that the warning message
must match (the match is compiled to always be case-insensitive).
{category} is a class (a subclass of Warning) of which the warning
category must be a subclass in order to match.
{module} is a string containing a regular expression that the module name must
match (the match is compiled to be case-sensitive).
{lineno} is an integer that the line number where the warning occurred must
match, or ``0`` to match all line numbers.
Since the Warning class is derived from the built-in Exception
class, to turn a warning into an error we simply raise ``category(message)``.
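The matching rules above can be sketched as follows. The two filter entries and
the message texts are assumptions made for this example; the point is that the
first matching entry decides the action and that "error" raises the warning as
an exception.

```python
import warnings

with warnings.catch_warnings():
    # Inserted first, so it ends up behind the more specific rule below.
    warnings.simplefilter("ignore")
    # The message pattern is matched case-insensitively at the start of
    # the warning text.
    warnings.filterwarnings("error", message="disk", category=RuntimeWarning)

    warnings.warn("unrelated note", UserWarning)    # falls through to "ignore"
    try:
        warnings.warn("Disk almost full", RuntimeWarning)
        raised = False
    except RuntimeWarning as exc:
        raised = True
        text = str(exc)
```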
The warnings filter is initialized by -W options passed to the Python
interpreter command line. The interpreter saves the arguments for all
-W options without interpretation in ``sys.warnoptions``; the
warnings (|py2stdlib-warnings|) module parses these when it is first imported (invalid options
are ignored, after printing a message to ``sys.stderr``).
Temporarily Suppressing Warnings
--------------------------------
If you are using code that you know will raise a warning, such as a deprecated
function, but do not want to see the warning, then it is possible to suppress
the warning using the catch_warnings context manager:: >
import warnings
def fxn():
warnings.warn("deprecated", DeprecationWarning)
with warnings.catch_warnings():
warnings.simplefilter("ignore")
fxn()
<
While within the context manager all warnings will simply be ignored. This
allows you to use known-deprecated code without having to see the warning while
not suppressing the warning for other code that might not be aware of its use
of deprecated code. Note: this can only be guaranteed in a single-threaded
application. If two or more threads use the catch_warnings context
manager at the same time, the behavior is undefined.
Testing Warnings
----------------
To test warnings raised by code, use the catch_warnings context
manager. With it you can temporarily mutate the warnings filter to facilitate
your testing. For instance, do the following to capture all raised warnings to
check:: >
import warnings
def fxn():
warnings.warn("deprecated", DeprecationWarning)
with warnings.catch_warnings(record=True) as w:
# Cause all warnings to always be triggered.
warnings.simplefilter("always")
# Trigger a warning.
fxn()
# Verify some things
assert len(w) == 1
assert issubclass(w[-1].category, DeprecationWarning)
assert "deprecated" in str(w[-1].message)
<
One can also cause all warnings to be exceptions by using ``error`` instead of
``always``. One thing to be aware of is that if a warning has already been
raised because of a ``once``/``default`` rule, then no matter what filters are
set the warning will not be seen again unless the warnings registry related to
the warning has been cleared.
Once the context manager exits, the warnings filter is restored to its state
when the context was entered. This prevents tests from changing the warnings
filter in unexpected ways between tests and leading to indeterminate test
results. The showwarning function in the module is also restored to
its original value. Note: this can only be guaranteed in a single-threaded
application. If two or more threads use the catch_warnings context
manager at the same time, the behavior is undefined.
When testing multiple operations that raise the same kind of warning, it
is important to test them in a manner that confirms each operation is raising
a new warning (e.g. set warnings to be raised as exceptions and check the
operations raise exceptions, check that the length of the warning list
continues to increase after each operation, or else delete the previous
entries from the warnings list before each new operation).
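One of the approaches mentioned above, checking that the recorded list keeps
growing, might look like this. The functions old_api and older_api are
hypothetical stand-ins for the operations under test.

```python
import warnings

def old_api():
    warnings.warn("old_api is deprecated", DeprecationWarning)

def older_api():
    warnings.warn("older_api is deprecated", DeprecationWarning)

with warnings.catch_warnings(record=True) as w:
    warnings.simplefilter("always")
    old_api()
    count_after_first = len(w)      # the first operation added one entry
    older_api()
    count_after_second = len(w)     # the second operation added another
```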
Updating Code For New Versions of Python
----------------------------------------
Warnings that are only of interest to the developer are ignored by default. As
such you should make sure to test your code with typically ignored warnings
made visible. You can do this from the command-line by passing -Wd
to the interpreter (this is shorthand for -W default). This enables
default handling for all warnings, including those that are ignored by default.
To change what action is taken for encountered warnings you simply change what
argument is passed to -W, e.g. -W error. See the
-W flag for more details on what is possible.
To programmatically do the same as -Wd, use:: >
warnings.simplefilter('default')
<
Make sure to execute this code as soon as possible. This prevents the
registering of what warnings have been raised from unexpectedly influencing how
future warnings are treated.
Having certain warnings ignored by default is done to prevent a user from
seeing warnings that are only of interest to the developer. As you do not
necessarily have control over what interpreter a user uses to run their code,
it is possible that a new version of Python will be released between your
release cycles. The new interpreter release could trigger new warnings in your
code that were not there in an older interpreter, e.g.
DeprecationWarning for a module that you are using. While you as a
developer want to be notified that your code is using a deprecated module, to a
user this information is essentially noise and provides no benefit to them.
Available Functions
-------------------
warn(message[, category[, stacklevel]])~
Issue a warning, or maybe ignore it or raise an exception. The {category}
argument, if given, must be a warning category class (see above); it defaults to
UserWarning. Alternatively {message} can be a Warning instance,
in which case {category} will be ignored and ``message.__class__`` will be used.
In this case the message text will be ``str(message)``. This function raises an
exception if the particular warning issued is changed into an error by the
warnings filter (see above). The {stacklevel} argument can be used by wrapper
functions written in Python, like this:: >
def deprecation(message):
warnings.warn(message, DeprecationWarning, stacklevel=2)
<
This makes the warning refer to deprecation's caller, rather than to the
source of deprecation itself (since the latter would defeat the purpose
of the warning message).
warn_explicit(message, category, filename, lineno[, module[, registry[, module_globals]]])~
This is a low-level interface to the functionality of warn, passing in
explicitly the message, category, filename and line number, and optionally the
module name and the registry (which should be the ``__warningregistry__``
dictionary of the module). The module name defaults to the filename with
``.py`` stripped; if no registry is passed, the warning is never suppressed.
{message} must be a string and {category} a subclass of Warning or
{message} may be a Warning instance, in which case {category} will be
ignored.
{module_globals}, if supplied, should be the global namespace in use by the code
for which the warning is issued. (This argument is used to support displaying
source for modules found in zipfiles or other non-filesystem import
sources).
.. versionchanged:: 2.5
Added the {module_globals} parameter.
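The effect of the registry argument can be illustrated with a small sketch. The
file name, line number and module name are made up for the example; with the
"default" action, a warning that is recorded in a registry is reported only
once per location.

```python
import warnings

registry = {}   # plays the role of a module's __warningregistry__

with warnings.catch_warnings(record=True) as w:
    warnings.simplefilter("default")
    for _ in range(3):
        warnings.warn_explicit("legacy format", UserWarning,
                               filename="data.cfg", lineno=12,
                               module="config", registry=registry)
    reported_with_registry = len(w)

    for _ in range(3):
        # No registry: nothing remembers that the warning was already shown.
        warnings.warn_explicit("legacy format", UserWarning,
                               filename="data.cfg", lineno=12,
                               module="config")
    reported_without_registry = len(w) - reported_with_registry
```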
warnpy3k(message[, category[, stacklevel]])~
Issue a warning related to Python 3.x deprecation. Warnings are only shown
when Python is started with the -3 option. Like warn, {message} must
be a string and {category} a subclass of Warning. warnpy3k
uses DeprecationWarning as the default warning class.
.. versionadded:: 2.6
showwarning(message, category, filename, lineno[, file[, line]])~
Write a warning to a file. The default implementation calls
``formatwarning(message, category, filename, lineno, line)`` and writes the
resulting string to {file}, which defaults to ``sys.stderr``. You may replace
this function with an alternative implementation by assigning to
``warnings.showwarning``.
{line} is a line of source code to be included in the warning
message; if {line} is not supplied, showwarning will
try to read the line specified by {filename} and {lineno}.
.. versionchanged:: 2.7
The {line} argument is required to be supported.
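A replacement for showwarning could collect formatted messages instead of
writing them to a file. This is only a sketch; collect_warning and the
collected list are hypothetical names, and the assignment is done inside
catch_warnings so that it does not outlive the example.

```python
import warnings

collected = []

def collect_warning(message, category, filename, lineno, file=None, line=None):
    # Reuse the standard formatting, but keep the text instead of printing it.
    collected.append(
        warnings.formatwarning(message, category, filename, lineno, line))

with warnings.catch_warnings():
    warnings.simplefilter("always")
    warnings.showwarning = collect_warning
    warnings.warn("cache is stale", UserWarning)
```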
formatwarning(message, category, filename, lineno[, line])~
Format a warning the standard way. This returns a string which may contain
embedded newlines and ends in a newline. {line} is a line of source code to
be included in the warning message; if {line} is not supplied,
formatwarning will try to read the line specified by {filename} and
{lineno}.
.. versionchanged:: 2.6
Added the {line} argument.
filterwarnings(action[, message[, category[, module[, lineno[, append]]]]])~
Insert an entry into the list of warnings filter specifications.
The entry is inserted at the front by default; if
{append} is true, it is inserted at the end. This checks the types of the
arguments, compiles the {message} and {module} regular expressions, and
inserts them as a tuple in the list of warnings filters. Entries closer to
the front of the list override entries later in the list, if both match a
particular warning. Omitted arguments default to a value that matches
everything.
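The difference between inserting at the front and appending can be shown with a
short sketch. It resets the filter list first so that only the rules added here
are in effect; the message texts are arbitrary.

```python
import warnings

with warnings.catch_warnings(record=True) as w:
    warnings.resetwarnings()        # start from an empty filter list
    warnings.filterwarnings("ignore", category=UserWarning)

    # Appended rules come last, so the "ignore" rule above still wins.
    warnings.filterwarnings("always", category=UserWarning, append=True)
    warnings.warn("first", UserWarning)
    seen_with_append = len(w)

    # Without append the rule goes to the front and overrides "ignore".
    warnings.filterwarnings("always", category=UserWarning)
    warnings.warn("second", UserWarning)
    seen_after_front_insert = len(w)
```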
simplefilter(action[, category[, lineno[, append]]])~
Insert a simple entry into the list of warnings filter specifications.
The meaning of the function parameters is as for
filterwarnings, but regular expressions are not needed as the filter
inserted always matches any message in any module as long as the category and
line number match.
resetwarnings()~
Reset the warnings filter. This discards the effect of all previous calls to
filterwarnings, including that of the -W command line options
and calls to simplefilter.
Available Context Managers
--------------------------
catch_warnings([*, record=False, module=None])~
A context manager that copies and, upon exit, restores the warnings filter
and the showwarning function.
If the {record} argument is False (the default) the context manager
returns None on entry. If {record} is True, a list is
returned that is progressively populated with objects as seen by a custom
showwarning function (which also suppresses output to ``sys.stderr``).
Each object in the list has attributes with the same names as the arguments to
showwarning.
The {module} argument takes a module that will be used instead of the
module returned when you import warnings (|py2stdlib-warnings|) whose filter will be
protected. This argument exists primarily for testing the warnings (|py2stdlib-warnings|)
module itself.
.. note:: >
The catch_warnings manager works by replacing and
then later restoring the module's
showwarning function and internal list of filter
specifications. This means the context manager is modifying
global state and therefore is not thread-safe.
<
.. note::
In Python 3.0, the arguments to the constructor for
catch_warnings are keyword-only arguments.
.. versionadded:: 2.6
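The save-and-restore behaviour can be observed by comparing the module's filter
list before, inside and after the context manager. A minimal sketch, for a
single-threaded program; the message pattern is an arbitrary marker chosen for
the example.

```python
import warnings

before = list(warnings.filters)

with warnings.catch_warnings():
    warnings.filterwarnings("error", message="hypothetical marker text")
    inside = list(warnings.filters)     # one extra entry at the front

after = list(warnings.filters)          # back to the original entries
```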
==============================================================================
*py2stdlib-wave*
wave~
:synopsis: Provide an interface to the WAV sound format.
The wave (|py2stdlib-wave|) module provides a convenient interface to the WAV sound format.
It does not support compression/decompression, but it does support mono/stereo.
The wave (|py2stdlib-wave|) module defines the following function and exception:
open(file[, mode])~
If {file} is a string, open the file by that name, otherwise treat it as a seekable
file-like object. {mode} can be any of
``'r'``, ``'rb'``
Read only mode.
``'w'``, ``'wb'``
Write only mode.
Note that it does not allow read/write WAV files.
A {mode} of ``'r'`` or ``'rb'`` returns a Wave_read object, while a
{mode} of ``'w'`` or ``'wb'`` returns a Wave_write object. If {mode}
is omitted and a file-like object is passed as {file}, ``file.mode`` is used as
the default value for {mode} (the ``'b'`` flag is still added if necessary).
openfp(file, mode)~
A synonym for open, maintained for backwards compatibility.
Error~
An error raised when something is impossible because it violates the WAV
specification or hits an implementation deficiency.
Wave_read Objects
-----------------
Wave_read objects, as returned by open, have the following methods:
Wave_read.close()~
Close the stream, and make the instance unusable. This is called automatically
on object collection.
Wave_read.getnchannels()~
Returns number of audio channels (``1`` for mono, ``2`` for stereo).
Wave_read.getsampwidth()~
Returns sample width in bytes.
Wave_read.getframerate()~
Returns sampling frequency.
Wave_read.getnframes()~
Returns number of audio frames.
Wave_read.getcomptype()~
Returns compression type (``'NONE'`` is the only supported type).
Wave_read.getcompname()~
Human-readable version of getcomptype. Usually ``'not compressed'``
parallels ``'NONE'``.
Wave_read.getparams()~
Returns a tuple ``(nchannels, sampwidth, framerate, nframes, comptype,
compname)``, equivalent to output of the get* methods.
Wave_read.readframes(n)~
Reads and returns at most {n} frames of audio, as a string of bytes.
Wave_read.rewind()~
Rewind the file pointer to the beginning of the audio stream.
The following two methods are defined for compatibility with the aifc (|py2stdlib-aifc|)
module, and don't do anything interesting.
Wave_read.getmarkers()~
Returns ``None``.
Wave_read.getmark(id)~
Raise an error.
The following two methods define a term "position" which is compatible between
them, and is otherwise implementation dependent.
Wave_read.setpos(pos)~
Set the file pointer to the specified position.
Wave_read.tell()~
Return current file pointer position.
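The reading methods can be tried without a file on disk by first writing a tiny
WAV stream into an in-memory buffer. This is a sketch: the sample values, the
8000 Hz rate and the use of io.BytesIO are choices made for the example.

```python
import io
import struct
import wave

# Build a four-frame, mono, 16-bit WAV stream in memory.
frames = struct.pack('<4h', 0, 1000, -1000, 32767)
buf = io.BytesIO()
w = wave.open(buf, 'wb')
w.setnchannels(1)
w.setsampwidth(2)
w.setframerate(8000)
w.writeframes(frames)
w.close()

buf.seek(0)
r = wave.open(buf, 'rb')
params = tuple(r.getparams())
start = r.tell()                 # a position is only meaningful to setpos
first_two = r.readframes(2)
r.setpos(start)                  # go back to where we started
first_two_again = r.readframes(2)
r.rewind()
everything = r.readframes(r.getnframes())
r.close()
```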
Wave_write Objects
------------------
Wave_write objects, as returned by open, have the following methods:
Wave_write.close()~
Make sure {nframes} is correct, and close the file. This method is called upon
deletion.
Wave_write.setnchannels(n)~
Set the number of channels.
Wave_write.setsampwidth(n)~
Set the sample width to {n} bytes.
Wave_write.setframerate(n)~
Set the frame rate to {n}.
Wave_write.setnframes(n)~
Set the number of frames to {n}. This will be changed later if more frames are
written.
Wave_write.setcomptype(type, name)~
Set the compression type and description. At the moment, only compression type
``NONE`` is supported, meaning no compression.
Wave_write.setparams(tuple)~
The {tuple} should be ``(nchannels, sampwidth, framerate, nframes, comptype,
compname)``, with values valid for the set* methods. Sets all
parameters.
Wave_write.tell()~
Return current position in the file, with the same disclaimer as for the
Wave_read.tell and Wave_read.setpos methods.
Wave_write.writeframesraw(data)~
Write audio frames, without correcting {nframes}.
Wave_write.writeframes(data)~
Write audio frames and make sure {nframes} is correct.
Note that it is invalid to set any parameters after calling writeframes
or writeframesraw, and any attempt to do so will raise
wave.Error.
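A short sketch of the writing side, including the restriction on changing
parameters once frames have been written. The buffer, the parameter values and
the sample data are assumptions made for the example.

```python
import io
import struct
import wave

buf = io.BytesIO()
w = wave.open(buf, 'wb')
# (nchannels, sampwidth, framerate, nframes, comptype, compname)
w.setparams((1, 2, 8000, 0, 'NONE', 'not compressed'))
w.writeframes(struct.pack('<3h', 0, 1000, -1000))

try:
    w.setframerate(44100)        # too late: frames were already written
    late_change_allowed = True
except wave.Error:
    late_change_allowed = False

frames_written = w.getnframes()
w.close()                        # makes sure the header is correct
data = buf.getvalue()
```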
==============================================================================
*py2stdlib-weakref*
weakref~
:synopsis: Support for weak references and weak dictionaries.
.. versionadded:: 2.1
The weakref (|py2stdlib-weakref|) module allows the Python programmer to create weak
references to objects.
In the following, the term referent means the object which is referred to
by a weak reference.
A weak reference to an object is not enough to keep the object alive: when the
only remaining references to a referent are weak references,
garbage collection is free to destroy the referent and reuse its memory
for something else. A primary use for weak references is to implement caches or
mappings holding large objects, where it's desired that a large object not be
kept alive solely because it appears in a cache or mapping.
For example, if you have a number of large binary image objects, you may wish to
associate a name with each. If you used a Python dictionary to map names to
images, or images to names, the image objects would remain alive just because
they appeared as values or keys in the dictionaries. The
WeakKeyDictionary and WeakValueDictionary classes supplied by
the weakref (|py2stdlib-weakref|) module are an alternative, using weak references to construct
mappings that don't keep objects alive solely because they appear in the mapping
objects. If, for example, an image object is a value in a
WeakValueDictionary, then when the last remaining references to that
image object are the weak references held by weak mappings, garbage collection
can reclaim the object, and its corresponding entries in weak mappings are
simply deleted.
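The image cache described above might be sketched like this. The Image class is
a hypothetical stand-in for a large object; the call to gc.collect() is not
needed with CPython's reference counting but does no harm.

```python
import gc
import weakref

class Image(object):
    """Stand-in for a large binary image (hypothetical)."""
    def __init__(self, name):
        self.name = name

cache = weakref.WeakValueDictionary()
img = Image("logo")
cache["logo"] = img
present_while_referenced = "logo" in cache

del img          # drop the last strong reference
gc.collect()
present_after_release = "logo" in cache
```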
WeakKeyDictionary and WeakValueDictionary use weak references
in their implementation, setting up callback functions on the weak references
that notify the weak dictionaries when a key or value has been reclaimed by
garbage collection. Most programs should find that using one of these weak
dictionary types is all they need -- it's not usually necessary to create your
own weak references directly. The low-level machinery used by the weak
dictionary implementations is exposed by the weakref (|py2stdlib-weakref|) module for the
benefit of advanced uses.
.. note::
Weak references to an object are cleared before the object's __del__
is called, to ensure that the weak reference callback (if any) finds the
object still alive.
Not all objects can be weakly referenced; those objects which can include class
instances, functions written in Python (but not in C), methods (both bound and
unbound), sets, frozensets, file objects, generators, type objects,
DBcursor objects from the bsddb (|py2stdlib-bsddb|) module, sockets, arrays, deques,
regular expression pattern objects, and code objects.
.. versionchanged:: 2.4
Added support for files, sockets, arrays, and patterns.
.. versionchanged:: 2.7
Added support for thread.lock, threading.Lock, and code objects.
Several built-in types such as list and dict do not directly
support weak references but can add support through subclassing:: >
class Dict(dict):
pass
obj = Dict(red=1, green=2, blue=3) # this object is weak referenceable
<
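The difference between the built-in type and a subclass can be checked directly.
A minimal sketch, as observed with CPython:

```python
import weakref

class Dict(dict):
    pass

try:
    weakref.ref({})              # a plain dict cannot be weakly referenced
    plain_dict_ok = True
except TypeError:
    plain_dict_ok = False

obj = Dict(red=1, green=2, blue=3)
r = weakref.ref(obj)             # an instance of the subclass can
subclass_ok = r() is obj
```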
.. impl-detail::
Other built-in types such as tuple and long do not support
weak references even when subclassed.
Extension types can easily be made to support weak references; see the
"Weak Reference Support" section of the Extending and Embedding documentation.
ref(object[, callback])~
Return a weak reference to {object}. The original object can be retrieved by
calling the reference object if the referent is still alive; if the referent is
no longer alive, calling the reference object will cause None to be
returned. If {callback} is provided and not None, and the returned
weakref object is still alive, the callback will be called when the object is
about to be finalized; the weak reference object will be passed as the only
parameter to the callback; the referent will no longer be available.
It is allowable for many weak references to be constructed for the same object.
Callbacks registered for each weak reference will be called from the most
recently registered callback to the oldest registered callback.
Exceptions raised by the callback will be noted on the standard error output,
but cannot be propagated; they are handled in exactly the same way as exceptions
raised from an object's __del__ method.
Weak references are hashable if the {object} is hashable. They will maintain
their hash value even after the {object} was deleted. If hash is called
the first time only after the {object} was deleted, the call will raise
TypeError.
Weak references support tests for equality, but not ordering. If the referents
are still alive, two references have the same equality relationship as their
referents (regardless of the {callback}). If either referent has been deleted,
the references are equal only if the reference objects are the same object.
.. versionchanged:: 2.4
This is now a subclassable type rather than a factory function; it derives from
object.
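A small sketch of creating a weak reference with a callback. Node, on_finalize
and the events list are hypothetical names used only for this example.

```python
import gc
import weakref

class Node(object):
    pass

events = []

def on_finalize(reference):
    # Receives the weak reference object; the referent is already gone.
    events.append(reference)

obj = Node()
r = weakref.ref(obj, on_finalize)
alive = r() is obj               # calling the reference returns the referent

del obj
gc.collect()
dead = r() is None               # the referent has been reclaimed
```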
proxy(object[, callback])~
Return a proxy to {object} which uses a weak reference. This supports use of
the proxy in most contexts instead of requiring the explicit dereferencing used
with weak reference objects. The returned object will have a type of either
``ProxyType`` or ``CallableProxyType``, depending on whether {object} is
callable. Proxy objects are not hashable regardless of the referent; this
avoids a number of problems related to their fundamentally mutable nature, and
prevents their use as dictionary keys. {callback} is the same as the parameter
of the same name to the ref function.
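Unlike a reference object, a proxy is used as if it were the referent itself.
The sketch below uses a hypothetical Counter class and greet function; once the
referent is gone, using the proxy raises ReferenceError.

```python
import gc
import weakref

class Counter(object):
    def __init__(self):
        self.value = 0
    def bump(self):
        self.value += 1
        return self.value

def greet():
    return "hi"

c = Counter()
p = weakref.proxy(c)
p.bump()                              # forwarded to the referent
value_via_proxy = p.value
is_plain_proxy = type(p) is weakref.ProxyType

cp = weakref.proxy(greet)             # the referent is callable
is_callable_proxy = type(cp) is weakref.CallableProxyType
greeting = cp()

del c
gc.collect()
try:
    p.value
    stale = False
except ReferenceError:
    stale = True
```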
getweakrefcount(object)~
Return the number of weak references and proxies which refer to {object}.
getweakrefs(object)~
Return a list of all weak reference and proxy objects which refer to {object}.
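Both functions can be used for introspection, for example while debugging. In
this sketch the second reference is given its own callback so that it is a
distinct reference object; Thing is a hypothetical class.

```python
import weakref

class Thing(object):
    pass

t = Thing()
r1 = weakref.ref(t)
r2 = weakref.ref(t, lambda reference: None)   # a second, separate reference

count = weakref.getweakrefcount(t)
refs = weakref.getweakrefs(t)
found_r1 = any(x is r1 for x in refs)
found_r2 = any(x is r2 for x in refs)
```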
WeakKeyDictionary([dict])~
Mapping class that references keys weakly. Entries in the dictionary will be
discarded when there is no longer a strong reference to the key. This can be
used to associate additional data with an object owned by other parts of an
application without adding attributes to those objects. This can be especially
useful with objects that override attribute accesses.
.. note:: >
Caution: Because a WeakKeyDictionary is built on top of a Python
dictionary, it must not change size when iterating over it. This can be
difficult to ensure for a WeakKeyDictionary because actions
performed by the program during iteration may cause items in the
dictionary to vanish "by magic" (as a side effect of garbage collection).
<
WeakKeyDictionary objects have the following additional methods. These
expose the internal references directly. The references are not guaranteed to
be "live" at the time they are used, so the result of calling the references
needs to be checked before being used. This can be used to avoid creating
references that will cause the garbage collector to keep the keys around longer
than needed.
WeakKeyDictionary.iterkeyrefs()~
Return an iterator that yields the weak references to the keys.
.. versionadded:: 2.5
WeakKeyDictionary.keyrefs()~
Return a list of weak references to the keys.
.. versionadded:: 2.5
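As a minimal sketch of the points above, the following associates extra data with objects through a WeakKeyDictionary and reads the keys through keyrefs, checking each reference before use. ``Widget`` and ``live_keys`` are hypothetical names, and the explicit ``gc.collect()`` call is only there because prompt collection is an implementation detail:

```python
import gc
import weakref


class Widget(object):
    """Hypothetical class owned by some other part of the application."""


notes = weakref.WeakKeyDictionary()
a = Widget()
b = Widget()
notes[a] = "first"
notes[b] = "second"


def live_keys(mapping):
    """Return strong references to the keys that are still alive."""
    keys = []
    for ref in mapping.keyrefs():
        key = ref()              # may be None if the key has been collected
        if key is not None:
            keys.append(key)
    return keys


count_before = len(live_keys(notes))

# Dropping the last strong reference to a key discards its entry.
del b
gc.collect()
count_after = len(notes)
note_for_a = notes[a]
```

Collecting the live keys into a list first gives the caller a stable snapshot, which sidesteps the size-change problem described in the note above.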
WeakValueDictionary([dict])~
Mapping class that references values weakly. Entries in the dictionary will be
discarded when no strong reference to the value exists any more.
.. note:: >
Caution: Because a WeakValueDictionary is built on top of a Python
dictionary, it must not change size when iterating over it. This can be
difficult to ensure for a WeakValueDictionary because actions performed
by the program during iteration may cause items in the dictionary to vanish "by
magic" (as a side effect of garbage collection).
<
WeakValueDictionary objects have the following additional methods.
These methods have the same issues as the iterkeyrefs and keyrefs
methods of WeakKeyDictionary objects.
WeakValueDictionary.itervaluerefs()~
Return an iterator that yields the weak references to the values.
.. versionadded:: 2.5
WeakValueDictionary.valuerefs()~
Return a list of weak references to the values.
.. versionadded:: 2.5
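One common use of a WeakValueDictionary is a cache that does not keep its entries alive. The sketch below assumes a hypothetical ``Image`` class and ``load`` function, and also shows that a reference obtained from valuerefs may be dead by the time it is called:

```python
import gc
import weakref


class Image(object):
    """Hypothetical expensive-to-create object."""

    def __init__(self, name):
        self.name = name


_cache = weakref.WeakValueDictionary()


def load(name):
    """Return the cached Image for name, creating it if necessary."""
    image = _cache.get(name)
    if image is None:
        image = Image(name)
        _cache[name] = image
    return image


first = load("logo")
same_object = load("logo") is first   # served from the cache

refs = _cache.valuerefs()             # weak references, not the values

# Once the last strong reference goes away, the entry disappears and the
# reference taken earlier no longer yields an object.
del first
gc.collect()
still_cached = "logo" in _cache
ref_is_dead = refs[0]() is None
```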
WeakSet([elements])~
Set class that keeps weak references to its elements. An element will be
discarded when no strong reference to it exists any more.
.. versionadded:: 2.7
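A brief sketch of WeakSet behaviour, using a hypothetical ``Listener`` class; the ``gc.collect()`` call is included because the timing of collection depends on the interpreter:

```python
import gc
import weakref


class Listener(object):
    """Hypothetical object registered for notifications."""


listeners = weakref.WeakSet()
a = Listener()
b = Listener()
listeners.add(a)
listeners.add(b)
size_before = len(listeners)

# The set does not keep b alive; once it is collected it drops out.
del b
gc.collect()
size_after = len(listeners)
a_registered = a in listeners
```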
ReferenceType~
The type object for weak reference objects.
ProxyType~
The type object for proxies of objects which are not callable.
CallableProxyType~
The type object for proxies of callable objects.
ProxyTypes~
Sequence containing all the type objects for proxies. This can make it simpler
to test if an object is a proxy without being dependent on naming both proxy
types.
ReferenceError~
Exception raised when a proxy object is used but the underlying object has been
collected. This is the same as the standard ReferenceError exception.
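The proxy-related names above can be illustrated with a small sketch. ``Greeter`` is a hypothetical class, and the built-in ReferenceError is caught, which the entry above describes as the same exception:

```python
import gc
import weakref


class Greeter(object):
    """Hypothetical class used as the referent."""

    def greet(self):
        return "hello"


g = Greeter()
p = weakref.proxy(g)

# The proxy can be used like the referent, with no explicit dereference.
greeting = p.greet()

# ProxyTypes covers both ProxyType and CallableProxyType.
is_proxy = isinstance(p, weakref.ProxyTypes)

# After the referent is gone, using the proxy raises ReferenceError.
del g
gc.collect()
stale = False
try:
    p.greet()
except ReferenceError:
    stale = True
```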
.. seealso::
PEP 205 - Weak References
The proposal and rationale for this feature, including links to earlier
implementations and information about similar features in other languages.
Weak Reference Objects
----------------------
Weak reference objects have no attributes or methods, but do allow the referent
to be obtained, if it still exists, by calling it:
>>> import weakref
>>> class Object:
...     pass
...
>>> o = Object()
>>> r = weakref.ref(o)
>>> o2 = r()
>>> o is o2
True
If the referent no longer exists, calling the reference object returns
None:
>>> del o, o2
>>> print r()
None
Testing that a weak reference object is still live should be done using the
expression ``ref() is not None``. Normally, application code that needs to use
a reference object should follow this pattern:: >
# r is a weak reference object
o = r()
if o is None:
    # referent has been garbage collected
    print "Object has been deallocated; can't frobnicate."
else:
    print "Object is still live!"
    o.do_something_useful()
<
Using a separate test for "liveness" creates race conditions in threaded
applications; another thread can cause a weak reference to become invalidated
before the weak reference is called; the idiom shown above is safe in threaded
applications as well as single-threaded applications.
Specialized versions of ref objects can be created through subclassing.
This is used in the implementation of the WeakValueDictionary to reduce
the memory overhead for each entry in the mapping. This may be most useful to
associate additional information with a reference, but could also be used to
insert additional processing on calls to retrieve the referent.
This example shows how a subclass of ref can be used to store
additional information about an object and affect the value that's returned when
the referent is accessed:: >
import weakref
class ExtendedRef(weakref.ref):
    def __init__(self, ob, callback=None, **annotations):
        super(ExtendedRef, self).__init__(ob, callback)
        self.__counter = 0
        for k, v in annotations.iteritems():
            setattr(self, k, v)
    def __call__(self):
        """Return a pair containing the referent and the number of
        times the reference has been called.
        """
        ob = super(ExtendedRef, self).__call__()
        if ob is not None:
            self.__counter += 1
            ob = (ob, self.__counter)
        return ob
<
Example
This simple example shows how an application can use object IDs to retrieve
objects that it has seen before. The IDs of the objects can then be used in
other data structures without forcing the objects to remain alive, but the
objects can still be retrieved by ID if they do.
.. Example contributed by Tim Peters.
:: >
import weakref
_id2obj_dict = weakref.WeakValueDictionary()
def remember(obj):
    oid = id(obj)
    _id2obj_dict[oid] = obj
    return oid
def id2obj(oid):
    return _id2obj_dict[oid]
==============================================================================
*py2stdlib-webbrowser*
webbrowser~
:synopsis: Easy-to-use controller for Web browsers.
The webbrowser (|py2stdlib-webbrowser|) module provides a high-level interface to allow displaying
Web-based documents to users. Under most circumstances, simply calling the
open function from this module will do the right thing.
Under Unix, graphical browsers are preferred under X11, but text-mode browsers
will be used if graphical browsers are not available or an X11 display isn't
available. If text-mode browsers are used, the calling process will block until
the user exits the browser.
If the environment variable BROWSER exists, it is interpreted to
override the platform default list of browsers, as an os.pathsep-separated
list of browsers to try in order. When the value of a list part contains the
string ``%s``, then it is interpreted as a literal browser command line to be
used with the argument URL substituted for ``%s``; if the part does not contain
``%s``, it is simply interpreted as the name of the browser to launch. [1]_
For non-Unix platforms, or when a remote browser is available on Unix, the
controlling process will not wait for the user to finish with the browser, but
allow the remote browser to maintain its own windows on the display. If remote
browsers are not available on Unix, the controlling process will launch a new
browser and wait.
The script webbrowser (|py2stdlib-webbrowser|) can be used as a command-line interface for the
module. It accepts a URL as the argument. It accepts the following optional
parameters: -n opens the URL in a new browser window, if possible;
-t opens the URL in a new browser page ("tab"). The options are,
naturally, mutually exclusive.
The following exception is defined:
Error~
Exception raised when a browser control error occurs.
The following functions are defined:
open(url[, new=0[, autoraise=True]])~
Display {url} using the default browser. If {new} is 0, the {url} is opened
in the same browser window if possible. If {new} is 1, a new browser window
is opened if possible. If {new} is 2, a new browser page ("tab") is opened
if possible. If {autoraise} is ``True``, the window is raised if possible
(note that under many window managers this will occur regardless of the
setting of this variable).
Note that on some platforms, trying to open a filename using this function
may work and start the operating system's associated program. However, this
is neither supported nor portable.
.. versionchanged:: 2.5
{new} can now be 2.
open_new(url)~
Open {url} in a new window of the default browser, if possible; otherwise, open
{url} in the only browser window.
open_new_tab(url)~
Open {url} in a new page ("tab") of the default browser, if possible; otherwise
equivalent to open_new.
.. versionadded:: 2.5
get([name])~
Return a controller object for the browser type {name}. If {name} is empty,
return a controller for a default browser appropriate to the caller's
environment.
register(name, constructor[, instance])~
Register the browser type {name}. Once a browser type is registered, the
get function can return a controller for that browser type. If
{instance} is not provided, or is ``None``, {constructor} will be called without
parameters to create an instance when needed. If {instance} is provided,
{constructor} will never be called, and may be ``None``.
This entry point is only useful if you plan to either set the BROWSER
variable or call get with a nonempty argument matching the name of a
handler you declare.
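The registration mechanism can be sketched with a stand-in controller that records URLs instead of launching a browser. ``RecordingBrowser`` and the type name ``'recorder'`` are hypothetical; the sketch only assumes that a controller offers the three methods listed under Browser Controller Objects:

```python
import webbrowser


class RecordingBrowser(object):
    """Hypothetical controller that records requests instead of opening them."""

    def __init__(self):
        self.opened = []

    def open(self, url, new=0, autoraise=True):
        self.opened.append((url, new))
        return True

    def open_new(self, url):
        return self.open(url, 1)

    def open_new_tab(self, url):
        return self.open(url, 2)


recorder = RecordingBrowser()

# An instance is supplied, so the constructor argument may be None.
webbrowser.register('recorder', None, recorder)

# get() with a nonempty name returns the registered instance.
controller = webbrowser.get('recorder')
controller.open_new_tab('http://www.python.org/doc/')
```

A stand-in like this can be handy in tests, since nothing is displayed to the user.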
A number of browser types are predefined. This table gives the type names that
may be passed to the get function and the corresponding instantiations
for the controller classes, all defined in this module.
+-----------------------+-----------------------------------------+-------+
| Type Name | Class Name | Notes |
+=======================+=========================================+=======+
| ``'mozilla'`` | Mozilla('mozilla') | |
+-----------------------+-----------------------------------------+-------+
| ``'firefox'`` | Mozilla('mozilla') | |
+-----------------------+-----------------------------------------+-------+
| ``'netscape'`` | Mozilla('netscape') | |
+-----------------------+-----------------------------------------+-------+
| ``'galeon'`` | Galeon('galeon') | |
+-----------------------+-----------------------------------------+-------+
| ``'epiphany'`` | Galeon('epiphany') | |
+-----------------------+-----------------------------------------+-------+
| ``'skipstone'`` | BackgroundBrowser('skipstone') | |
+-----------------------+-----------------------------------------+-------+
| ``'kfmclient'`` | Konqueror() | \(1) |
+-----------------------+-----------------------------------------+-------+
| ``'konqueror'`` | Konqueror() | \(1) |
+-----------------------+-----------------------------------------+-------+
| ``'kfm'`` | Konqueror() | \(1) |
+-----------------------+-----------------------------------------+-------+
| ``'mosaic'`` | BackgroundBrowser('mosaic') | |
+-----------------------+-----------------------------------------+-------+
| ``'opera'`` | Opera() | |
+-----------------------+-----------------------------------------+-------+
| ``'grail'`` | Grail() | |
+-----------------------+-----------------------------------------+-------+
| ``'links'`` | GenericBrowser('links') | |
+-----------------------+-----------------------------------------+-------+
| ``'elinks'`` | Elinks('elinks') | |
+-----------------------+-----------------------------------------+-------+
| ``'lynx'`` | GenericBrowser('lynx') | |
+-----------------------+-----------------------------------------+-------+
| ``'w3m'`` | GenericBrowser('w3m') | |
+-----------------------+-----------------------------------------+-------+
| ``'windows-default'`` | WindowsDefault | \(2) |
+-----------------------+-----------------------------------------+-------+
| ``'internet-config'`` | InternetConfig | \(3) |
+-----------------------+-----------------------------------------+-------+
| ``'macosx'`` | MacOSX('default') | \(4) |
+-----------------------+-----------------------------------------+-------+
Notes:
(1)
"Konqueror" is the file manager for the KDE desktop environment for Unix, and
only makes sense to use if KDE is running. Some way of reliably detecting KDE
would be nice; the KDEDIR variable is not sufficient. Note also that
the name "kfm" is used even when using the konqueror command with KDE
2 --- the implementation selects the best strategy for running Konqueror.
(2)
Only on Windows platforms.
(3)
Only on Mac OS platforms; requires the standard MacPython ic (|py2stdlib-ic|) module.
(4)
Only on the Mac OS X platform.
Here are some simple examples:: >
import webbrowser
url = 'http://www.python.org/'
# Open URL in a new tab, if a browser window is already open.
webbrowser.open_new_tab(url + 'doc/')
# Open URL in new window, raising the window if possible.
webbrowser.open_new(url)
<
Browser Controller Objects
Browser controllers provide these methods which parallel three of the
module-level convenience functions:
controller.open(url[, new=0[, autoraise=True]])~
Display {url} using the browser handled by this controller. If {new} is 1, a new
browser window is opened if possible. If {new} is 2, a new browser page ("tab")
is opened if possible.
controller.open_new(url)~
Open {url} in a new window of the browser handled by this controller, if
possible; otherwise, open {url} in the only browser window. Alias
open_new.
controller.open_new_tab(url)~
Open {url} in a new page ("tab") of the browser handled by this controller, if
possible; otherwise equivalent to open_new.
.. versionadded:: 2.5
.. rubric:: Footnotes
.. [1] Executables named here without a full path will be searched in the
directories given in the PATH environment variable.
==============================================================================
*py2stdlib-whichdb*
whichdb~
:synopsis: Guess which DBM-style module created a given database.
.. note::
The whichdb (|py2stdlib-whichdb|) module's only function has been put into the dbm (|py2stdlib-dbm|)
module in Python 3.0. The 2to3 tool will automatically adapt imports
when converting your sources to 3.0.
The single function in this module attempts to guess which of the several simple
database modules available, dbm (|py2stdlib-dbm|), gdbm (|py2stdlib-gdbm|), or dbhash (|py2stdlib-dbhash|),
should be used to open a given file.
whichdb(filename)~
Returns one of the following values: ``None`` if the file can't be opened
because it's unreadable or doesn't exist; the empty string (``''``) if the
file's format can't be guessed; or a string containing the required module name,
such as ``'dbm'`` or ``'gdbm'``.
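The three kinds of return value can be told apart as in the sketch below. The helper ``describe`` and the file names are hypothetical, and the fallback import follows the note above about the function living in the dbm module in Python 3:

```python
import os
import tempfile

try:
    from whichdb import whichdb      # Python 2
except ImportError:
    from dbm import whichdb          # Python 3, as mentioned in the note above

workdir = tempfile.mkdtemp()
missing_path = os.path.join(workdir, 'missing')
plain_path = os.path.join(workdir, 'plain.txt')
with open(plain_path, 'w') as f:
    f.write('this is not a DBM database\n')


def describe(path):
    """Turn the result of whichdb() into a readable description."""
    kind = whichdb(path)
    if kind is None:
        return 'unreadable or missing'
    if kind == '':
        return 'unknown format'
    return kind          # name of the module that should open the file


missing_result = describe(missing_path)
plain_result = describe(plain_path)
```

Testing with ``is None`` before comparing against the empty string matters here, because both values are false in a boolean context.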
==============================================================================
*py2stdlib-winsound*
winsound~
:platform: Windows
:synopsis: Access to the sound-playing machinery for Windows.
.. versionadded:: 1.5.2
The winsound (|py2stdlib-winsound|) module provides access to the basic sound-playing machinery
provided by Windows platforms. It includes functions and several constants.
Beep(frequency, duration)~
Beep the PC's speaker. The {frequency} parameter specifies frequency, in hertz,
of the sound, and must be in the range 37 through 32,767. The {duration}
parameter specifies the number of milliseconds the sound should last. If the
system is not able to beep the speaker, RuntimeError is raised.
.. versionadded:: 1.6
PlaySound(sound, flags)~
Call the underlying PlaySound function from the Platform API. The
{sound} parameter may be a filename, audio data as a string, or ``None``. Its
interpretation depends on the value of {flags}, which can be a bitwise ORed
combination of the constants described below. If the {sound} parameter is
``None``, any currently playing waveform sound is stopped. If the system
indicates an error, RuntimeError is raised.
MessageBeep([type=MB_OK])~
Call the underlying MessageBeep function from the Platform API. This
plays a sound as specified in the registry. The {type} argument specifies which
sound to play; possible values are ``-1``, ``MB_ICONASTERISK``,
``MB_ICONEXCLAMATION``, ``MB_ICONHAND``, ``MB_ICONQUESTION``, and ``MB_OK``, all
described below. The value ``-1`` produces a "simple beep"; this is the final
fallback if a sound cannot be played otherwise.
.. versionadded:: 2.3
SND_FILENAME~
The {sound} parameter is the name of a WAV file. Do not use with
SND_ALIAS.
SND_ALIAS~
The {sound} parameter is a sound association name from the registry. If the
registry contains no such name, play the system default sound unless
SND_NODEFAULT is also specified. If no default sound is registered,
raise RuntimeError. Do not use with SND_FILENAME.
All Win32 systems support at least the following; most systems support many
more:
+--------------------------+----------------------------------------+
| PlaySound {name} | Corresponding Control Panel Sound name |
+==========================+========================================+
| ``'SystemAsterisk'`` | Asterisk |
+--------------------------+----------------------------------------+
| ``'SystemExclamation'`` | Exclamation |
+--------------------------+----------------------------------------+
| ``'SystemExit'`` | Exit Windows |
+--------------------------+----------------------------------------+
| ``'SystemHand'`` | Critical Stop |
+--------------------------+----------------------------------------+
| ``'SystemQuestion'`` | Question |
+--------------------------+----------------------------------------+
For example:: >
import winsound
# Play Windows exit sound.
winsound.PlaySound("SystemExit", winsound.SND_ALIAS)
# Probably play Windows default sound, if any is registered (because
# "*" probably isn't the registered name of any sound).
winsound.PlaySound("*", winsound.SND_ALIAS)
<
SND_LOOP~
Play the sound repeatedly. The SND_ASYNC flag must also be used to
avoid blocking. Cannot be used with SND_MEMORY.
SND_MEMORY~
The {sound} parameter to PlaySound is a memory image of a WAV file, as a
string.
.. note:: >
This module does not support playing from a memory image asynchronously, so a
combination of this flag and SND_ASYNC will raise RuntimeError.
<
SND_PURGE~
Stop playing all instances of the specified sound.
.. note:: >
This flag is not supported on modern Windows platforms.
<
SND_ASYNC~
Return immediately, allowing sounds to play asynchronously.
SND_NODEFAULT~
If the specified sound cannot be found, do not play the system default sound.
SND_NOSTOP~
Do not interrupt sounds currently playing.
SND_NOWAIT~
Return immediately if the sound driver is busy.
MB_ICONASTERISK~
Play the ``SystemDefault`` sound.
MB_ICONEXCLAMATION~
Play the ``SystemExclamation`` sound.
MB_ICONHAND~
Play the ``SystemHand`` sound.
MB_ICONQUESTION~
Play the ``SystemQuestion`` sound.
MB_OK~
Play the ``SystemDefault`` sound.
==============================================================================
*py2stdlib-wsgiref*
wsgiref~
:synopsis: WSGI Utilities and Reference Implementation.
.. versionadded:: 2.5
The Web Server Gateway Interface (WSGI) is a standard interface between web
server software and web applications written in Python. Having a standard
interface makes it easy to use an application that supports WSGI with a number
of different web servers.
Only authors of web servers and programming frameworks need to know every detail
and corner case of the WSGI design. You don't need to understand every detail
of WSGI just to install a WSGI application or to write a web application using
an existing framework.
wsgiref (|py2stdlib-wsgiref|) is a reference implementation of the WSGI specification that can
be used to add WSGI support to a web server or framework. It provides utilities
for manipulating WSGI environment variables and response headers, base classes
for implementing WSGI servers, a demo HTTP server that serves WSGI applications,
and a validation tool that checks WSGI servers and applications for conformance
to the WSGI specification (PEP 333).
See http://www.wsgi.org for more information about WSGI, and links to tutorials
and other resources.
wsgiref.util (|py2stdlib-wsgiref.util|) -- WSGI environment utilities
-------------------------------------------------
==============================================================================
*py2stdlib-wsgiref.util*
wsgiref.util~
:synopsis: WSGI environment utilities.
This module provides a variety of utility functions for working with WSGI
environments. A WSGI environment is a dictionary containing HTTP request
variables as described in PEP 333. All of the functions taking an {environ}
parameter expect a WSGI-compliant dictionary to be supplied; please see
PEP 333 for a detailed specification.
guess_scheme(environ)~
Return a guess for whether ``wsgi.url_scheme`` should be "http" or "https", by
checking for a ``HTTPS`` environment variable in the {environ} dictionary. The
return value is a string.
This function is useful when creating a gateway that wraps CGI or a CGI-like
protocol such as FastCGI. Typically, servers providing such protocols will
include a ``HTTPS`` variable with a value of "1", "yes", or "on" when a request
is received via SSL. So, this function returns "https" if such a value is
found, and "http" otherwise.
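A minimal sketch of the behaviour just described, using hand-built dictionaries as hypothetical CGI-style environments:

```python
from wsgiref.util import guess_scheme

# Hypothetical environments as a CGI-like gateway might build them.
plain_request = {'REQUEST_METHOD': 'GET'}
secure_request = {'REQUEST_METHOD': 'GET', 'HTTPS': 'on'}

plain_scheme = guess_scheme(plain_request)
secure_scheme = guess_scheme(secure_request)

# A gateway would typically store the guess under wsgi.url_scheme.
secure_request.setdefault('wsgi.url_scheme', secure_scheme)
```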
request_uri(environ [, include_query=1])~
Return the full request URI, optionally including the query string, using the
algorithm found in the "URL Reconstruction" section of PEP 333. If
{include_query} is false, the query string is not included in the resulting URI.
application_uri(environ)~
Similar to request_uri, except that the ``PATH_INFO`` and
``QUERY_STRING`` variables are ignored. The result is the base URI of the
application object addressed by the request.
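The difference between the two functions can be seen on a small hand-built environment. The host name and paths below are made-up example values:

```python
from wsgiref.util import application_uri, request_uri

# Hypothetical request for /app/items/42?sort=asc, with the application
# mounted at /app.
environ = {
    'wsgi.url_scheme': 'http',
    'HTTP_HOST': 'example.com:8000',
    'SCRIPT_NAME': '/app',
    'PATH_INFO': '/items/42',
    'QUERY_STRING': 'sort=asc',
}

full_uri = request_uri(environ)
without_query = request_uri(environ, include_query=False)
base_uri = application_uri(environ)   # ignores PATH_INFO and QUERY_STRING
```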
shift_path_info(environ)~
Shift a single name from ``PATH_INFO`` to ``SCRIPT_NAME`` and return the name.
The {environ} dictionary is modified in place; use a copy if you need to keep
the original ``PATH_INFO`` or ``SCRIPT_NAME`` intact.
If there are no remaining path segments in ``PATH_INFO``, ``None`` is returned.
Typically, this routine is used to process each portion of a request URI path,
for example to treat the path as a series of dictionary keys. This routine
modifies the passed-in environment to make it suitable for invoking another WSGI
application that is located at the target URI. For example, if there is a WSGI
application at ``/foo``, and the request URI path is ``/foo/bar/baz``, and the
WSGI application at ``/foo`` calls shift_path_info, it will receive the
string "bar", and the environment will be updated to be suitable for passing to
a WSGI application at ``/foo/bar``. That is, ``SCRIPT_NAME`` will change from
``/foo`` to ``/foo/bar``, and ``PATH_INFO`` will change from ``/bar/baz`` to
``/baz``.
When ``PATH_INFO`` is just a "/", this routine returns an empty string and
appends a trailing slash to ``SCRIPT_NAME``, even though empty path segments are
normally ignored, and ``SCRIPT_NAME`` doesn't normally end in a slash. This is
intentional behavior, to ensure that an application can tell the difference
between URIs ending in ``/x`` and ones ending in ``/x/`` when using this
routine to do object traversal.
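The ``/foo/bar/baz`` walk-through and the trailing-slash case above can be reproduced with a short sketch. The environments are built by hand and contain only the two keys involved:

```python
from wsgiref.util import shift_path_info

# The application at /foo receives a request for /foo/bar/baz.
environ = {'SCRIPT_NAME': '/foo', 'PATH_INFO': '/bar/baz'}

steps = []
while True:
    name = shift_path_info(environ)
    if name is None:          # no path segments left
        break
    steps.append((name, environ['SCRIPT_NAME'], environ['PATH_INFO']))

# A PATH_INFO of just "/" yields an empty name and a trailing slash.
slash_environ = {'SCRIPT_NAME': '/x', 'PATH_INFO': '/'}
slash_name = shift_path_info(slash_environ)
```

The loop compares against ``None`` rather than testing truthiness, so the empty string returned for a trailing slash is not mistaken for the end of the path.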
setup_testing_defaults(environ)~
Update {environ} with trivial defaults for testing purposes.
This routine adds various parameters required for WSGI, including ``HTTP_HOST``,
``SERVER_NAME``, ``SERVER_PORT``, ``REQUEST_METHOD``, ``SCRIPT_NAME``,
``PATH_INFO``, and all of the PEP 333-defined ``wsgi.*`` variables. It
only supplies default values, and does not replace any existing settings for
these variables.
This routine is intended to make it easier for unit tests of WSGI servers and
applications to set up dummy environments. It should NOT be used by actual WSGI
servers or applications, since the data is fake!
Example usage:: >
from wsgiref.util import setup_testing_defaults
from wsgiref.simple_server import make_server
# A relatively simple WSGI application. It's going to print out the
# environment dictionary after being updated by setup_testing_defaults
def simple_app(environ, start_response):
    setup_testing_defaults(environ)
    status = '200 OK'
    headers = [('Content-type', 'text/plain')]
    start_response(status, headers)
    ret = ["%s: %s\n" % (key, value)
           for key, value in environ.iteritems()]
    return ret
httpd = make_server('', 8000, simple_app)
print "Serving on port 8000..."
httpd.serve_forever()
<
In addition to the environment functions above, the wsgiref.util (|py2stdlib-wsgiref.util|) module
also provides these miscellaneous utilities:
is_hop_by_hop(header_name)~
Return true if 'header_name' is an HTTP/1.1 "Hop-by-Hop" header, as defined by
RFC 2616.
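One plausible use, sketched here under the assumption of a gateway that relays headers from an upstream response, is to filter out the headers that must not be forwarded. The header list is made-up sample data:

```python
from wsgiref.util import is_hop_by_hop

# Hypothetical headers received from an upstream server.
upstream_headers = [
    ('Content-Type', 'text/plain'),
    ('Connection', 'keep-alive'),
    ('Transfer-Encoding', 'chunked'),
    ('X-Trace', 'abc'),
]

# Keep only the end-to-end headers; the check ignores case.
forwardable = [(name, value) for name, value in upstream_headers
               if not is_hop_by_hop(name)]
```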
FileWrapper(filelike [, blksize=8192])~
A wrapper to convert a file-like object to an iterator. The resulting objects
support both __getitem__ and __iter__ iteration styles, for
compatibility with Python 2.1 and Jython. As the object is iterated over, the
optional {blksize} parameter will be repeatedly passed to the {filelike}
object's read method to obtain strings to yield. When read
returns an empty string, iteration is ended and is not resumable.
If {filelike} has a close method, the returned object will also have a
close method, and it will invoke the {filelike} object's close
method when called.
Example usage:: >
from StringIO import StringIO
from wsgiref.util import FileWrapper
# We're using a StringIO buffer as the file-like object
filelike = StringIO("This is an example file-like object"*10)
wrapper = FileWrapper(filelike, blksize=5)
for chunk in wrapper:
    print chunk
<
wsgiref.headers (|py2stdlib-wsgiref.headers|) -- WSGI response header tools
==============================================================================
*py2stdlib-wsgiref.headers*
wsgiref.headers~
:synopsis: WSGI response header tools.
This module provides a single class, Headers, for convenient
manipulation of WSGI response headers using a mapping-like interface.
Headers(headers)~
Create a mapping-like object wrapping {headers}, which must be a list of header
name/value tuples as described in PEP 333. Any changes made to the new
Headers object will directly update the {headers} list it was created
with.
Headers objects support typical mapping operations including
__getitem__, get, __setitem__, setdefault,
__delitem__, __contains__ and has_key. For each of
these methods, the key is the header name (treated case-insensitively), and the
value is the first value associated with that header name. Setting a header
deletes any existing values for that header, then adds a new value at the end of
the wrapped header list. Headers' existing order is generally maintained, with
new headers added to the end of the wrapped list.
Unlike a dictionary, Headers objects do not raise an error when you try
to get or delete a key that isn't in the wrapped header list. Getting a
nonexistent header just returns ``None``, and deleting a nonexistent header does
nothing.
Headers objects also support keys, values, and
items methods. The lists returned by keys and items can
include the same key more than once if there is a multi-valued header. The
``len()`` of a Headers object is the same as the length of its
items, which is the same as the length of the wrapped header list. In
fact, the items method just returns a copy of the wrapped header list.
Calling ``str()`` on a Headers object returns a formatted string
suitable for transmission as HTTP response headers. Each header is placed on a
line with its value, separated by a colon and a space. Each line is terminated
by a carriage return and line feed, and the string is terminated with a blank
line.
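The mapping behaviour and string formatting described above can be sketched as follows. The header values are arbitrary examples:

```python
from wsgiref.headers import Headers

raw = [('Content-Type', 'text/plain')]
h = Headers(raw)

# Changes made through the wrapper show up in the wrapped list.
h['Content-Length'] = '12'
wrapped_updated = raw == [('Content-Type', 'text/plain'),
                          ('Content-Length', '12')]

# Lookups ignore case; missing headers give None instead of an error.
content_type = h['content-type']
missing = h['X-Missing']
del h['X-Missing']                 # deleting a missing header does nothing

header_count = len(h)
formatted = str(h)                 # CRLF-terminated lines plus a blank line
```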
In addition to their mapping interface and formatting features, Headers
objects also have the following methods for querying and adding multi-valued
headers, and for adding headers with MIME parameters:
Headers.get_all(name)~
Return a list of all the values for the named header.
The returned list will be sorted in the order they appeared in the original
header list or were added to this instance, and may contain duplicates. Any
fields deleted and re-inserted are always appended to the header list. If no
fields exist with the given name, returns an empty list.
Headers.add_header(name, value, **_params)~
Add a (possibly multi-valued) header, with optional MIME parameters specified
via keyword arguments.
{name} is the header field to add. Keyword arguments can be used to set MIME
parameters for the header field. Each parameter must be a string or ``None``.
Underscores in parameter names are converted to dashes, since dashes are illegal
in Python identifiers, but many MIME parameter names include dashes. If the
parameter value is a string, it is added to the header value parameters in the
form ``name="value"``. If it is ``None``, only the parameter name is added.
(This is used for MIME parameters without a value.) Example usage:: >
h.add_header('content-disposition', 'attachment', filename='bud.gif')
<
The above will add a header that looks like this::
Content-Disposition: attachment; filename="bud.gif"
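A short sketch combining get_all and add_header for a multi-valued header; the cookie values are made-up sample data:

```python
from wsgiref.headers import Headers

h = Headers([])

# add_header() appends, so a header name can occur more than once.
h.add_header('Set-Cookie', 'session=abc')
h.add_header('Set-Cookie', 'theme=dark')
h.add_header('Content-Disposition', 'attachment', filename='bud.gif')

cookies = h.get_all('set-cookie')      # every value, in insertion order
first_cookie = h['Set-Cookie']         # mapping access gives the first value
disposition = h['Content-Disposition']
no_such = h.get_all('X-Missing')       # empty list when nothing matches
```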
wsgiref.simple_server (|py2stdlib-wsgiref.simple_server|) -- a simple WSGI HTTP server
---------------------------------------------------------
==============================================================================
*py2stdlib-wsgiref.simple_server*
wsgiref.simple_server~
:synopsis: A simple WSGI HTTP server.
This module implements a simple HTTP server (based on BaseHTTPServer (|py2stdlib-basehttpserver|))
that serves WSGI applications. Each server instance serves a single WSGI
application on a given host and port. If you want to serve multiple
applications on a single host and port, you should create a WSGI application
that parses ``PATH_INFO`` to select which application to invoke for each
request. (E.g., using the shift_path_info function from
wsgiref.util (|py2stdlib-wsgiref.util|).)
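The dispatching approach suggested above might look like the following sketch. ``PathDispatcher``, ``blog_app``, ``not_found`` and ``call`` are hypothetical names, and the applications are invoked directly rather than through a server so that the example needs no network access:

```python
from wsgiref.util import shift_path_info


def blog_app(environ, start_response):
    """Hypothetical application mounted under /blog."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'blog: ' + environ['PATH_INFO'].encode('latin-1')]


def not_found(environ, start_response):
    """Fallback used when no application matches."""
    start_response('404 Not Found', [('Content-Type', 'text/plain')])
    return [b'not found']


class PathDispatcher(object):
    """Choose an application based on the first PATH_INFO segment."""

    def __init__(self, apps, default):
        self.apps = apps
        self.default = default

    def __call__(self, environ, start_response):
        # Moves the first segment from PATH_INFO to SCRIPT_NAME.
        name = shift_path_info(environ)
        app = self.apps.get(name, self.default)
        return app(environ, start_response)


dispatcher = PathDispatcher({'blog': blog_app}, not_found)


def call(app, path):
    """Invoke app for path and return (status, body)."""
    environ = {'SCRIPT_NAME': '', 'PATH_INFO': path}
    statuses = []

    def start_response(status, headers, exc_info=None):
        statuses.append(status)

    body = b''.join(app(environ, start_response))
    return statuses[0], body


blog_result = call(dispatcher, '/blog/2010/11')
other_result = call(dispatcher, '/nope')
```

The dispatcher is itself a WSGI application, so it could be passed as the {app} argument of make_server in place of a single application.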
make_server(host, port, app [, server_class=WSGIServer [, handler_class=WSGIRequestHandler]])~
Create a new WSGI server listening on {host} and {port}, accepting connections
for {app}. The return value is an instance of the supplied {server_class}, and
will process requests using the specified {handler_class}. {app} must be a WSGI
application object, as defined by PEP 333.
Example usage:: >
from wsgiref.simple_server import make_server, demo_app
httpd = make_server('', 8000, demo_app)
print "Serving HTTP on port 8000..."
# Respond to requests until process is killed
httpd.serve_forever()
# Alternative: serve one request, then exit
httpd.handle_request()
<
demo_app(environ, start_response)~
This function is a small but complete WSGI application that returns a text page
containing the message "Hello world!" and a list of the key/value pairs provided
in the {environ} parameter. It's useful for verifying that a WSGI server (such
as wsgiref.simple_server (|py2stdlib-wsgiref.simple_server|)) is able to run a simple WSGI application
correctly.
WSGIServer(server_address, RequestHandlerClass)~
Create a WSGIServer instance. {server_address} should be a
``(host,port)`` tuple, and {RequestHandlerClass} should be the subclass of
BaseHTTPServer.BaseHTTPRequestHandler that will be used to process
requests.
You do not normally need to call this constructor, as the make_server
function can handle all the details for you.
WSGIServer is a subclass of BaseHTTPServer.HTTPServer, so all
of its methods (such as serve_forever and handle_request) are
available. WSGIServer also provides these WSGI-specific methods:
WSGIServer.set_app(application)~
Sets the callable {application} as the WSGI application that will receive
requests.
WSGIServer.get_app()~
Returns the currently-set application callable.
Normally, however, you do not need to use these additional methods, as
set_app is normally called by make_server, and
get_app exists mainly for the benefit of request handler instances.
WSGIRequestHandler(request, client_address, server)~
Create an HTTP handler for the given {request} (i.e. a socket), {client_address}
(a ``(host,port)`` tuple), and {server} (WSGIServer instance).
You do not need to create instances of this class directly; they are
automatically created as needed by WSGIServer objects. You can,
however, subclass this class and supply it as a {handler_class} to the
make_server function. Some possibly relevant methods for overriding in
subclasses:
WSGIRequestHandler.get_environ()~
Returns a dictionary containing the WSGI environment for a request. The default
implementation copies the contents of the WSGIServer object's
base_environ dictionary attribute and then adds various headers derived
from the HTTP request. Each call to this method should return a new dictionary
containing all of the relevant CGI environment variables as specified in
PEP 333.
WSGIRequestHandler.get_stderr()~
Return the object that should be used as the ``wsgi.errors`` stream. The default
implementation just returns ``sys.stderr``.
WSGIRequestHandler.handle()~
Process the HTTP request. The default implementation creates a handler instance
using a wsgiref.handlers (|py2stdlib-wsgiref.handlers|) class to implement the actual WSGI application
interface.
wsgiref.validate (|py2stdlib-wsgiref.validate|) --- WSGI conformance checker
----------------------------------------------------
==============================================================================
*py2stdlib-wsgiref.validate*
wsgiref.validate~
:synopsis: WSGI conformance checker.
When creating new WSGI application objects, frameworks, servers, or middleware,
it can be useful to validate the new code's conformance using
wsgiref.validate (|py2stdlib-wsgiref.validate|). This module provides a function that creates WSGI
application objects that validate communications between a WSGI server or
gateway and a WSGI application object, to check both sides for protocol
conformance.
Note that this utility does not guarantee complete PEP 333 compliance; an
absence of errors from this module does not necessarily mean that errors do not
exist. However, if this module does produce an error, then it is virtually
certain that either the server or application is not 100% compliant.
This module is based on the paste.lint module from Ian Bicking's "Python
Paste" library.
validator(application)~
Wrap {application} and return a new WSGI application object. The returned
application will forward all requests to the original {application}, and will
check that both the {application} and the server invoking it are conforming to
the WSGI specification and to RFC 2616.
Any detected nonconformance results in an AssertionError being raised;
note, however, that how these errors are handled is server-dependent. For
example, wsgiref.simple_server (|py2stdlib-wsgiref.simple_server|) and other servers based on
wsgiref.handlers (|py2stdlib-wsgiref.handlers|) (that don't override the error handling methods to do
something else) will simply output a message that an error has occurred, and
dump the traceback to ``sys.stderr`` or some other error stream.
This wrapper may also generate output using the warnings (|py2stdlib-warnings|) module to
indicate behaviors that are questionable but which may not actually be
prohibited by PEP 333. Unless they are suppressed using Python command-line
options or the warnings (|py2stdlib-warnings|) API, any such warnings will be written to
``sys.stderr`` ({not} ``wsgi.errors``, unless they happen to be the same
object).
Example usage:: >
    from wsgiref.validate import validator
    from wsgiref.simple_server import make_server

    # Our callable object, which is intentionally not compliant with the
    # standard, so the validator is going to break
    def simple_app(environ, start_response):
        status = '200 OK' # HTTP Status
        headers = [('Content-type', 'text/plain')] # HTTP Headers
        start_response(status, headers)

        # This is going to break because we need to return a list, and
        # the validator is going to inform us
        return "Hello World"

    # This is the application wrapped in a validator
    validator_app = validator(simple_app)

    httpd = make_server('', 8000, validator_app)
    print "Listening on port 8000...."
    httpd.serve_forever()
<
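The validator can also be exercised without starting a server. The following
is a minimal sketch, not part of the module: it calls wrapped applications
directly with a synthetic environment built by
``wsgiref.util.setup_testing_defaults``. The names ``good_app``, ``bad_app``
and ``call_app`` are illustrative only, and byte-string literals are used so
that the same code should run on Python 2.7 and later.

```python
from wsgiref.util import setup_testing_defaults
from wsgiref.validate import validator

def good_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b"Hello World"]           # a list of byte strings: compliant

def bad_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return b"Hello World"             # a bare string: not compliant

def call_app(app):
    """Invoke a WSGI application once and return its body as a list."""
    environ = {}
    setup_testing_defaults(environ)   # fills in the required CGI/WSGI keys
    environ['QUERY_STRING'] = ''      # the validator warns if this is missing

    def start_response(status, headers, exc_info=None):
        return lambda data: None      # dummy write() callable

    result = app(environ, start_response)
    try:
        return list(result)
    finally:
        result.close()                # the validator expects close() to be called

good_body = call_app(validator(good_app))

try:
    call_app(validator(bad_app))
    bad_error = None
except AssertionError as exc:
    bad_error = exc                   # raised because a string was returned
```

Here the AssertionError reaches the caller because no server sits between the
test code and the wrapped application.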
wsgiref.handlers (|py2stdlib-wsgiref.handlers|) -- server/gateway base classes
==============================================================================
*py2stdlib-wsgiref.handlers*
wsgiref.handlers~
:synopsis: WSGI server/gateway base classes.
This module provides base handler classes for implementing WSGI servers and
gateways. These base classes handle most of the work of communicating with a
WSGI application, as long as they are given a CGI-like environment, along with
input, output, and error streams.
CGIHandler()~
CGI-based invocation via ``sys.stdin``, ``sys.stdout``, ``sys.stderr`` and
``os.environ``. This is useful when you have a WSGI application and want to run
it as a CGI script. Simply invoke ``CGIHandler().run(app)``, where ``app`` is
the WSGI application object you wish to invoke.
This class is a subclass of BaseCGIHandler that sets ``wsgi.run_once``
to true, ``wsgi.multithread`` to false, and ``wsgi.multiprocess`` to true, and
always uses sys (|py2stdlib-sys|) and os (|py2stdlib-os|) to obtain the necessary CGI streams and
environment.
BaseCGIHandler(stdin, stdout, stderr, environ [, multithread=True [, multiprocess=False]])~
Similar to CGIHandler, but instead of using the sys (|py2stdlib-sys|) and
os (|py2stdlib-os|) modules, the CGI environment and I/O streams are specified explicitly.
The {multithread} and {multiprocess} values are used to set the
``wsgi.multithread`` and ``wsgi.multiprocess`` flags for any applications run by
the handler instance.
This class is a subclass of SimpleHandler intended for use with
software other than HTTP "origin servers". If you are writing a gateway
protocol implementation (such as CGI, FastCGI, SCGI, etc.) that uses a
``Status:`` header to send an HTTP status, you probably want to subclass this
instead of SimpleHandler.
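As a rough illustration of the explicit-streams interface, the sketch below
runs an application through BaseCGIHandler with in-memory streams and looks at
the first line of the output. The environment comes from
``wsgiref.util.setup_testing_defaults`` as a stand-in for the variables a real
gateway would supply; ``hello_app`` is a hypothetical application.

```python
import sys
from io import BytesIO
from wsgiref.handlers import BaseCGIHandler
from wsgiref.util import setup_testing_defaults

def hello_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b"Hello World"]

environ = {}
setup_testing_defaults(environ)       # stand-in for a gateway's CGI variables
stdout = BytesIO()                    # captures what would go to the gateway

handler = BaseCGIHandler(BytesIO(), stdout, sys.stderr, environ)
handler.run(hello_app)

response = stdout.getvalue()
status_line = response.split(b"\r\n", 1)[0]   # a "Status:" header line
```

Because this handler is not an origin server, the status is sent as a
``Status:`` header rather than as an HTTP status line.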
SimpleHandler(stdin, stdout, stderr, environ [, multithread=True [, multiprocess=False]])~
Similar to BaseCGIHandler, but designed for use with HTTP origin
servers. If you are writing an HTTP server implementation, you will probably
want to subclass this instead of BaseCGIHandler.
This class is a subclass of BaseHandler. It overrides the
__init__, get_stdin, get_stderr, add_cgi_vars,
_write, and _flush methods to support explicitly setting the
environment and streams via the constructor. The supplied environment and
streams are stored in the stdin, stdout, stderr, and
environ attributes.
BaseHandler()~
This is an abstract base class for running WSGI applications. Each instance
will handle a single HTTP request, although in principle you could create a
subclass that was reusable for multiple requests.
BaseHandler instances have only one method intended for external use:
BaseHandler.run(app)~
Run the specified WSGI application, {app}.
All of the other BaseHandler methods are invoked by this method in the
process of running the application, and thus exist primarily to allow
customizing the process.
The following methods MUST be overridden in a subclass:
BaseHandler._write(data)~
Buffer the string {data} for transmission to the client. It's okay if this
method actually transmits the data; BaseHandler just separates write
and flush operations for greater efficiency when the underlying system actually
has such a distinction.
BaseHandler._flush()~
Force buffered data to be transmitted to the client. It's okay if this method
is a no-op (i.e., if _write actually sends the data).
BaseHandler.get_stdin()~
Return an input stream object suitable for use as the ``wsgi.input`` of the
request currently being processed.
BaseHandler.get_stderr()~
Return an output stream object suitable for use as the ``wsgi.errors`` of the
request currently being processed.
BaseHandler.add_cgi_vars()~
Insert CGI variables for the current request into the environ attribute.
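A minimal sketch of a subclass that supplies the five required methods is
shown below. ``BufferHandler`` and ``hello_app`` are hypothetical names, the
response is simply collected in a list, and the CGI variables are filled in
with ``wsgiref.util.setup_testing_defaults`` instead of coming from a real
request.

```python
import sys
from io import BytesIO
from wsgiref.handlers import BaseHandler
from wsgiref.util import setup_testing_defaults

class BufferHandler(BaseHandler):
    """Hypothetical handler that collects the response in memory."""

    def __init__(self):
        self.chunks = []              # pieces of data passed to _write()
        self.flushes = 0              # number of _flush() calls

    def _write(self, data):
        self.chunks.append(data)

    def _flush(self):
        self.flushes += 1             # nothing to transmit in this sketch

    def get_stdin(self):
        return BytesIO()              # empty request body

    def get_stderr(self):
        return sys.stderr

    def add_cgi_vars(self):
        # Placeholder values; a real handler would describe the request here.
        setup_testing_defaults(self.environ)

def hello_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b"Hello World"]

handler = BufferHandler()
handler.run(hello_app)

response = b"".join(handler.chunks)
status_line = response.split(b"\r\n", 1)[0]
```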
Here are some other methods and attributes you may wish to override. This list
is only a summary, however, and does not include every method that can be
overridden. You should consult the docstrings and source code for additional
information before attempting to create a customized BaseHandler
subclass.
Attributes and methods for customizing the WSGI environment:
BaseHandler.wsgi_multithread~
The value to be used for the ``wsgi.multithread`` environment variable. It
defaults to true in BaseHandler, but may have a different default (or
be set by the constructor) in the other subclasses.
BaseHandler.wsgi_multiprocess~
The value to be used for the ``wsgi.multiprocess`` environment variable. It
defaults to true in BaseHandler, but may have a different default (or
be set by the constructor) in the other subclasses.
BaseHandler.wsgi_run_once~
The value to be used for the ``wsgi.run_once`` environment variable. It
defaults to false in BaseHandler, but CGIHandler sets it to
true by default.
BaseHandler.os_environ~
The default environment variables to be included in every request's WSGI
environment. By default, this is a copy of ``os.environ`` at the time that
wsgiref.handlers (|py2stdlib-wsgiref.handlers|) was imported, but subclasses can create their own
at the class or instance level. Note that the dictionary should be considered
read-only, since the default value is shared between multiple classes and
instances.
BaseHandler.server_software~
If the origin_server attribute is set, this attribute's value is used to
set the default ``SERVER_SOFTWARE`` WSGI environment variable, and also to set a
default ``Server:`` header in HTTP responses. It is ignored for handlers (such
as BaseCGIHandler and CGIHandler) that are not HTTP origin
servers.
BaseHandler.get_scheme()~
Return the URL scheme being used for the current request. The default
implementation uses the guess_scheme function from wsgiref.util (|py2stdlib-wsgiref.util|)
to guess whether the scheme should be "http" or "https", based on the current
request's environ variables.
BaseHandler.setup_environ()~
Set the environ attribute to a fully-populated WSGI environment. The
default implementation uses all of the above methods and attributes, plus the
get_stdin, get_stderr, and add_cgi_vars methods and the
wsgi_file_wrapper attribute. It also inserts a ``SERVER_SOFTWARE`` key
if not present, as long as the origin_server attribute is a true value
and the server_software attribute is set.
Methods and attributes for customizing exception handling:
BaseHandler.log_exception(exc_info)~
Log the {exc_info} tuple in the server log. {exc_info} is a ``(type, value,
traceback)`` tuple. The default implementation simply writes the traceback to
the request's ``wsgi.errors`` stream and flushes it. Subclasses can override
this method to change the format or retarget the output, mail the traceback to
an administrator, or whatever other action may be deemed suitable.
BaseHandler.traceback_limit~
The maximum number of frames to include in tracebacks output by the default
log_exception method. If ``None``, all frames are included.
BaseHandler.error_output(environ, start_response)~
This method is a WSGI application to generate an error page for the user. It is
only invoked if an error occurs before headers are sent to the client.
This method can access the current error information using ``sys.exc_info()``,
and should pass that information to {start_response} when calling it (as
described in the "Error Handling" section of PEP 333).
The default implementation just uses the error_status,
error_headers, and error_body attributes to generate an output
page. Subclasses can override this to produce more dynamic error output.
Note, however, that it's not recommended from a security perspective to spit out
diagnostics to any old user; ideally, you should have to do something special to
enable diagnostic output, which is why the default implementation doesn't
include any.
BaseHandler.error_status~
The HTTP status used for error responses. This should be a status string as
defined in PEP 333; it defaults to a 500 code and message.
BaseHandler.error_headers~
The HTTP headers used for error responses. This should be a list of WSGI
response headers (``(name, value)`` tuples), as described in PEP 333. The
default list just sets the content type to ``text/plain``.
BaseHandler.error_body~
The error response body. This should be an HTTP response body string. It
defaults to the plain text, "A server error occurred. Please contact the
administrator."
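The sketch below shows one possible way to combine these hooks: a hypothetical
BaseCGIHandler subclass replaces error_body and records exceptions in a list
instead of writing a traceback to ``wsgi.errors``. The class and application
names are illustrative and not part of wsgiref.

```python
import sys
from io import BytesIO
from wsgiref.handlers import BaseCGIHandler
from wsgiref.util import setup_testing_defaults

class QuietErrorHandler(BaseCGIHandler):
    """Hypothetical handler with a custom error page and an in-memory log."""

    error_body = b"Something went wrong."

    def __init__(self, *args, **kwargs):
        BaseCGIHandler.__init__(self, *args, **kwargs)
        self.logged = []

    def log_exception(self, exc_info):
        # Record only the exception type name; no traceback is written.
        self.logged.append(exc_info[0].__name__)

def failing_app(environ, start_response):
    raise RuntimeError("boom")        # fails before any headers are sent

environ = {}
setup_testing_defaults(environ)
stdout = BytesIO()

handler = QuietErrorHandler(BytesIO(), stdout, sys.stderr, environ)
handler.run(failing_app)

response = stdout.getvalue()
status_line = response.split(b"\r\n", 1)[0]
```

Since the application fails before sending headers, the handler falls back to
error_output, which uses error_status, error_headers and the overridden
error_body.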
Methods and attributes for PEP 333's "Optional Platform-Specific File
Handling" feature:
BaseHandler.wsgi_file_wrapper~
A ``wsgi.file_wrapper`` factory, or ``None``. The default value of this
attribute is the FileWrapper class from wsgiref.util (|py2stdlib-wsgiref.util|).
BaseHandler.sendfile()~
Override to implement platform-specific file transmission. This method is
called only if the application's return value is an instance of the class
specified by the wsgi_file_wrapper attribute. It should return a true
value if it was able to successfully transmit the file, so that the default
transmission code will not be executed. The default implementation of this
method just returns a false value.
Miscellaneous methods and attributes:
BaseHandler.origin_server~
This attribute should be set to a true value if the handler's _write and
_flush are being used to communicate directly to the client, rather than
via a CGI-like gateway protocol that wants the HTTP status in a special
``Status:`` header.
This attribute's default value is true in BaseHandler, but false in
BaseCGIHandler and CGIHandler.
BaseHandler.http_version~
If origin_server is true, this string attribute is used to set the HTTP
version of the response sent to the client. It defaults to ``"1.0"``.
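To see the effect of origin_server, the following sketch runs the same
application through an origin-server handler and a gateway handler and
compares the first line of each response. ``NamedServerHandler``, its
``server_software`` value and ``raw_response`` are made up for illustration.

```python
import sys
from io import BytesIO
from wsgiref.handlers import BaseCGIHandler, SimpleHandler
from wsgiref.util import setup_testing_defaults

def hello_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b"Hello World"]

class NamedServerHandler(SimpleHandler):
    server_software = "ExampleServer/0.1"   # hypothetical product name

def raw_response(handler_class):
    """Run hello_app through handler_class and return the raw output."""
    environ = {}
    setup_testing_defaults(environ)
    stdout = BytesIO()
    handler_class(BytesIO(), stdout, sys.stderr, environ).run(hello_app)
    return stdout.getvalue()

origin_response = raw_response(NamedServerHandler)    # origin_server is true
gateway_response = raw_response(BaseCGIHandler)       # origin_server is false

origin_first_line = origin_response.split(b"\r\n", 1)[0]
gateway_first_line = gateway_response.split(b"\r\n", 1)[0]
```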
Examples
--------
This is a working "Hello World" WSGI application:: >
    from wsgiref.simple_server import make_server

    # Every WSGI application must have an application object - a callable
    # object that accepts two arguments. For that purpose, we're going to
    # use a function (note that you're not limited to a function, you can
    # use a class for example). The first argument passed to the function
    # is a dictionary containing CGI-style environment variables and the
    # second argument is the start_response callable (see PEP 333).
    def hello_world_app(environ, start_response):
        status = '200 OK' # HTTP Status
        headers = [('Content-type', 'text/plain')] # HTTP Headers
        start_response(status, headers)

        # The returned object is going to be printed
        return ["Hello World"]

    httpd = make_server('', 8000, hello_world_app)
    print "Serving on port 8000..."

    # Serve until process is killed
    httpd.serve_forever()
==============================================================================
*py2stdlib-xml.parsers.expat*
xml.parsers.expat~
:synopsis: An interface to the Expat non-validating XML parser.
.. versionadded:: 2.0
The xml.parsers.expat (|py2stdlib-xml.parsers.expat|) module is a Python interface to the Expat
non-validating XML parser. The module provides a single extension type,
xmlparser, that represents the current state of an XML parser. After
an xmlparser object has been created, various attributes of the object
can be set to handler functions. When an XML document is then fed to the
parser, the handler functions are called for the character data and markup in
the XML document.
This module uses the pyexpat module to provide access to the Expat
parser. Direct use of the pyexpat module is deprecated.
This module provides one exception and one type object:
ExpatError~
The exception raised when Expat reports an error. See the "ExpatError
Exceptions" section for more information on interpreting Expat errors.
error~
Alias for ExpatError.
XMLParserType~
The type of the return values from the ParserCreate function.
The xml.parsers.expat (|py2stdlib-xml.parsers.expat|) module contains two functions:
ErrorString(errno)~
Returns an explanatory string for a given error number {errno}.
ParserCreate([encoding[, namespace_separator]])~
Creates and returns a new xmlparser object. {encoding}, if specified,
must be a string naming the encoding used by the XML data. Expat doesn't
support as many encodings as Python does, and its repertoire of encodings can't
be extended; it supports UTF-8, UTF-16, ISO-8859-1 (Latin1), and ASCII. If
{encoding} [1]_ is given it will override the implicit or explicit encoding of the
document.
Expat can optionally do XML namespace processing for you, enabled by providing a
value for {namespace_separator}. The value must be a one-character string; a
ValueError will be raised if the string has an illegal length (``None``
is considered the same as omission). When namespace processing is enabled,
element type names and attribute names that belong to a namespace will be
expanded. The element name passed to the element handlers
StartElementHandler and EndElementHandler will be the
concatenation of the namespace URI, the namespace separator character, and the
local part of the name. If the namespace separator is a zero byte (``chr(0)``)
then the namespace URI and the local part will be concatenated without any
separator.
For example, if {namespace_separator} is set to a space character (``' '``) and
the following document is parsed:: >
<?xml version="1.0"?>
<root xmlns = "http://default-namespace.org/"
xmlns:py = "http://www.python.org/ns/">
<py:elem1 />
<elem2 xmlns="" />
</root>
<
StartElementHandler will receive the following strings for each
element:: >
http://default-namespace.org/ root
http://www.python.org/ns/ elem1
elem2
<
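The example above can be reproduced with a short sketch that collects the
names passed to StartElementHandler. The variable names are illustrative, and
the document is given as a byte string so that the code should behave the same
on Python 2.7 and later.

```python
from xml.parsers import expat

document = b"""<?xml version="1.0"?>
<root xmlns    = "http://default-namespace.org/"
      xmlns:py = "http://www.python.org/ns/">
  <py:elem1 />
  <elem2 xmlns="" />
</root>
"""

names = []

# A one-character separator switches namespace processing on.
parser = expat.ParserCreate(namespace_separator=' ')
parser.StartElementHandler = lambda name, attrs: names.append(name)
parser.Parse(document, True)
```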
.. seealso::
`The Expat XML Parser <http://www.libexpat.org/>`_
Home page of the Expat project.
XMLParser Objects
-----------------
xmlparser objects have the following methods:
xmlparser.Parse(data[, isfinal])~
Parses the contents of the string {data}, calling the appropriate handler
functions to process the parsed data. {isfinal} must be true on the final call
to this method. {data} can be the empty string at any time.
xmlparser.ParseFile(file)~
Parse XML data reading from the object {file}. {file} only needs to provide
the ``read(nbytes)`` method, returning the empty string when there's no more
data.
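A small sketch of both entry points follows: the same document is fed once in
arbitrary chunks through Parse, with {isfinal} true only on the last call, and
once through ParseFile from an in-memory file object. The helper names are
hypothetical.

```python
from io import BytesIO
from xml.parsers import expat

chunks = [b"<doc><ite", b"m/><item", b"/></doc>"]   # split mid-tag on purpose

def element_names(feed):
    """Create a parser, let feed() supply the data, return element names."""
    names = []
    parser = expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    feed(parser)
    return names

def feed_chunks(parser):
    for chunk in chunks[:-1]:
        parser.Parse(chunk, False)    # more data will follow
    parser.Parse(chunks[-1], True)    # final call

def feed_file(parser):
    # Any object with a read(nbytes) method will do.
    parser.ParseFile(BytesIO(b"".join(chunks)))

names_from_chunks = element_names(feed_chunks)
names_from_file = element_names(feed_file)
```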
xmlparser.SetBase(base)~
Sets the base to be used for resolving relative URIs in system identifiers in
declarations. Resolving relative identifiers is left to the application: this
value will be passed through as the {base} argument to the
ExternalEntityRefHandler, NotationDeclHandler, and
UnparsedEntityDeclHandler functions.
xmlparser.GetBase()~
Returns a string containing the base set by a previous call to SetBase,
or ``None`` if SetBase hasn't been called.
xmlparser.GetInputContext()~
Returns the input data that generated the current event as a string. The data is
in the encoding of the entity which contains the text. When called while an
event handler is not active, the return value is ``None``.
.. versionadded:: 2.1
xmlparser.ExternalEntityParserCreate(context[, encoding])~
Create a "child" parser which can be used to parse an external parsed entity
referred to by content parsed by the parent parser. The {context} parameter
should be the string passed to the ExternalEntityRefHandler handler
function, described below. The child parser is created with the
ordered_attributes, returns_unicode and
specified_attributes set to the values of this parser.
xmlparser.UseForeignDTD([flag])~
Calling this with a true value for {flag} (the default) will cause Expat to call
the ExternalEntityRefHandler with None for all arguments to
allow an alternate DTD to be loaded. If the document does not contain a
document type declaration, the ExternalEntityRefHandler will still be
called, but the StartDoctypeDeclHandler and
EndDoctypeDeclHandler will not be called.
Passing a false value for {flag} will cancel a previous call that passed a true
value, but otherwise has no effect.
This method can only be called before the Parse or ParseFile
methods are called; calling it after either of those have been called causes
ExpatError to be raised with the code attribute set to
errors.XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING.
.. versionadded:: 2.3
xmlparser objects have the following attributes:
xmlparser.buffer_size~
The size of the buffer used when buffer_text is true.
A new buffer size can be set by assigning a new integer value
to this attribute.
When the size is changed, the buffer will be flushed.
.. versionadded:: 2.3
.. versionchanged:: 2.6
The buffer size can now be changed.
xmlparser.buffer_text~
Setting this to true causes the xmlparser object to buffer textual
content returned by Expat to avoid multiple calls to the
CharacterDataHandler callback whenever possible. This can improve
performance substantially since Expat normally breaks character data into chunks
at every line ending. This attribute is false by default, and may be changed at
any time.
.. versionadded:: 2.3
xmlparser.buffer_used~
If buffer_text is enabled, the number of bytes stored in the buffer.
These bytes represent UTF-8 encoded text. This attribute has no meaningful
interpretation when buffer_text is false.
.. versionadded:: 2.3
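The effect of buffer_text can be observed by counting calls to
CharacterDataHandler. This is only a sketch; the helper name and sample text
are invented, and the exact number of unbuffered calls is left unspecified
because it depends on how Expat splits the data.

```python
from xml.parsers import expat

def character_chunks(buffer_text):
    """Return the pieces of character data delivered to the handler."""
    chunks = []
    parser = expat.ParserCreate()
    parser.buffer_text = buffer_text
    parser.CharacterDataHandler = chunks.append
    parser.Parse(b"<doc>line one\nline two\nline three</doc>", True)
    return chunks

unbuffered = character_chunks(False)  # several short pieces
buffered = character_chunks(True)     # one piece holding all the text
```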
xmlparser.ordered_attributes~
Setting this attribute to a non-zero integer causes the attributes to be
reported as a list rather than a dictionary. The attributes are presented in
the order found in the document text. For each attribute, two list entries are
presented: the attribute name and the attribute value. (Older versions of this
module also used this format.) By default, this attribute is false; it may be
changed at any time.
.. versionadded:: 2.1
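The difference between the two attribute formats can be sketched as follows;
``start_attributes`` is a hypothetical helper that returns whatever the start
handler received for a single element.

```python
from xml.parsers import expat

def start_attributes(ordered):
    """Parse one element and return the attributes passed to the handler."""
    seen = []
    parser = expat.ParserCreate()
    parser.ordered_attributes = ordered
    parser.StartElementHandler = lambda name, attrs: seen.append(attrs)
    parser.Parse(b'<point y="2" x="1"/>', True)
    return seen[0]

as_dict = start_attributes(False)     # mapping of names to values
as_list = start_attributes(True)      # flat list: name, value, name, value
```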
xmlparser.returns_unicode~
If this attribute is set to a non-zero integer, the handler functions will be
passed Unicode strings. If returns_unicode is False, 8-bit
strings containing UTF-8 encoded data will be passed to the handlers. This is
True by default when Python is built with Unicode support.
.. versionchanged:: 1.6
Can be changed at any time to affect the result type.
xmlparser.specified_attributes~
If set to a non-zero integer, the parser will report only those attributes which
were specified in the document instance and not those which were derived from
attribute declarations. Applications which set this need to be especially
careful to use what additional information is available from the declarations as
needed to comply with the standards for the behavior of XML processors. By
default, this attribute is false; it may be changed at any time.
.. versionadded:: 2.1
The following attributes contain values relating to the most recent error
encountered by an xmlparser object, and will only have correct values
once a call to Parse or ParseFile has raised an
xml.parsers.expat.ExpatError exception.
xmlparser.ErrorByteIndex~
Byte index at which an error occurred.
xmlparser.ErrorCode~
Numeric code specifying the problem. This value can be passed to the
ErrorString function, or compared to one of the constants defined in the
``errors`` object.
xmlparser.ErrorColumnNumber~
Column number at which an error occurred.
xmlparser.ErrorLineNumber~
Line number at which an error occurred.
The following attributes contain values relating to the current parse location
in an xmlparser object. During a callback reporting a parse event they
indicate the location of the first of the sequence of characters that generated
the event. When read outside of a callback, the position indicated will be
just past the last parse event (regardless of whether there was an associated
callback).
.. versionadded:: 2.4
xmlparser.CurrentByteIndex~
Current byte index in the parser input.
xmlparser.CurrentColumnNumber~
Current column number in the parser input.
xmlparser.CurrentLineNumber~
Current line number in the parser input.
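For instance, a start-element handler can read the current position to record
where each element begins. The sketch below uses an invented three-line
document and stores only line numbers.

```python
from xml.parsers import expat

positions = []
parser = expat.ParserCreate()

def start_element(name, attrs):
    # Inside a callback, the position refers to the start of the event.
    positions.append((name, parser.CurrentLineNumber))

parser.StartElementHandler = start_element
parser.Parse(b"<root>\n  <a/>\n  <b/>\n</root>", True)
```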
Here is the list of handlers that can be set. To set a handler on an
xmlparser object {o}, use ``o.handlername = func``. {handlername} must
be taken from the following list, and {func} must be a callable object accepting
the correct number of arguments. The arguments are all strings, unless
otherwise stated.
xmlparser.XmlDeclHandler(version, encoding, standalone)~
Called when the XML declaration is parsed. The XML declaration is the
(optional) declaration of the applicable version of the XML recommendation, the
encoding of the document text, and an optional "standalone" declaration.
{version} and {encoding} will be strings of the type dictated by the
returns_unicode attribute, and {standalone} will be ``1`` if the
document is declared standalone, ``0`` if it is declared not to be standalone,
or ``-1`` if the standalone clause was omitted. This is only available with
Expat version 1.95.0 or newer.
.. versionadded:: 2.1
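A short sketch of the values this handler receives for different
declarations; ``xml_declaration`` is a hypothetical helper, and the documents
are passed as byte strings so that the declared encoding is not overridden.

```python
from xml.parsers import expat

def xml_declaration(document):
    """Return the (version, encoding, standalone) tuples seen while parsing."""
    seen = []
    parser = expat.ParserCreate()
    parser.XmlDeclHandler = (
        lambda version, encoding, standalone:
            seen.append((version, encoding, standalone)))
    parser.Parse(document, True)
    return seen

full = xml_declaration(
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><doc/>')
minimal = xml_declaration(b'<?xml version="1.0"?><doc/>')
absent = xml_declaration(b'<doc/>')   # no declaration, so no call
```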
xmlparser.StartDoctypeDeclHandler(doctypeName, systemId, publicId, has_internal_subset)~
Called when Expat begins parsing the document type declaration (``<!DOCTYPE
...``). The {doctypeName} is provided exactly as presented. The {systemId} and
{publicId} parameters give the system and public identifiers if specified, or
``None`` if omitted. {has_internal_subset} will be true if the document
contains an internal document declaration subset. This requires Expat version
1.2 or newer.
xmlparser.EndDoctypeDeclHandler()~
Called when Expat is done parsing the document type declaration. This requires
Expat version 1.2 or newer.
xmlparser.ElementDeclHandler(name, model)~
Called once for each element type declaration. {name} is the name of the
element type, and {model} is a representation of the content model.
xmlparser.AttlistDeclHandler(elname, attname, type, default, required)~
Called for each declared attribute for an element type. If an attribute list
declaration declares three attributes, this handler is called three times, once
for each attribute. {elname} is the name of the element to which the
declaration applies and {attname} is the name of the attribute declared. The
attribute type is a string passed as {type}; the possible values are
``'CDATA'``, ``'ID'``, ``'IDREF'``, ... {default} gives the default value for
the attribute used when the attribute is not specified by the document instance,
or ``None`` if there is no default value (``#IMPLIED`` values). If the
attribute is required to be given in the document instance, {required} will be
true. This requires Expat version 1.95.0 or newer.
xmlparser.StartElementHandler(name, attributes)~
Called for the start of every element. {name} is a string containing the
element name, and {attributes} is a dictionary mapping attribute names to their
values.
xmlparser.EndElementHandler(name)~
Called for the end of every element.
xmlparser.ProcessingInstructionHandler(target, data)~
Called for every processing instruction.
xmlparser.CharacterDataHandler(data)~
Called for character data. This will be called for normal character data, CDATA
marked content, and ignorable whitespace. Applications which must distinguish
these cases can use the StartCdataSectionHandler,
EndCdataSectionHandler, and ElementDeclHandler callbacks to
collect the required information.
xmlparser.UnparsedEntityDeclHandler(entityName, base, systemId, publicId, notationName)~
Called for unparsed (NDATA) entity declarations. This is only present for
version 1.2 of the Expat library; for more recent versions, use
EntityDeclHandler instead. (The underlying function in the Expat
library has been declared obsolete.)
xmlparser.EntityDeclHandler(entityName, is_parameter_entity, value, base, systemId, publicId, notationName)~
Called for all entity declarations. For parameter and internal entities,
{value} will be a string giving the declared contents of the entity; this will
be ``None`` for external entities. The {notationName} parameter will be
``None`` for parsed entities, and the name of the notation for unparsed
entities. {is_parameter_entity} will be true if the entity is a parameter entity
or false for general entities (most applications only need to be concerned with
general entities). This is only available starting with version 1.95.0 of the
Expat library.
.. versionadded:: 2.1
xmlparser.NotationDeclHandler(notationName, base, systemId, publicId)~
Called for notation declarations. {notationName}, {base}, {systemId}, and
{publicId} are strings if given. If the public identifier is omitted,
{publicId} will be ``None``.
xmlparser.StartNamespaceDeclHandler(prefix, uri)~
Called when an element contains a namespace declaration. Namespace declarations
are processed before the StartElementHandler is called for the element
on which declarations are placed.
xmlparser.EndNamespaceDeclHandler(prefix)~
Called when the closing tag is reached for an element that contained a
namespace declaration. This is called once for each namespace declaration on
the element in the reverse of the order for which the
StartNamespaceDeclHandler was called to indicate the start of each
namespace declaration's scope. Calls to this handler are made after the
corresponding EndElementHandler for the end of the element.
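The ordering of the namespace callbacks relative to the element callbacks can
be checked with a sketch like the one below. It assumes namespace processing
is enabled with a space separator; the document and the event labels are made
up for the example.

```python
from xml.parsers import expat

events = []
parser = expat.ParserCreate(namespace_separator=' ')

parser.StartNamespaceDeclHandler = (
    lambda prefix, uri: events.append(("ns-start", prefix, uri)))
parser.EndNamespaceDeclHandler = (
    lambda prefix: events.append(("ns-end", prefix)))
parser.StartElementHandler = (
    lambda name, attrs: events.append(("start", name)))
parser.EndElementHandler = (
    lambda name: events.append(("end", name)))

parser.Parse(b'<doc xmlns:p="urn:example"><p:child/></doc>', True)
```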
xmlparser.CommentHandler(data)~
Called for comments. {data} is the text of the comment, excluding the leading
``<!--`` and trailing ``-->``.
xmlparser.StartCdataSectionHandler()~
Called at the start of a CDATA section. This and EndCdataSectionHandler
are needed to be able to identify the syntactical start and end for CDATA
sections.
xmlparser.EndCdataSectionHandler()~
Called at the end of a CDATA section.
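As a sketch of how the CDATA callbacks bracket the character data they
enclose, the following records all three kinds of events in one list. The
sample document is invented.

```python
from xml.parsers import expat

events = []
parser = expat.ParserCreate()

parser.StartCdataSectionHandler = lambda: events.append("cdata-start")
parser.EndCdataSectionHandler = lambda: events.append("cdata-end")
parser.CharacterDataHandler = lambda data: events.append(data)

# The markup inside the CDATA section is delivered as plain character data.
parser.Parse(b"<doc>plain<![CDATA[<raw>]]>more</doc>", True)
```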
xmlparser.DefaultHandler(data)~
Called for any characters in the XML document for which no applicable handler
has been specified. This means characters that are part of a construct which
could be reported, but for which no handler has been supplied.
xmlparser.DefaultHandlerExpand(data)~
This is the same as the DefaultHandler, but doesn't inhibit expansion
of internal entities. The entity reference will not be passed to the default
handler.
xmlparser.NotStandaloneHandler()~
Called if the XML document hasn't been declared as being a standalone document.
This happens when there is an external subset or a reference to a parameter
entity, but the XML declaration does not set standalone to ``yes``. If this
handler returns ``0``, then the parser will throw an
XML_ERROR_NOT_STANDALONE error. If this handler is not set, no
exception is raised by the parser for this condition.
xmlparser.ExternalEntityRefHandler(context, base, systemId, publicId)~
Called for references to external entities. {base} is the current base, as set
by a previous call to SetBase. The system and public identifiers,
{systemId} and {publicId}, are strings if given; if the public identifier is not
given, {publicId} will be ``None``. The {context} value is opaque and should
only be used as described below.
For external entities to be parsed, this handler must be implemented. It is
responsible for creating the sub-parser using
``ExternalEntityParserCreate(context)``, initializing it with the appropriate
callbacks, and parsing the entity. This handler should return an integer; if it
returns ``0``, the parser will throw an
XML_ERROR_EXTERNAL_ENTITY_HANDLING error, otherwise parsing will
continue.
If this handler is not provided, external entities are reported by the
DefaultHandler callback, if provided.
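The steps above might be sketched as follows. The ``entity_store`` dictionary
is a hypothetical stand-in for however an application resolves a system
identifier (for example by reading a file), and all other names are
illustrative as well.

```python
from xml.parsers import expat

document = b"""<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ENTITY chapter SYSTEM "chapter.xml">
]>
<doc>&chapter;</doc>
"""

# Hypothetical resolver data: system identifier -> entity content.
entity_store = {"chapter.xml": b"<title>Chapter One</title>"}

events = []
requested = []
parser = expat.ParserCreate()

def start_element(name, attrs):
    events.append(("start", name))

def char_data(data):
    events.append(("text", data))

def external_entity_ref(context, base, system_id, public_id):
    requested.append((base, system_id, public_id))
    # The context value is only passed on, never inspected.
    child = parser.ExternalEntityParserCreate(context)
    child.StartElementHandler = start_element
    child.CharacterDataHandler = char_data
    child.Parse(entity_store[system_id], True)
    return 1                          # non-zero: continue parsing

parser.StartElementHandler = start_element
parser.CharacterDataHandler = char_data
parser.ExternalEntityRefHandler = external_entity_ref
parser.Parse(document, True)

text = "".join(item[1] for item in events if item[0] == "text")
```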
ExpatError Exceptions
---------------------
ExpatError exceptions have a number of interesting attributes:
ExpatError.code~
Expat's internal error number for the specific error. This will match one of
the constants defined in the ``errors`` object from this module.
.. versionadded:: 2.1
ExpatError.lineno~
Line number on which the error was detected. The first line is numbered ``1``.
.. versionadded:: 2.1
ExpatError.offset~
Character offset into the line where the error occurred. The first column is
numbered ``0``.
.. versionadded:: 2.1
Example
-------
The following program defines three handlers that just print out their
arguments. :: >
    import xml.parsers.expat

    # 3 handler functions
    def start_element(name, attrs):
        print 'Start element:', name, attrs
    def end_element(name):
        print 'End element:', name
    def char_data(data):
        print 'Character data:', repr(data)

    p = xml.parsers.expat.ParserCreate()

    p.StartElementHandler = start_element
    p.EndElementHandler = end_element
    p.CharacterDataHandler = char_data

    p.Parse("""<?xml version="1.0"?>
    <parent id="top"><child1 name="paul">Text goes here</child1>
    <child2 name="fred">More text</child2>
    </parent>""", 1)
<
The output from this program is:: >
    Start element: parent {'id': 'top'}
    Start element: child1 {'name': 'paul'}
    Character data: 'Text goes here'
    End element: child1
    Character data: '\n'
    Start element: child2 {'name': 'fred'}
    Character data: 'More text'
    End element: child2
    Character data: '\n'
    End element: parent
<
Content Model Descriptions
--------------------------
Content models are described using nested tuples. Each tuple contains four
values: the type, the quantifier, the name, and a tuple of children. Children
are simply additional content model descriptions.
The values of the first two fields are constants defined in the ``model`` object
of the xml.parsers.expat (|py2stdlib-xml.parsers.expat|) module. These constants can be collected in two
groups: the model type group and the quantifier group.
The constants in the model type group are:
XML_CTYPE_ANY~
The element named by the model name was declared to have a content model of
``ANY``.
XML_CTYPE_CHOICE~
The named element allows a choice from a number of options; this is used for
content models such as ``(A | B | C)``.
XML_CTYPE_EMPTY~
Elements which are declared to be ``EMPTY`` have this model type.
XML_CTYPE_MIXED~
XML_CTYPE_NAME~
XML_CTYPE_SEQ~
Models which represent a series of models which follow one after the other are
indicated with this model type. This is used for models such as ``(A, B, C)``.
The constants in the quantifier group are:
XML_CQUANT_NONE~
No modifier is given, so it can appear exactly once, as for ``A``.
XML_CQUANT_OPT~
The model is optional: it can appear once or not at all, as for ``A?``.
XML_CQUANT_PLUS~
The model must occur one or more times (like ``A+``).
XML_CQUANT_REP~
The model must occur zero or more times, as for ``A*``.
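The nested-tuple layout described above can be inspected by collecting the
models that Expat reports for element declarations. The following is a minimal
sketch, not a definitive recipe: the sample document and the names ``SAMPLE``,
``models`` and ``element_decl`` are made up for illustration, and the code is
written to run on both Python 2.7 and Python 3.

```python
from xml.parsers import expat

# Hypothetical sample document with an internal DTD subset.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE a [
  <!ELEMENT a (b, c?)>
  <!ELEMENT b EMPTY>
  <!ELEMENT c ANY>
]>
<a><b/></a>"""

models = {}

def element_decl(name, model):
    # model is the nested (type, quantifier, name, children) tuple.
    models[name] = model

parser = expat.ParserCreate()
parser.ElementDeclHandler = element_decl
parser.Parse(SAMPLE, True)

# Unpack the model for <!ELEMENT a (b, c?)>.
seq_type, seq_quant, seq_name, seq_children = models['a']
first_child, second_child = seq_children
```

Here ``seq_type`` compares equal to ``expat.model.XML_CTYPE_SEQ``, and the
quantifier of ``second_child`` compares equal to
``expat.model.XML_CQUANT_OPT`` because ``c`` was declared as ``c?``.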
Expat error constants
---------------------
The following constants are provided in the ``errors`` object of the
xml.parsers.expat (|py2stdlib-xml.parsers.expat|) module. These constants are useful in interpreting
some of the attributes of the ExpatError exception objects raised when an
error has occurred.
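As a hedged illustration of how these constants can be used, the sketch below
parses a deliberately malformed document and compares the message for the
reported error code with one of the constants. The helper name
``describe_failure`` and the sample input are hypothetical; ``ErrorString`` and
the ``code``, ``lineno`` and ``offset`` attributes are the ones documented for
this module. The code is written to run on both Python 2.7 and Python 3.

```python
from xml.parsers import expat

def describe_failure(xml_text):
    """Return details of the first parse error, or None if parsing succeeds."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(xml_text, True)
    except expat.ExpatError as err:
        return {
            'message': expat.ErrorString(err.code),
            'lineno': err.lineno,
            'offset': err.offset,
        }
    return None

# The end tag on the second line does not match the open <child> tag.
failure = describe_failure("<parent>\n<child></parent>")
is_mismatch = failure['message'] == expat.errors.XML_ERROR_TAG_MISMATCH
```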
The ``errors`` object has the following attributes:
XML_ERROR_ASYNC_ENTITY~
XML_ERROR_ATTRIBUTE_EXTERNAL_ENTITY_REF~
An entity reference in an attribute value referred to an external entity instead
of an internal entity.
XML_ERROR_BAD_CHAR_REF~
A character reference referred to a character which is illegal in XML (for
example, character ``0``, or '``&#0;``').
XML_ERROR_BINARY_ENTITY_REF~
An entity reference referred to an entity which was declared with a notation, so
cannot be parsed.
XML_ERROR_DUPLICATE_ATTRIBUTE~
An attribute was used more than once in a start tag.
XML_ERROR_INCORRECT_ENCODING~
XML_ERROR_INVALID_TOKEN~
Raised when an input byte could not properly be assigned to a character; for
example, a NUL byte (value ``0``) in a UTF-8 input stream.
XML_ERROR_JUNK_AFTER_DOC_ELEMENT~
Something other than whitespace occurred after the document element.
XML_ERROR_MISPLACED_XML_PI~
An XML declaration was found somewhere other than the start of the input data.
XML_ERROR_NO_ELEMENTS~
The document contains no elements (XML requires all documents to contain exactly
one top-level element).
XML_ERROR_NO_MEMORY~
Expat was not able to allocate memory internally.
XML_ERROR_PARAM_ENTITY_REF~
A parameter entity reference was found where it was not allowed.
XML_ERROR_PARTIAL_CHAR~
An incomplete character was found in the input.
XML_ERROR_RECURSIVE_ENTITY_REF~
An entity reference contained another reference to the same entity; possibly via
a different name, and possibly indirectly.
XML_ERROR_SYNTAX~
Some unspecified syntax error was encountered.
XML_ERROR_TAG_MISMATCH~
An end tag did not match the innermost open start tag.
XML_ERROR_UNCLOSED_TOKEN~
Some token (such as a start tag) was not closed before the end of the stream or
the next token was encountered.
XML_ERROR_UNDEFINED_ENTITY~
A reference was made to an entity which was not defined.
XML_ERROR_UNKNOWN_ENCODING~
The document encoding is not supported by Expat.
XML_ERROR_UNCLOSED_CDATA_SECTION~
A CDATA marked section was not closed.
XML_ERROR_EXTERNAL_ENTITY_HANDLING~
XML_ERROR_NOT_STANDALONE~
The parser determined that the document was not "standalone" though it declared
itself to be in the XML declaration, and the NotStandaloneHandler was
set and returned ``0``.
XML_ERROR_UNEXPECTED_STATE~
XML_ERROR_ENTITY_DECLARED_IN_PE~
XML_ERROR_FEATURE_REQUIRES_XML_DTD~
An operation was requested that requires DTD support to be compiled in, but
Expat was configured without DTD support. This should never be reported by a
standard build of the xml.parsers.expat (|py2stdlib-xml.parsers.expat|) module.
XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING~
A behavioral change was requested after parsing started that can only be changed
before parsing has started. This is (currently) only raised by
UseForeignDTD.
XML_ERROR_UNBOUND_PREFIX~
An undeclared prefix was found when namespace processing was enabled.
XML_ERROR_UNDECLARING_PREFIX~
The document attempted to remove the namespace declaration associated with a
prefix.
XML_ERROR_INCOMPLETE_PE~
A parameter entity contained incomplete markup.
XML_ERROR_XML_DECL~
The XML declaration was not well-formed.
XML_ERROR_TEXT_DECL~
There was an error parsing a text declaration in an external entity.
XML_ERROR_PUBLICID~
Characters were found in the public id that are not allowed.
XML_ERROR_SUSPENDED~
The requested operation was made on a suspended parser, but isn't allowed. This
includes attempts to provide additional input or to stop the parser.
XML_ERROR_NOT_SUSPENDED~
An attempt to resume the parser was made when the parser had not been suspended.
XML_ERROR_ABORTED~
This should not be reported to Python applications.
XML_ERROR_FINISHED~
The requested operation was made on a parser which was finished parsing input,
but isn't allowed. This includes attempts to provide additional input or to
stop the parser.
XML_ERROR_SUSPEND_PE~
.. rubric:: Footnotes
.. [#] The encoding string included in XML output should conform to the
appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
and http://www.iana.org/assignments/character-sets .
==============================================================================
*py2stdlib-xdrlib*
xdrlib~
:synopsis: Encoders and decoders for the External Data Representation (XDR).
.. index::
single: XDR
single: External Data Representation
The xdrlib (|py2stdlib-xdrlib|) module supports the External Data Representation Standard as
described in RFC 1014, written by Sun Microsystems, Inc. June 1987. It
supports most of the data types described in the RFC.
The xdrlib (|py2stdlib-xdrlib|) module defines two classes, one for packing variables into XDR
representation, and another for unpacking from XDR representation. There are
also two exception classes.
Packer()~
Packer is the class for packing data into XDR representation. The
Packer class is instantiated with no arguments.
Unpacker(data)~
``Unpacker`` is the complementary class which unpacks XDR data values from a
string buffer. The input buffer is given as {data}.
.. seealso::
RFC 1014 - XDR: External Data Representation Standard
This RFC defined the encoding of data which was XDR at the time this module was
originally written. It has apparently been obsoleted by RFC 1832.
RFC 1832 - XDR: External Data Representation Standard
Newer RFC that provides a revised definition of XDR.
Packer Objects
--------------
Packer instances have the following methods:
Packer.get_buffer()~
Returns the current pack buffer as a string.
Packer.reset()~
Resets the pack buffer to the empty string.
In general, you can pack any of the most common XDR data types by calling the
appropriate ``pack_type()`` method. Each method takes a single argument, the
value to pack. The following simple data type packing methods are supported:
pack_uint, pack_int, pack_enum, pack_bool,
pack_uhyper, and pack_hyper.
Packer.pack_float(value)~
Packs the single-precision floating point number {value}.
Packer.pack_double(value)~
Packs the double-precision floating point number {value}.
The following methods support packing strings, bytes, and opaque data:
Packer.pack_fstring(n, s)~
Packs a fixed length string, {s}. {n} is the length of the string but it is
{not} packed into the data buffer. The string is padded with null bytes if
necessary to guarantee 4-byte alignment.
Packer.pack_fopaque(n, data)~
Packs a fixed length opaque data stream, similarly to pack_fstring.
Packer.pack_string(s)~
Packs a variable length string, {s}. The length of the string is first packed
as an unsigned integer, then the string data is packed with
pack_fstring.
Packer.pack_opaque(data)~
Packs a variable length opaque data string, similarly to pack_string.
Packer.pack_bytes(bytes)~
Packs a variable length byte stream, similarly to pack_string.
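To make the byte layout described above concrete, the following sketch builds
the same encoding by hand with the struct module instead of calling xdrlib, so
that the length prefix and the padding are visible. The helper names
``xdr_uint``, ``xdr_fstring`` and ``xdr_string`` are hypothetical and exist only
for this illustration.

```python
import struct

def xdr_uint(value):
    # An XDR unsigned integer is 4 bytes, big-endian.
    return struct.pack('>L', value)

def xdr_fstring(n, data):
    # Fixed length data: no length prefix, padded with null bytes
    # up to the next multiple of 4.
    chunk = data[:n]
    padded_length = (n + 3) // 4 * 4
    return chunk + b'\0' * (padded_length - len(chunk))

def xdr_string(data):
    # Variable length data: length first, then the padded bytes.
    return xdr_uint(len(data)) + xdr_fstring(len(data), data)

encoded = xdr_string(b'hello')
```

The five data bytes are preceded by a 4-byte length and followed by three null
bytes, so ``encoded`` is 12 bytes long.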
The following methods support packing arrays and lists:
Packer.pack_list(list, pack_item)~
Packs a {list} of homogeneous items. This method is useful for lists with an
indeterminate size; i.e. the size is not available until the entire list has
been walked. For each item in the list, an unsigned integer ``1`` is packed
first, followed by the data value from the list. {pack_item} is the function
that is called to pack the individual item. At the end of the list, an unsigned
integer ``0`` is packed.
For example, to pack a list of integers, the code might appear like this:: >
import xdrlib
p = xdrlib.Packer()
p.pack_list([1, 2, 3], p.pack_int)
<
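The flag-per-item layout that pack_list is described as producing can be
sketched with the struct module. This is an illustration of the wire format
under that description, not xdrlib's own code; ``xdr_int_list`` is a
hypothetical helper name.

```python
import struct

def xdr_int_list(items):
    parts = []
    for item in items:
        parts.append(struct.pack('>L', 1))     # flag: another item follows
        parts.append(struct.pack('>l', item))  # the item, as a signed int
    parts.append(struct.pack('>L', 0))         # flag: end of list
    return b''.join(parts)

encoded = xdr_int_list([1, 2, 3])
```

Each item costs eight bytes (flag plus value) and the terminator costs four, so
three integers occupy 28 bytes.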
Packer.pack_farray(n, array, pack_item)~
Packs a fixed length list ({array}) of homogeneous items. {n} is the length of
the list; it is {not} packed into the buffer, but a ValueError exception
is raised if ``len(array)`` is not equal to {n}. As above, {pack_item} is the
function used to pack each element.
Packer.pack_array(list, pack_item)~
Packs a variable length {list} of homogeneous items. First, the length of the
list is packed as an unsigned integer, then each element is packed as in
pack_farray above.
Unpacker Objects
----------------
The Unpacker class offers the following methods:
Unpacker.reset(data)~
Resets the string buffer with the given {data}.
Unpacker.get_position()~
Returns the current unpack position in the data buffer.
Unpacker.set_position(position)~
Sets the data buffer unpack position to {position}. You should be careful about
using get_position and set_position.
Unpacker.get_buffer()~
Returns the current unpack data buffer as a string.
Unpacker.done()~
Indicates unpack completion. Raises an Error exception if all of the
data has not been unpacked.
In addition, every data type that can be packed with a Packer can be
unpacked with an Unpacker. Unpacking methods are of the form
``unpack_type()``, and take no arguments. They return the unpacked object.
Unpacker.unpack_float()~
Unpacks a single-precision floating point number.
Unpacker.unpack_double()~
Unpacks a double-precision floating point number, similarly to
unpack_float.
In addition, the following methods unpack strings, bytes, and opaque data:
Unpacker.unpack_fstring(n)~
Unpacks and returns a fixed length string. {n} is the number of characters
expected. Padding with null bytes to guarantee 4-byte alignment is assumed.
Unpacker.unpack_fopaque(n)~
Unpacks and returns a fixed length opaque data stream, similarly to
unpack_fstring.
Unpacker.unpack_string()~
Unpacks and returns a variable length string. The length of the string is first
unpacked as an unsigned integer, then the string data is unpacked with
unpack_fstring.
Unpacker.unpack_opaque()~
Unpacks and returns a variable length opaque data string, similarly to
unpack_string.
Unpacker.unpack_bytes()~
Unpacks and returns a variable length byte stream, similarly to
unpack_string.
The following methods support unpacking arrays and lists:
Unpacker.unpack_list(unpack_item)~
Unpacks and returns a list of homogeneous items. The list is unpacked one
element at a time by first unpacking an unsigned integer flag. If the flag is
``1``, then the item is unpacked and appended to the list. A flag of ``0``
indicates the end of the list. {unpack_item} is the function that is called to
unpack the items.
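The flag-driven loop described for unpack_list can be sketched by hand with the
struct module. This is only an illustration of the decoding logic for a list of
signed integers; ``read_int_list`` is a hypothetical helper, and the input
buffer is built inline so the example is self-contained.

```python
import struct

def read_int_list(buf):
    """Decode a flag-terminated list of XDR ints; return (items, bytes_read)."""
    items = []
    pos = 0
    while True:
        (flag,) = struct.unpack_from('>L', buf, pos)
        pos += 4
        if flag == 0:          # end-of-list marker
            break
        if flag != 1:
            raise ValueError('0 or 1 expected, got %r' % (flag,))
        (item,) = struct.unpack_from('>l', buf, pos)
        pos += 4
        items.append(item)
    return items, pos

# Two items (7 and -2), each preceded by a flag of 1, then the 0 terminator.
buf = struct.pack('>LlLlL', 1, 7, 1, -2, 0)
items, consumed = read_int_list(buf)
```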
Unpacker.unpack_farray(n, unpack_item)~
Unpacks and returns (as a list) a fixed length array of homogeneous items. {n}
is the number of list elements to expect in the buffer. As above, {unpack_item} is
the function used to unpack each element.
Unpacker.unpack_array(unpack_item)~
Unpacks and returns a variable length {list} of homogeneous items. First, the
length of the list is unpacked as an unsigned integer, then each element is
unpacked as in unpack_farray above.
Exceptions
----------
Exceptions in this module are coded as class instances:
Error~
The base exception class. Error has a single public data member
msg containing the description of the error.
ConversionError~
Class derived from Error. Contains no additional instance variables.
Here is an example of how you would catch one of these exceptions:: >
import xdrlib
p = xdrlib.Packer()
try:
    p.pack_double(8.01)
except xdrlib.ConversionError as instance:
    print 'packing the double failed:', instance.msg
==============================================================================
*py2stdlib-xml.dom.minidom*
xml.dom.minidom~
:synopsis: Lightweight Document Object Model (DOM) implementation.
.. versionadded:: 2.0
xml.dom.minidom (|py2stdlib-xml.dom.minidom|) is a light-weight implementation of the Document Object
Model interface. It is intended to be simpler than the full DOM and also
significantly smaller.
DOM applications typically start by parsing some XML into a DOM. With
xml.dom.minidom (|py2stdlib-xml.dom.minidom|), this is done through the parse functions:: >
from xml.dom.minidom import parse, parseString
dom1 = parse('c:\\temp\\mydata.xml') # parse an XML file by name
datasource = open('c:\\temp\\mydata.xml')
dom2 = parse(datasource) # parse an open file
dom3 = parseString('<myxml>Some data<empty/> some more data</myxml>')
<
The parse function can take either a filename or an open file object.
parse(filename_or_file[, parser[, bufsize]])~
Return a Document from the given input. {filename_or_file} may be
either a file name, or a file-like object. {parser}, if given, must be a SAX2
parser object. This function will change the document handler of the parser and
activate namespace support; other parser configuration (like setting an entity
resolver) must have been done in advance.
If you have XML in a string, you can use the parseString function
instead:
parseString(string[, parser])~
Return a Document that represents the {string}. This method creates a
StringIO (|py2stdlib-stringio|) object for the string and passes that on to parse.
Both functions return a Document object representing the content of the
document.
What the parse and parseString functions do is connect an XML
parser with a "DOM builder" that can accept parse events from any SAX parser and
convert them into a DOM tree. The names of the functions are perhaps misleading,
but are easy to grasp when learning the interfaces. The parsing of the document
will be completed before these functions return; it's simply that these
functions do not provide a parser implementation themselves.
You can also create a Document by calling a method on a "DOM
Implementation" object. You can get this object either by calling the
getDOMImplementation function in the xml.dom (|py2stdlib-xml.dom|) package or the
xml.dom.minidom (|py2stdlib-xml.dom.minidom|) module. Using the implementation from the
xml.dom.minidom (|py2stdlib-xml.dom.minidom|) module will always return a Document instance
from the minidom implementation, while the version from xml.dom (|py2stdlib-xml.dom|) may
provide an alternate implementation (this is likely if you have the `PyXML
package <http://pyxml.sourceforge.net/>`_ installed). Once you have a
Document, you can add child nodes to it to populate the DOM:: >
from xml.dom.minidom import getDOMImplementation
impl = getDOMImplementation()
newdoc = impl.createDocument(None, "some_tag", None)
top_element = newdoc.documentElement
text = newdoc.createTextNode('Some textual content.')
top_element.appendChild(text)
<
Once you have a DOM document object, you can access the parts of your XML
document through its properties and methods. These properties are defined in
the DOM specification. The main property of the document object is the
documentElement property. It gives you the main element in the XML
document: the one that holds all others. Here is an example program:: >
dom3 = parseString("<myxml>Some data</myxml>")
assert dom3.documentElement.tagName == "myxml"
<
When you are finished with a DOM tree, you may optionally call the
unlink method to encourage early cleanup of the now-unneeded
objects. unlink is an xml.dom.minidom (|py2stdlib-xml.dom.minidom|)-specific
extension to the DOM API that renders the node and its descendants
essentially useless. Otherwise, Python's garbage collector will
eventually take care of the objects in the tree.
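A minimal sketch of the cleanup pattern described above, assuming the tree is
only needed inside the ``try`` block. The variable names are illustrative, and
the code is written to run on both Python 2.7 and Python 3.

```python
from xml.dom.minidom import parseString

dom = parseString('<myxml>Some data</myxml>')
root = dom.documentElement
try:
    # Use the tree while it is still intact.
    tag = root.tagName
finally:
    # Break the internal references; the nodes are unusable afterwards.
    dom.unlink()

remaining_children = len(dom.childNodes)
root_parent = root.parentNode
```

After the call, the document has no child nodes left and the former root
element no longer points back to a parent.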
.. seealso::
`Document Object Model (DOM) Level 1 Specification <http://www.w3.org/TR/REC-DOM-Level-1/>`_
The W3C recommendation for the DOM supported by xml.dom.minidom (|py2stdlib-xml.dom.minidom|).
DOM Objects
-----------
The definition of the DOM API for Python is given as part of the xml.dom (|py2stdlib-xml.dom|)
module documentation. This section lists the differences between the API and
xml.dom.minidom (|py2stdlib-xml.dom.minidom|).
Node.unlink()~
Break internal references within the DOM so that it will be garbage collected on
versions of Python without cyclic GC. Even when cyclic GC is available, using
this can make large amounts of memory available sooner, so calling this on DOM
objects as soon as they are no longer needed is good practice. This only needs
to be called on the Document object, but may be called on child nodes
to discard children of that node.
Node.writexml(writer[, indent=""[, addindent=""[, newl=""[, encoding=""]]]])~
Write XML to the writer object. The writer should have a write method
which matches that of the file object interface. The {indent} parameter is the
indentation of the current node. The {addindent} parameter is the incremental
indentation to use for subnodes of the current one. The {newl} parameter
specifies the string to use to terminate lines.
.. versionchanged:: 2.1
The optional keyword parameters {indent}, {addindent}, and {newl} were added to
support pretty output.
.. versionchanged:: 2.3
For the Document node, an additional keyword argument
{encoding} can be used to specify the encoding field of the XML header.
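Because writexml only requires an object with a write method, a very small
writer is enough to capture the output. The sketch below uses a hypothetical
``ListWriter`` class for that purpose; it is one possible way to collect the
output, not the only one, and it runs on both Python 2.7 and Python 3.

```python
from xml.dom.minidom import parseString

class ListWriter(object):
    """Collects everything passed to write() in a list."""
    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)

doc = parseString('<a><b/><c/></a>')
writer = ListWriter()
# Start at no indentation, indent children by two spaces, end lines with \n.
doc.documentElement.writexml(writer, indent='', addindent='  ', newl='\n')
output = ''.join(writer.chunks)
```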
Node.toxml([encoding])~
Return the XML that the DOM represents as a string.
With no argument, the XML header does not specify an encoding, and the result is
a Unicode string if the default encoding cannot represent all characters in the
document. Encoding this string in an encoding other than UTF-8 is likely
incorrect, since UTF-8 is the default encoding of XML.
With an explicit {encoding} [1]_ argument, the result is a byte string in the
specified encoding. It is recommended that this argument is always specified. To
avoid UnicodeError exceptions in case of unrepresentable text data, the
encoding argument should be specified as "utf-8".
.. versionchanged:: 2.3
the {encoding} argument was introduced; see writexml.
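The effect of passing an explicit encoding can be seen in a short sketch. The
sample document is made up for illustration; the input is given as UTF-8 bytes
so that the same code runs on both Python 2.7 and Python 3.

```python
from xml.dom.minidom import parseString

# A document containing one non-ASCII character (e with acute accent).
source = u'<greeting>caf\xe9</greeting>'.encode('utf-8')
doc = parseString(source)

# With an explicit encoding the result is a byte string, and the XML
# header names that encoding.
encoded = doc.toxml('utf-8')
```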
Node.toprettyxml([indent=""[, newl=""[, encoding=""]]])~
Return a pretty-printed version of the document. {indent} specifies the
indentation string and defaults to a tab character; {newl} specifies the string
emitted at the end of each line and defaults to ``\n``.
.. versionadded:: 2.1
.. versionchanged:: 2.3
the encoding argument was introduced; see writexml.
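A small sketch of toprettyxml on a document that contains only elements. Text
nodes are left out on purpose, since the treatment of whitespace around text
has differed between Python releases. The code runs on both Python 2.7 and
Python 3.

```python
from xml.dom.minidom import parseString

doc = parseString('<a><b/><c/></a>')
# Use two spaces instead of the default tab for each indentation level.
pretty = doc.toprettyxml(indent='  ', newl='\n')
lines = pretty.splitlines()
```

The first line is the XML header; the remaining lines are ``<a>``, the two
indented children, and ``</a>``.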
The following standard DOM methods have special considerations with
xml.dom.minidom (|py2stdlib-xml.dom.minidom|):
Node.cloneNode(deep)~
Although this method was present in the version of xml.dom.minidom (|py2stdlib-xml.dom.minidom|)
packaged with Python 2.0, it was seriously broken. This has been corrected for
subsequent releases.
DOM Example
-----------
This example is a fairly realistic, if simple, program. In this
particular case, we do not take much advantage of the flexibility of the DOM.
.. literalinclude:: ../includes/minidom-example.py
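The file named in the directive above is not reproduced in this help file. As a
stand-in, here is a minimal sketch in the same spirit, assuming a made-up
slideshow document; ``DOCUMENT``, ``get_text`` and ``slide_summary`` are
hypothetical names chosen for this illustration. It runs on both Python 2.7 and
Python 3.

```python
from xml.dom.minidom import parseString

# Hypothetical input document.
DOCUMENT = """<slideshow>
<title>Demo slideshow</title>
<slide><title>Slide title</title>
<point>This is a demo</point>
<point>Of a program for processing slides</point>
</slide>
<slide><title>Another demo slide</title>
<point>It is important</point>
</slide>
</slideshow>"""

def get_text(nodelist):
    # Concatenate the data of all text nodes in a node list.
    return ''.join(node.data for node in nodelist
                   if node.nodeType == node.TEXT_NODE)

def slide_summary(slide):
    # Return (title, [points]) for one <slide> element.
    title = slide.getElementsByTagName('title')[0]
    points = slide.getElementsByTagName('point')
    return (get_text(title.childNodes),
            [get_text(point.childNodes) for point in points])

dom = parseString(DOCUMENT)
summaries = [slide_summary(slide)
             for slide in dom.getElementsByTagName('slide')]
```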
minidom and the DOM standard
----------------------------
The xml.dom.minidom (|py2stdlib-xml.dom.minidom|) module is essentially a DOM 1.0-compatible DOM with
some DOM 2 features (primarily namespace features).
Usage of the DOM interface in Python is straightforward. The following mapping
rules apply:
* Interfaces are accessed through instance objects. Applications should not
instantiate the classes themselves; they should use the creator functions
available on the Document object. Derived interfaces support all
operations (and attributes) from the base interfaces, plus any new operations.
* Operations are used as methods. Since the DOM uses only ``in``
parameters, the arguments are passed in normal order (from left to right).
There are no optional arguments. ``void`` operations return ``None``.
* IDL attributes map to instance attributes. For compatibility with the OMG IDL
language mapping for Python, an attribute ``foo`` can also be accessed through
accessor methods _get_foo and _set_foo. ``readonly``
attributes must not be changed; this is not enforced at runtime.
* The types ``short int``, ``unsigned int``, ``unsigned long long``, and
``boolean`` all map to Python integer objects.
* The type ``DOMString`` maps to Python strings. xml.dom.minidom (|py2stdlib-xml.dom.minidom|) supports
either byte or Unicode strings, but will normally produce Unicode strings.
Values of type ``DOMString`` may also be ``None`` where allowed to have the IDL
``null`` value by the DOM specification from the W3C.
* ``const`` declarations map to variables in their respective scope (e.g.
``xml.dom.minidom.Node.PROCESSING_INSTRUCTION_NODE``); they must not be changed.
* ``DOMException`` is currently not supported in xml.dom.minidom (|py2stdlib-xml.dom.minidom|).
Instead, xml.dom.minidom (|py2stdlib-xml.dom.minidom|) uses standard Python exceptions such as
TypeError and AttributeError.
* NodeList objects are implemented using Python's built-in list type.
Starting with Python 2.2, these objects provide the interface defined in the DOM
specification, but with earlier versions of Python they do not support the
official API. They are, however, much more "Pythonic" than the interface
defined in the W3C recommendations.
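Several of the mapping rules above can be observed directly on a parsed
document. The following is a minimal sketch with a made-up input document; it
runs on both Python 2.7 and Python 3.

```python
from xml.dom.minidom import parseString, Node

doc = parseString('<list><item>a</item><item>b</item></list>')
root = doc.documentElement

# NodeList objects behave like Python lists and also offer the DOM API.
items = doc.getElementsByTagName('item')
count = len(items)
same_node = items.item(0) is items[0]
texts = [item.firstChild.data for item in items]

# IDL attributes are plain instance attributes; a null DOMString is None.
namespace = root.namespaceURI

# void operations return None.
result = root.setAttribute('kind', 'demo')

# const declarations are available as class-level names.
pi_type = Node.PROCESSING_INSTRUCTION_NODE
```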
The following interfaces have no implementation in xml.dom.minidom (|py2stdlib-xml.dom.minidom|):
* DOMTimeStamp
* DocumentType (added in Python 2.1)
* DOMImplementation (added in Python 2.1)
* CharacterData
* CDATASection
* Notation
* Entity
* EntityReference
* DocumentFragment
Most of these reflect information in the XML document that is not of general
utility to most DOM users.
.. rubric:: Footnotes
.. [#] The encoding string included in XML output should conform to the
appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
and http://www.iana.org/assignments/character-sets .
==============================================================================
*py2stdlib-xml.dom.pulldom*
xml.dom.pulldom~
:synopsis: Support for building partial DOM trees from SAX events.
.. versionadded:: 2.0
xml.dom.pulldom (|py2stdlib-xml.dom.pulldom|) allows building only selected portions of a Document
Object Model representation of a document from SAX events.
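As a hedged illustration of building only part of a tree, the sketch below
walks the event stream of a made-up document and expands a single element into
a DOM subtree. ``SAMPLE`` and ``selected`` are illustrative names; the code
runs on both Python 2.7 and Python 3.

```python
from xml.dom import pulldom

# Hypothetical input document.
SAMPLE = '<items><item id="1">a</item><item id="2">b</item></items>'

selected = []
events = pulldom.parseString(SAMPLE)
for event, node in events:
    if event == pulldom.START_ELEMENT and node.tagName == 'item':
        if node.getAttribute('id') == '2':
            # Pull the rest of this element into a DOM subtree.
            events.expandNode(node)
            selected.append(node.toxml())
```

Only the second ``item`` element is expanded; the first one is skipped without
its subtree ever being built.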
PullDOM([documentFactory])~
xml.sax.handler.ContentHandler implementation that ...
DOMEventStream(stream, parser, bufsize)~
...
SAX2DOM([documentFactory])~
xml.sax.handler.ContentHandler implementation that ...
parse(stream_or_string[, parser[, bufsize]])~
...
parseString(string[, parser])~
...
default_bufsize~
Default value for the {bufsize} parameter to parse.
.. versionchanged:: 2.1
The value of this variable can be changed before calling parse and the
new value will take effect.
DOMEventStream Objects
----------------------
DOMEventStream.getEvent()~
...
DOMEventStream.expandNode(node)~
...
DOMEventStream.reset()~
...
==============================================================================
*py2stdlib-xml.dom*
xml.dom~
:synopsis: Document Object Model API for Python.
.. versionadded:: 2.0
The Document Object Model, or "DOM," is a cross-language API from the World Wide
Web Consortium (W3C) for accessing and modifying XML documents. A DOM
implementation presents an XML document as a tree structure, or allows client
code to build such a structure from scratch. It then gives access to the
structure through a set of objects which provide well-known interfaces.
The DOM is extremely useful for random-access applications. SAX only allows you
a view of one bit of the document at a time. If you are looking at one SAX
element, you have no access to another. If you are looking at a text node, you
have no access to a containing element. When you write a SAX application, you
need to keep track of your program's position in the document somewhere in your
own code. SAX does not do it for you. Also, if you need to look ahead in the
XML document, you are just out of luck.
Some applications are simply impossible in an event driven model with no access
to a tree. Of course you could build some sort of tree yourself in SAX events,
but the DOM allows you to avoid writing that code. The DOM is a standard tree
representation for XML data.
The Document Object Model is being defined by the W3C in stages, or "levels" in
their terminology. The Python mapping of the API is substantially based on the
DOM Level 2 recommendation.
.. XXX PyXML is dead...
.. The mapping of the Level 3 specification, currently
only available in draft form, is being developed by the `Python XML Special
Interest Group <http://www.python.org/sigs/xml-sig/>`_ as part of the `PyXML
package <http://pyxml.sourceforge.net/>`_. Refer to the documentation bundled
with that package for information on the current state of DOM Level 3 support.
.. What if your needs are somewhere between SAX and the DOM? Perhaps
you cannot afford to load the entire tree in memory but you find the
SAX model somewhat cumbersome and low-level. There is also a module
called xml.dom.pulldom that allows you to build trees of only the
parts of a document that you need structured access to. It also has
features that allow you to find your way around the DOM.
See http://www.prescod.net/python/pulldom
DOM applications typically start by parsing some XML into a DOM. How this is
accomplished is not covered at all by DOM Level 1, and Level 2 provides only
limited improvements: There is a DOMImplementation object class which
provides access to Document creation methods, but no way to access an
XML reader/parser/Document builder in an implementation-independent way. There
is also no well-defined way to access these methods without an existing
Document object. In Python, each DOM implementation will provide a
function getDOMImplementation. DOM Level 3 adds a Load/Store
specification, which defines an interface to the reader, but this is not yet
available in the Python standard library.
Once you have a DOM document object, you can access the parts of your XML
document through its properties and methods. These properties are defined in
the DOM specification; this portion of the reference manual describes the
interpretation of the specification in Python.
The specification provided by the W3C defines the DOM API for Java, ECMAScript,
and OMG IDL. The Python mapping defined here is based in large part on the IDL
version of the specification, but strict compliance is not required (though
implementations are free to support the strict mapping from IDL). See section
dom-conformance for a detailed discussion of mapping requirements.
.. seealso::
`Document Object Model (DOM) Level 2 Specification <http://www.w3.org/TR/DOM-Level-2-Core/>`_
The W3C recommendation upon which the Python DOM API is based.
`Document Object Model (DOM) Level 1 Specification <http://www.w3.org/TR/REC-DOM-Level-1/>`_
The W3C recommendation for the DOM supported by xml.dom.minidom (|py2stdlib-xml.dom.minidom|).
`Python Language Mapping Specification <http://www.omg.org/spec/PYTH/1.2/PDF>`_
This specifies the mapping from OMG IDL to Python.
Module Contents
---------------
The xml.dom (|py2stdlib-xml.dom|) module contains the following functions:
registerDOMImplementation(name, factory)~
Register the {factory} function with the name {name}. The factory function
should return an object which implements the DOMImplementation
interface. The factory function can return the same object every time, or a new
one for each call, as appropriate for the specific implementation (e.g. if that
implementation supports some customization).
getDOMImplementation([name[, features]])~
Return a suitable DOM implementation. The {name} is either well-known, the
module name of a DOM implementation, or ``None``. If it is not ``None``, imports
the corresponding module and returns a DOMImplementation object if the
import succeeds. If no name is given, and if the environment variable
PYTHON_DOM is set, this variable is used to find the implementation.
If name is not given, this examines the available implementations to find one
with the required feature set. If no implementation can be found, raise an
ImportError. The features list must be a sequence of ``(feature,
version)`` pairs which are passed to the hasFeature method on available
DOMImplementation objects.
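The lookup rules above can be sketched as follows. The registered name
``'demo'`` and the factory ``demo_factory`` are hypothetical, and the
feature-based lookup assumes that the PYTHON_DOM environment variable is not
set. The code runs on both Python 2.7 and Python 3.

```python
import xml.dom
from xml.dom import minidom

# Look up an implementation by its well-known name.
by_name = xml.dom.getDOMImplementation('minidom')

# Look up any implementation that offers the required features.
by_feature = xml.dom.getDOMImplementation(features=[('core', '2.0')])

def demo_factory():
    # A factory may return the same object every time or a new one.
    return minidom.getDOMImplementation()

# Register the factory under a made-up name, then fetch it by that name.
xml.dom.registerDOMImplementation('demo', demo_factory)
registered = xml.dom.getDOMImplementation('demo')
```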
Some convenience constants are also provided:
EMPTY_NAMESPACE~
The value used to indicate that no namespace is associated with a node in the
DOM. This is typically found as the namespaceURI of a node, or used as
the {namespaceURI} parameter to a namespaces-specific method.
.. versionadded:: 2.2
XML_NAMESPACE~
The namespace URI associated with the reserved prefix ``xml``, as defined by
`Namespaces in XML <http://www.w3.org/TR/REC-xml-names/>`_ (section 4).
.. versionadded:: 2.2
XMLNS_NAMESPACE~
The namespace URI for namespace declarations, as defined by `Document Object
Model (DOM) Level 2 Core Specification
<http://www.w3.org/TR/DOM-Level-2-Core/core.html>`_ (section 1.1.8).
.. versionadded:: 2.2
XHTML_NAMESPACE~
The URI of the XHTML namespace as defined by `XHTML 1.0: The Extensible
HyperText Markup Language <http://www.w3.org/TR/xhtml1/>`_ (section 3.1.1).
.. versionadded:: 2.2
In addition, xml.dom (|py2stdlib-xml.dom|) contains a base Node class and the DOM
exception classes. The Node class provided by this module does not
implement any of the methods or attributes defined by the DOM specification;
concrete DOM implementations must provide those. The Node class
provided as part of this module does provide the constants used for the
nodeType attribute on concrete Node objects; they are located
within the class rather than at the module level to conform with the DOM
specifications.
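Since the constants live on the Node class, a lookup table from numeric node
type to name can be derived from it. This is a minimal sketch;
``NODE_TYPE_NAMES`` is a hypothetical helper name, and the code runs on both
Python 2.7 and Python 3.

```python
import xml.dom
from xml.dom.minidom import parseString

# Map each numeric nodeType value to the name of its constant.
NODE_TYPE_NAMES = dict(
    (getattr(xml.dom.Node, name), name)
    for name in dir(xml.dom.Node)
    if name.endswith('_NODE')
)

doc = parseString('<root><!-- note -->text</root>')
kinds = [NODE_TYPE_NAMES[child.nodeType]
         for child in doc.documentElement.childNodes]
```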
Objects in the DOM
------------------
The definitive documentation for the DOM is the DOM specification from the W3C.
Note that DOM attributes may also be manipulated as nodes instead of as simple
strings. It is fairly rare that you must do this, however, so this usage is not
yet documented.
+--------------------------------+-----------------------------------+---------------------------------+
| Interface | Section | Purpose |
+================================+===================================+=================================+
| DOMImplementation | dom-implementation-objects | Interface to the underlying |
| | | implementation. |
+--------------------------------+-----------------------------------+---------------------------------+
| Node | dom-node-objects | Base interface for most objects |
| | | in a document. |
+--------------------------------+-----------------------------------+---------------------------------+
| NodeList | dom-nodelist-objects | Interface for a sequence of |
| | | nodes. |
+--------------------------------+-----------------------------------+---------------------------------+
| DocumentType | dom-documenttype-objects | Information about the |
| | | declarations needed to process |
| | | a document. |
+--------------------------------+-----------------------------------+---------------------------------+
| Document | dom-document-objects | Object which represents an |
| | | entire document. |
+--------------------------------+-----------------------------------+---------------------------------+
| Element | dom-element-objects | Element nodes in the document |
| | | hierarchy. |
+--------------------------------+-----------------------------------+---------------------------------+
| Attr | dom-attr-objects | Attribute value nodes on |
| | | element nodes. |
+--------------------------------+-----------------------------------+---------------------------------+
| Comment | dom-comment-objects | Representation of comments in |
| | | the source document. |
+--------------------------------+-----------------------------------+---------------------------------+
| Text | dom-text-objects | Nodes containing textual |
| | | content from the document. |
+--------------------------------+-----------------------------------+---------------------------------+
| ProcessingInstruction | dom-pi-objects | Processing instruction |
| | | representation. |
+--------------------------------+-----------------------------------+---------------------------------+
An additional section describes the exceptions defined for working with the DOM
in Python.
DOMImplementation Objects
^^^^^^^^^^^^^^^^^^^^^^^^^
The DOMImplementation interface provides a way for applications to
determine the availability of particular features in the DOM they are using.
DOM Level 2 added the ability to create new Document and
DocumentType objects using the DOMImplementation as well.
DOMImplementation.hasFeature(feature, version)~
Return true if the feature identified by the pair of strings {feature} and
{version} is implemented.
DOMImplementation.createDocument(namespaceUri, qualifiedName, doctype)~
Return a new Document object (the root of the DOM), with a child
Element object having the given {namespaceUri} and {qualifiedName}. The
{doctype} must be a DocumentType object created by
createDocumentType, or ``None``. In the Python DOM API, the first two
arguments can also be ``None`` in order to indicate that no Element
child is to be created.
DOMImplementation.createDocumentType(qualifiedName, publicId, systemId)~
Return a new DocumentType object that encapsulates the given
{qualifiedName}, {publicId}, and {systemId} strings, representing the
information contained in an XML document type declaration.
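As a rough illustration of the three methods above, the following sketch uses
xml.dom.minidom, one implementation of this interface, obtained through its
getDOMImplementation function. The XHTML identifiers are sample values chosen
for this example only.

```python
from xml.dom.minidom import getDOMImplementation

impl = getDOMImplementation()

# Ask the implementation whether it claims support for a feature/version pair.
has_core = impl.hasFeature("core", "2.0")

# Build a document type first, then a document whose root element uses it.
doctype = impl.createDocumentType(
    "html",
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
)
doc = impl.createDocument("http://www.w3.org/1999/xhtml", "html", doctype)
root_name = doc.documentElement.tagName

# Passing None for every argument asks for a document with no Element child.
empty_doc = impl.createDocument(None, None, None)
```

Other DOM implementations may report different features from hasFeature.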
Node Objects
^^^^^^^^^^^^
All of the components of an XML document are subclasses of Node.
Node.nodeType~
An integer representing the node type. Symbolic constants for the types are on
the Node object: ELEMENT_NODE, ATTRIBUTE_NODE,
TEXT_NODE, CDATA_SECTION_NODE, ENTITY_NODE,
PROCESSING_INSTRUCTION_NODE, COMMENT_NODE,
DOCUMENT_NODE, DOCUMENT_TYPE_NODE, NOTATION_NODE.
This is a read-only attribute.
Node.parentNode~
The parent of the current node, or ``None`` for the document node. The value is
always a Node object or ``None``. For Element nodes, this
will be the parent element, except for the root element, in which case it will
be the Document object. For Attr nodes, this is always
``None``. This is a read-only attribute.
Node.attributes~
A NamedNodeMap of attribute objects. Only elements have actual values
for this; others provide ``None`` for this attribute. This is a read-only
attribute.
Node.previousSibling~
The node that immediately precedes this one with the same parent. For
instance, this may be the element whose end-tag comes just before the {self}
element's start-tag. Of course, XML documents are made up of more than just
elements so the previous sibling could be text, a comment, or something else.
If this node is the first child of the parent, this attribute will be
``None``. This is a read-only attribute.
Node.nextSibling~
The node that immediately follows this one with the same parent. See also
previousSibling. If this is the last child of the parent, this
attribute will be ``None``. This is a read-only attribute.
Node.childNodes~
A list of nodes contained within this node. This is a read-only attribute.
Node.firstChild~
The first child of the node, if there are any, or ``None``. This is a read-only
attribute.
Node.lastChild~
The last child of the node, if there are any, or ``None``. This is a read-only
attribute.
Node.localName~
The part of the tagName following the colon if there is one, else the
entire tagName. The value is a string.
Node.prefix~
The part of the tagName preceding the colon if there is one, else the
empty string. The value is a string, or ``None``.
Node.namespaceURI~
The namespace associated with the element name. This will be a string or
``None``. This is a read-only attribute.
Node.nodeName~
This has a different meaning for each node type; see the DOM specification for
details. You can always get the information you would get here from another
property such as the tagName property for elements or the name
property for attributes. For all node types, the value of this attribute will be
either a string or ``None``. This is a read-only attribute.
Node.nodeValue~
This has a different meaning for each node type; see the DOM specification for
details. The situation is similar to that with nodeName. The value is
a string or ``None``.
Node.hasAttributes()~
Returns true if the node has any attributes.
Node.hasChildNodes()~
Returns true if the node has any child nodes.
Node.isSameNode(other)~
Returns true if {other} refers to the same node as this node. This is especially
useful for DOM implementations which use any sort of proxy architecture (because
more than one object can refer to the same node).
.. note:: >
This is based on a proposed DOM Level 3 API which is still in the "working
draft" stage, but this particular interface appears uncontroversial. Changes
from the W3C will not necessarily affect this method in the Python DOM interface
(though any new W3C API for this would also be supported).
<
Node.appendChild(newChild)~
Add a new child node to this node at the end of the list of
children, returning {newChild}. If the node was already in
the tree, it is removed first.
Node.insertBefore(newChild, refChild)~
Insert a new child node before an existing child. It must be the case that
{refChild} is a child of this node; if not, ValueError is raised.
{newChild} is returned. If {refChild} is ``None``, it inserts {newChild} at the
end of the children's list.
Node.removeChild(oldChild)~
Remove a child node. {oldChild} must be a child of this node; if not,
ValueError is raised. {oldChild} is returned on success. If {oldChild}
will not be used further, its unlink method should be called.
Node.replaceChild(newChild, oldChild)~
Replace an existing node with a new node. It must be the case that {oldChild}
is a child of this node; if not, ValueError is raised.
Node.normalize()~
Join adjacent text nodes so that all stretches of text are stored as single
Text instances. This simplifies processing text from a DOM tree for
many applications.
.. versionadded:: 2.1
Node.cloneNode(deep)~
Clone this node. Setting {deep} means to clone all child nodes as well. This
returns the clone.
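The navigation attributes and tree-modifying methods of Node can be tried out
with a small document. This is a sketch based on xml.dom.minidom; the sample
markup is made up for the example, and unlink is a minidom-specific method
rather than part of the W3C interface.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

doc = parseString("<list><item>one</item><!-- note --><item>two</item></list>")
root = doc.documentElement
first, comment, second = root.childNodes

# Read-only navigation attributes.
kinds = [child.nodeType for child in root.childNodes]
before_first = first.previousSibling      # no earlier sibling exists
after_first = first.nextSibling           # the comment node
root_parent = root.parentNode             # the Document object

# Each modifying call returns the node it was given.
third = root.appendChild(doc.createElement("item"))
third.appendChild(doc.createTextNode("three"))
zero = root.insertBefore(doc.createElement("item"), first)
removed = root.removeChild(comment)
removed.unlink()                          # minidom-specific cleanup

# Two adjacent text nodes are merged into one by normalize().
third.appendChild(doc.createTextNode("!"))
text_nodes_before = len(third.childNodes)
third.normalize()
text_nodes_after = len(third.childNodes)
merged_text = third.firstChild.data

# A deep clone copies the children too and starts out detached.
copy = root.cloneNode(True)
```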
NodeList Objects
^^^^^^^^^^^^^^^^
A NodeList represents a sequence of nodes. These objects are used in
two ways in the DOM Core recommendation: the Element object provides
one as its list of child nodes, and the getElementsByTagName and
getElementsByTagNameNS methods of Node return objects with this
interface to represent query results.
The DOM Level 2 recommendation defines one method and one attribute for these
objects:
NodeList.item(i)~
Return the {i}'th item from the sequence, if there is one, or ``None``. The
index {i} is not allowed to be less than zero or greater than or equal to the
length of the sequence.
NodeList.length~
The number of nodes in the sequence.
In addition, the Python DOM interface requires that some additional support is
provided to allow NodeList objects to be used as Python sequences. All
NodeList implementations must include support for __len__ and
__getitem__; this allows iteration over the NodeList in
for statements and proper support for the len built-in
function.
If a DOM implementation supports modification of the document, the
NodeList implementation must also support the __setitem__ and
__delitem__ methods.
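A short sketch of both access styles, using the NodeList that
xml.dom.minidom returns for childNodes (the element names are arbitrary):

```python
from xml.dom.minidom import parseString

doc = parseString("<r><a/><b/><c/></r>")
children = doc.documentElement.childNodes

via_item = children.item(1).tagName      # DOM-style access
via_index = children[1].tagName          # Python sequence access
out_of_range = children.item(10)         # item() returns None here
names = [node.tagName for node in children]
same_length = children.length == len(children)
```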
DocumentType Objects
^^^^^^^^^^^^^^^^^^^^
Information about the notations and entities declared by a document (including
the external subset if the parser uses it and can provide the information) is
available from a DocumentType object. The DocumentType for a
document is available from the Document object's doctype
attribute; if there is no ``DOCTYPE`` declaration for the document, the
document's doctype attribute will be set to ``None`` instead of an
instance of this interface.
DocumentType is a specialization of Node, and adds the
following attributes:
DocumentType.publicId~
The public identifier for the external subset of the document type definition.
This will be a string or ``None``.
DocumentType.systemId~
The system identifier for the external subset of the document type definition.
This will be a URI as a string, or ``None``.
DocumentType.internalSubset~
A string giving the complete internal subset from the document. This does not
include the brackets which enclose the subset. If the document has no internal
subset, this should be ``None``.
DocumentType.name~
The name of the root element as given in the ``DOCTYPE`` declaration, if
present.
DocumentType.entities~
This is a NamedNodeMap giving the definitions of external entities.
For entity names defined more than once, only the first definition is provided
(others are ignored as required by the XML recommendation). This may be
``None`` if the information is not provided by the parser, or if no entities are
defined.
DocumentType.notations~
This is a NamedNodeMap giving the definitions of notations. For
notation names defined more than once, only the first definition is provided
(others are ignored as required by the XML recommendation). This may be
``None`` if the information is not provided by the parser, or if no notations
are defined.
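The attributes above can be inspected on a parsed document. The sketch below
assumes xml.dom.minidom as the implementation; the DTD name and the entity are
invented for the example, and how much entity information is exposed depends
on the parser.

```python
from xml.dom.minidom import parseString

source = (
    '<!DOCTYPE note SYSTEM "note.dtd" [\n'
    '  <!ENTITY author "A. Writer">\n'
    ']>\n'
    '<note>text</note>'
)
doctype = parseString(source).doctype

name = doctype.name
public_id = doctype.publicId             # no PUBLIC identifier was given
system_id = doctype.systemId
subset = doctype.internalSubset          # text between the brackets, if kept

# Entity declarations, if the parser provides them.
entities = doctype.entities
entity_names = [entities.item(i).nodeName for i in range(entities.length)]

# A document without a DOCTYPE declaration has no DocumentType object.
no_doctype = parseString("<note/>").doctype
```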
Document Objects
^^^^^^^^^^^^^^^^
A Document represents an entire XML document, including its constituent
elements, attributes, processing instructions, comments etc. Remember that it
inherits properties from Node.
Document.documentElement~
The one and only root element of the document.
Document.createElement(tagName)~
Create and return a new element node. The element is not inserted into the
document when it is created. You need to explicitly insert it with one of the
other methods such as insertBefore or appendChild.
Document.createElementNS(namespaceURI, tagName)~
Create and return a new element with a namespace. The {tagName} may have a
prefix. The element is not inserted into the document when it is created. You
need to explicitly insert it with one of the other methods such as
insertBefore or appendChild.
Document.createTextNode(data)~
Create and return a text node containing the data passed as a parameter. As
with the other creation methods, this one does not insert the node into the
tree.
Document.createComment(data)~
Create and return a comment node containing the data passed as a parameter. As
with the other creation methods, this one does not insert the node into the
tree.
Document.createProcessingInstruction(target, data)~
Create and return a processing instruction node containing the {target} and
{data} passed as parameters. As with the other creation methods, this one does
not insert the node into the tree.
Document.createAttribute(name)~
Create and return an attribute node. This method does not associate the
attribute node with any particular element. You must use
setAttributeNode on the appropriate Element object to use the
newly created attribute instance.
Document.createAttributeNS(namespaceURI, qualifiedName)~
Create and return an attribute node with a namespace. The {qualifiedName} may have a
prefix. This method does not associate the attribute node with any particular
element. You must use setAttributeNode on the appropriate
Element object to use the newly created attribute instance.
Document.getElementsByTagName(tagName)~
Search for all descendants (direct children, children's children, etc.) with a
particular element type name.
Document.getElementsByTagNameNS(namespaceURI, localName)~
Search for all descendants (direct children, children's children, etc.) with a
particular namespace URI and localname. The localname is the part of the
qualified name after the prefix.
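The creation methods only build detached nodes; attaching them is a separate
step. A minimal sketch with xml.dom.minidom follows. The element names are
made up, and toxml is a minidom convenience method, not part of the interface
described here.

```python
from xml.dom.minidom import getDOMImplementation

doc = getDOMImplementation().createDocument(None, "catalog", None)
root = doc.documentElement

book = doc.createElement("book")
detached = book.parentNode is None       # not in the tree yet
root.appendChild(book)

title = doc.createElement("title")
title.appendChild(doc.createTextNode("Dune"))
book.appendChild(title)
root.insertBefore(doc.createComment(" generated "), book)

# An attribute node must be attached with setAttributeNode before it counts.
attr = doc.createAttribute("id")
attr.value = "b1"
book.setAttributeNode(attr)

titles = doc.getElementsByTagName("title")
xml_text = root.toxml()
```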
Element Objects
^^^^^^^^^^^^^^^
Element is a subclass of Node, so inherits all the attributes
of that class.
Element.tagName~
The element type name. In a namespace-using document it may have colons in it.
The value is a string.
Element.getElementsByTagName(tagName)~
Same as equivalent method in the Document class.
Element.getElementsByTagNameNS(namespaceURI, localName)~
Same as equivalent method in the Document class.
Element.hasAttribute(name)~
Returns true if the element has an attribute named by {name}.
Element.hasAttributeNS(namespaceURI, localName)~
Returns true if the element has an attribute named by {namespaceURI} and
{localName}.
Element.getAttribute(name)~
Return the value of the attribute named by {name} as a string. If no such
attribute exists, an empty string is returned, as if the attribute had no value.
Element.getAttributeNode(attrname)~
Return the Attr node for the attribute named by {attrname}.
Element.getAttributeNS(namespaceURI, localName)~
Return the value of the attribute named by {namespaceURI} and {localName} as a
string. If no such attribute exists, an empty string is returned, as if the
attribute had no value.
Element.getAttributeNodeNS(namespaceURI, localName)~
Return an attribute value as a node, given a {namespaceURI} and {localName}.
Element.removeAttribute(name)~
Remove an attribute by name. If there is no matching attribute, a
NotFoundErr is raised.
Element.removeAttributeNode(oldAttr)~
Remove and return {oldAttr} from the attribute list, if present. If {oldAttr} is
not present, NotFoundErr is raised.
Element.removeAttributeNS(namespaceURI, localName)~
Remove an attribute by name. Note that it uses a localName, not a qname. No
exception is raised if there is no matching attribute.
Element.setAttribute(name, value)~
Set an attribute value from a string.
Element.setAttributeNode(newAttr)~
Add a new attribute node to the element, replacing an existing attribute if
necessary if the name attribute matches. If a replacement occurs, the
old attribute node will be returned. If {newAttr} is already in use,
InuseAttributeErr will be raised.
Element.setAttributeNodeNS(newAttr)~
Add a new attribute node to the element, replacing an existing attribute if
necessary if the namespaceURI and localName attributes match.
If a replacement occurs, the old attribute node will be returned. If {newAttr}
is already in use, InuseAttributeErr will be raised.
Element.setAttributeNS(namespaceURI, qname, value)~
Set an attribute value from a string, given a {namespaceURI} and a {qname}.
Note that a qname is the whole attribute name, including any prefix. This
differs from the lookup methods above, which take a {localName}.
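The difference between plain and namespace-aware attribute access can be
sketched as follows, again assuming xml.dom.minidom. The namespace URI
``urn:example`` and the attribute names are placeholders.

```python
from xml.dom.minidom import parseString

doc = parseString('<img src="a.png" xmlns:x="urn:example" x:role="icon"/>')
img = doc.documentElement

src = img.getAttribute("src")
missing = img.getAttribute("alt")        # empty string, not an error
had_alt = img.hasAttribute("alt")
img.setAttribute("alt", "logo")

# Lookups take the local name; setAttributeNS takes the full qname.
role = img.getAttributeNS("urn:example", "role")
img.setAttributeNS("urn:example", "x:size", "16")
size = img.getAttributeNS("urn:example", "size")

img.removeAttribute("src")
has_src = img.hasAttribute("src")
```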
Attr Objects
^^^^^^^^^^^^
Attr inherits from Node, so inherits all its attributes.
Attr.name~
The attribute name.
In a namespace-using document it may include a colon.
Attr.localName~
The part of the name following the colon if there is one, else the
entire name.
This is a read-only attribute.
Attr.prefix~
The part of the name preceding the colon if there is one, else the
empty string.
Attr.value~
The text value of the attribute. This is a synonym for the
nodeValue attribute.
NamedNodeMap Objects
^^^^^^^^^^^^^^^^^^^^
NamedNodeMap does {not} inherit from Node.
NamedNodeMap.length~
The length of the attribute list.
NamedNodeMap.item(index)~
Return an attribute with a particular index. The order you get the attributes
in is arbitrary but will be consistent for the life of a DOM. Each item is an
attribute node. Get its value with the value attribute.
There are also experimental methods that give this class more mapping behavior.
You can use them or you can use the standardized getAttribute\* family
of methods on the Element objects.
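A brief sketch of walking a NamedNodeMap with length and item, and of reading
an Attr node, using xml.dom.minidom. The results are sorted because the order
of attributes is arbitrary.

```python
from xml.dom.minidom import parseString

elem = parseString('<p id="x" lang="en"/>').documentElement
attrs = elem.attributes

# Collect (name, value) pairs through the standardized length/item API.
pairs = sorted(
    (attrs.item(i).name, attrs.item(i).value) for i in range(attrs.length)
)

lang = elem.getAttributeNode("lang")
lang_name = lang.name
same_value = lang.value == lang.nodeValue
```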
Comment Objects
^^^^^^^^^^^^^^^
Comment represents a comment in the XML document. It is a subclass of
Node, but cannot have child nodes.
Comment.data~
The content of the comment as a string. The attribute contains all characters
between the leading ``<!--`` and trailing ``-->``, but does not
include them.
Text and CDATASection Objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The Text interface represents text in the XML document. If the parser
and DOM implementation support the DOM's XML extension, portions of the text
enclosed in CDATA marked sections are stored in CDATASection objects.
These two interfaces are identical, but provide different values for the
nodeType attribute.
These interfaces extend the Node interface. They cannot have child
nodes.
Text.data~
The content of the text node as a string.
.. note::
The use of a CDATASection node does not indicate that the node
represents a complete CDATA marked section, only that the content of the node
was part of a CDATA section. A single CDATA section may be represented by more
than one node in the document tree. There is no way to determine whether two
adjacent CDATASection nodes represent different CDATA marked sections.
ProcessingInstruction Objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Represents a processing instruction in the XML document; this inherits from the
Node interface and cannot have child nodes.
ProcessingInstruction.target~
The content of the processing instruction up to the first whitespace character.
This is a read-only attribute.
ProcessingInstruction.data~
The content of the processing instruction following the first whitespace
character.
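The comment, text, CDATA and processing instruction node types can be told
apart by nodeType. The sketch below assumes xml.dom.minidom, which keeps CDATA
sections as separate nodes; the ``render`` target and the content are invented
for the example.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

doc = parseString(
    '<?render mode="fast"?><r><!-- hi -->text<![CDATA[<raw>]]></r>'
)

# The processing instruction precedes the root element in the document.
pi = doc.firstChild
pi_target = pi.target
pi_data = pi.data

comment, text, cdata = doc.documentElement.childNodes
kinds = [comment.nodeType, text.nodeType, cdata.nodeType]
contents = [comment.data, text.data, cdata.data]
```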
Exceptions
^^^^^^^^^^
.. versionadded:: 2.1
The DOM Level 2 recommendation defines a single exception, DOMException,
and a number of constants that allow applications to determine what sort of
error occurred. DOMException instances carry a code attribute
that provides the appropriate value for the specific exception.
The Python DOM interface provides the constants, but also expands the set of
exceptions so that a specific exception exists for each of the exception codes
defined by the DOM. The implementations must raise the appropriate specific
exception, each of which carries the appropriate value for the code
attribute.
DOMException~
Base exception class used for all specific DOM exceptions. This exception class
cannot be directly instantiated.
DomstringSizeErr~
Raised when a specified range of text does not fit into a string. This is not
known to be used in the Python DOM implementations, but may be received from DOM
implementations not written in Python.
HierarchyRequestErr~
Raised when an attempt is made to insert a node where the node type is not
allowed.
IndexSizeErr~
Raised when an index or size parameter to a method is negative or exceeds the
allowed values.
InuseAttributeErr~
Raised when an attempt is made to insert an Attr node that is already
present elsewhere in the document.
InvalidAccessErr~
Raised if a parameter or an operation is not supported on the underlying object.
InvalidCharacterErr~
This exception is raised when a string parameter contains a character that is
not permitted in the context it's being used in by the XML 1.0 recommendation.
For example, attempting to create an Element node with a space in the
element type name will cause this error to be raised.
InvalidModificationErr~
Raised when an attempt is made to modify the type of a node.
InvalidStateErr~
Raised when an attempt is made to use an object that is not defined or is no
longer usable.
NamespaceErr~
If an attempt is made to change any object in a way that is not permitted with
regard to the `Namespaces in XML <http://www.w3.org/TR/REC-xml-names/>`_
recommendation, this exception is raised.
NotFoundErr~
Raised when a node does not exist in the referenced context. For example,
NamedNodeMap.removeNamedItem will raise this if the node passed in does
not exist in the map.
NotSupportedErr~
Raised when the implementation does not support the requested type of object or
operation.
NoDataAllowedErr~
This is raised if data is specified for a node which does not support data.
NoModificationAllowedErr~
Raised on attempts to modify an object where modifications are not allowed (such
as for read-only nodes).
SyntaxErr~
Raised when an invalid or illegal string is specified.
WrongDocumentErr~
Raised when a node is inserted in a different document than it currently belongs
to, and the implementation does not support migrating the node from one document
to the other.
The exception codes defined in the DOM recommendation map to the exceptions
described above according to this table:
+--------------------------------------+---------------------------------+
| Constant | Exception |
+======================================+=================================+
| DOMSTRING_SIZE_ERR | DomstringSizeErr |
+--------------------------------------+---------------------------------+
| HIERARCHY_REQUEST_ERR | HierarchyRequestErr |
+--------------------------------------+---------------------------------+
| INDEX_SIZE_ERR | IndexSizeErr |
+--------------------------------------+---------------------------------+
| INUSE_ATTRIBUTE_ERR | InuseAttributeErr |
+--------------------------------------+---------------------------------+
| INVALID_ACCESS_ERR | InvalidAccessErr |
+--------------------------------------+---------------------------------+
| INVALID_CHARACTER_ERR | InvalidCharacterErr |
+--------------------------------------+---------------------------------+
| INVALID_MODIFICATION_ERR | InvalidModificationErr |
+--------------------------------------+---------------------------------+
| INVALID_STATE_ERR | InvalidStateErr |
+--------------------------------------+---------------------------------+
| NAMESPACE_ERR | NamespaceErr |
+--------------------------------------+---------------------------------+
| NOT_FOUND_ERR | NotFoundErr |
+--------------------------------------+---------------------------------+
| NOT_SUPPORTED_ERR | NotSupportedErr |
+--------------------------------------+---------------------------------+
| NO_DATA_ALLOWED_ERR | NoDataAllowedErr |
+--------------------------------------+---------------------------------+
| NO_MODIFICATION_ALLOWED_ERR | NoModificationAllowedErr |
+--------------------------------------+---------------------------------+
| SYNTAX_ERR | SyntaxErr |
+--------------------------------------+---------------------------------+
| WRONG_DOCUMENT_ERR | WrongDocumentErr |
+--------------------------------------+---------------------------------+
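The relationship between the specific exception classes and the code constants
can be checked directly. This sketch relies on the behaviour of
xml.dom.minidom for the two failing operations; other implementations may
raise in different situations.

```python
import xml.dom
from xml.dom.minidom import parseString

doc = parseString("<r/>")
root = doc.documentElement
caught = []

# Catching the base class still gives access to the specific code.
try:
    root.removeAttribute("missing")
except xml.dom.DOMException as exc:
    caught.append((type(exc).__name__, exc.code))

# A document may hold only one root element.
try:
    doc.appendChild(doc.createElement("second"))
except xml.dom.HierarchyRequestErr as exc:
    caught.append((type(exc).__name__, exc.code))
```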
Conformance
-----------
This section describes the conformance requirements and relationships between
the Python DOM API, the W3C DOM recommendations, and the OMG IDL mapping for
Python.
Type Mapping
^^^^^^^^^^^^
The primitive IDL types used in the DOM specification are mapped to Python types
according to the following table.
+------------------+-------------------------------------------+
| IDL Type | Python Type |
+==================+===========================================+
| ``boolean`` | ``IntegerType`` (with a value of ``0`` or |
| | ``1``) |
+------------------+-------------------------------------------+
| ``int`` | ``IntegerType`` |
+------------------+-------------------------------------------+
| ``long int`` | ``IntegerType`` |
+------------------+-------------------------------------------+
| ``unsigned int`` | ``IntegerType`` |
+------------------+-------------------------------------------+
Additionally, the DOMString defined in the recommendation is mapped to
a Python string or Unicode string. Applications should be able to handle
Unicode whenever a string is returned from the DOM.
The IDL ``null`` value is mapped to ``None``, which may be accepted or
provided by the implementation whenever ``null`` is allowed by the API.
Accessor Methods
^^^^^^^^^^^^^^^^
The mapping from OMG IDL to Python defines accessor functions for IDL
``attribute`` declarations in much the way the Java mapping does.
Mapping the IDL declarations :: >
readonly attribute string someValue;
attribute string anotherValue;
<
yields three accessor functions: a "get" method for someValue
(_get_someValue), and "get" and "set" methods for anotherValue
(_get_anotherValue and _set_anotherValue). The mapping, in
particular, does not require that the IDL attributes are accessible as normal
Python attributes: ``object.someValue`` is {not} required to work, and may
raise an AttributeError.
The Python DOM API, however, {does} require that normal attribute access work.
This means that the typical surrogates generated by Python IDL compilers are not
likely to work, and wrapper objects may be needed on the client if the DOM
objects are accessed via CORBA. While this does require some additional
consideration for CORBA DOM clients, the implementers with experience using DOM
over CORBA from Python do not consider this a problem. Attributes that are
declared ``readonly`` may not restrict write access in all DOM
implementations.
In the Python DOM API, accessor functions are not required. If provided, they
should take the form defined by the Python IDL mapping, but these methods are
considered unnecessary since the attributes are accessible directly from Python.
"Set" accessors should never be provided for ``readonly`` attributes.
The IDL definitions do not fully embody the requirements of the W3C DOM API,
such as the notion of certain objects, such as the return value of
getElementsByTagName, being "live". The Python DOM API does not require
implementations to enforce such requirements.
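To make the last point concrete: in xml.dom.minidom, as far as this sketch
assumes, the result of getElementsByTagName is a snapshot and is not "live",
while childNodes does reflect later changes. Attribute values are read with
ordinary attribute access throughout.

```python
from xml.dom.minidom import parseString

doc = parseString("<r><a/></r>")
root = doc.documentElement

tag = root.tagName                       # plain attribute access

found = doc.getElementsByTagName("a")    # query result taken now
kids = root.childNodes                   # the element's own child list
root.appendChild(doc.createElement("a"))

found_count = len(found)                 # unchanged by the append in minidom
kids_count = len(kids)                   # includes the new child
```

Code that needs up-to-date query results should therefore call
getElementsByTagName again after modifying the tree.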
==============================================================================
*py2stdlib-xml.etree.elementtree*
xml.etree.ElementTree~
:synopsis: Implementation of the ElementTree API.
.. versionadded:: 2.5
The Element type is a flexible container object, designed to store
hierarchical data structures in memory. The type can be described as a cross
between a list and a dictionary.
Each element has a number of properties associated with it:
* a tag which is a string identifying what kind of data this element represents
(the element type, in other words).
* a number of attributes, stored in a Python dictionary.
* a text string.
* an optional tail string.
* a number of child elements, stored in a Python sequence.
To create an element instance, use the Element constructor or the
SubElement factory function.
The ElementTree class can be used to wrap an element structure, and
convert it from and to XML.
A C implementation of this API is available as xml.etree.cElementTree.
See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs. Fredrik Lundh's page is also the location of the development version of
the xml.etree.ElementTree.
.. versionchanged:: 2.7
The ElementTree API is updated to 1.3. For more information, see
`Introducing ElementTree 1.3
<http://effbot.org/zone/elementtree-13-intro.htm>`_.
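The "cross between a list and a dictionary" description can be seen in a small
sketch. The tag and attribute names are arbitrary examples.

```python
import xml.etree.ElementTree as ET

root = ET.Element("library", {"name": "city"})
book = ET.SubElement(root, "book", id="1")   # created and appended in one step
book.text = "Dune"
book.tail = "\n"

# List-like access to children, dictionary-like access to attributes.
child_count = len(root)
first_child = root[0]
library_name = root.get("name")

# Wrap the structure in an ElementTree, or serialize the element directly.
tree = ET.ElementTree(root)
serialized = ET.tostring(root)
```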
Functions
---------
Comment(text=None)~
Comment element factory. This factory function creates a special element
that will be serialized as an XML comment by the standard serializer. The
comment string can be either a bytestring or a Unicode string. {text} is a
string containing the comment string. Returns an element instance
representing a comment.
dump(elem)~
Writes an element tree or element structure to sys.stdout. This function
should be used for debugging only.
The exact output format is implementation dependent. In this version, it's
written as an ordinary XML file.
{elem} is an element tree or an individual element.
fromstring(text)~
Parses an XML section from a string constant. Same as XML. {text}
is a string containing XML data. Returns an Element instance.
fromstringlist(sequence, parser=None)~
Parses an XML document from a sequence of string fragments. {sequence} is a
list or other sequence containing XML data fragments. {parser} is an
optional parser instance. If not given, the standard XMLParser
parser is used. Returns an Element instance.
.. versionadded:: 2.7
iselement(element)~
Checks if an object appears to be a valid element object. {element} is an
element instance. Returns a true value if this is an element object.
iterparse(source, events=None, parser=None)~
Parses an XML section into an element tree incrementally, and reports what's
going on to the user. {source} is a filename or file object containing XML
data. {events} is a list of events to report back. If omitted, only "end"
events are reported. {parser} is an optional parser instance. If not
given, the standard XMLParser parser is used. Returns an
iterator providing ``(event, elem)`` pairs.
.. note:: >
iterparse only guarantees that it has seen the ">"
character of a starting tag when it emits a "start" event, so the
attributes are defined, but the contents of the text and tail attributes
are undefined at that point. The same applies to the element children;
they may or may not be present.
If you need a fully populated element, look for "end" events instead.
<
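A minimal sketch of incremental parsing that follows the advice in the note
and reads element text only on "end" events. The in-memory file object and the
tag names are assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET

source = io.BytesIO(b"<log><entry>a</entry><entry>b</entry></log>")

events = []
texts = []
for event, elem in ET.iterparse(source, events=("start", "end")):
    events.append((event, elem.tag))
    if event == "end" and elem.tag == "entry":
        texts.append(elem.text)          # fully populated at "end"
        elem.clear()                     # optional: release processed content
```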
parse(source, parser=None)~
Parses an XML section into an element tree. {source} is a filename or file
object containing XML data. {parser} is an optional parser instance. If
not given, the standard XMLParser parser is used. Returns an
ElementTree instance.
ProcessingInstruction(target, text=None)~
PI element factory. This factory function creates a special element that
will be serialized as an XML processing instruction. {target} is a string
containing the PI target. {text} is a string containing the PI contents, if
given. Returns an element instance, representing a processing instruction.
register_namespace(prefix, uri)~
Registers a namespace prefix. The registry is global, and any existing
mapping for either the given prefix or the namespace URI will be removed.
{prefix} is a namespace prefix. {uri} is a namespace uri. Tags and
attributes in this namespace will be serialized with the given prefix, if at
all possible.
.. versionadded:: 2.7
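A sketch of the effect on serialization; the prefix ``ex`` and the URI are
placeholders. Because the registry is global, the registration affects all
later serialization in the same process.

```python
import xml.etree.ElementTree as ET

ET.register_namespace("ex", "http://example.com/ns")

# Tags in a namespace are written in {uri}local form.
elem = ET.Element("{http://example.com/ns}item")
out = ET.tostring(elem)
```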
SubElement(parent, tag, attrib={}, **extra)~
Subelement factory. This function creates an element instance, and appends
it to an existing element.
The element name, attribute names, and attribute values can be either
bytestrings or Unicode strings. {parent} is the parent element. {tag} is
the subelement name. {attrib} is an optional dictionary, containing element
attributes. {extra} contains additional attributes, given as keyword
arguments. Returns an element instance.
tostring(element, encoding="us-ascii", method="xml")~
Generates a string representation of an XML element, including all
subelements. {element} is an Element instance. {encoding} [1]_ is
the output encoding (default is US-ASCII). {method} is either ``"xml"``,
``"html"`` or ``"text"`` (default is ``"xml"``). Returns an encoded string
containing the XML data.
tostringlist(element, encoding="us-ascii", method="xml")~
Generates a string representation of an XML element, including all
subelements. {element} is an Element instance. {encoding} [1]_ is
the output encoding (default is US-ASCII). {method} is either ``"xml"``,
``"html"`` or ``"text"`` (default is ``"xml"``). Returns a list of encoded
strings containing the XML data. It does not guarantee any specific
sequence, except that ``"".join(tostringlist(element)) ==
tostring(element)``.
.. versionadded:: 2.7
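The two serialization functions and the {method} argument can be compared
on a small element. The markup is an arbitrary example, and the fragments
are joined with a byte string so that the sketch also runs where the result
is a list of bytes objects.

```python
import xml.etree.ElementTree as ET

root = ET.XML("<p>Hello <b>world</b>!</p>")

as_xml = ET.tostring(root)
as_text = ET.tostring(root, method="text")   # text content only, no markup

# The fragments are only guaranteed to join up to the tostring() result.
parts = ET.tostringlist(root)
joined = b"".join(parts)
```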
XML(text, parser=None)~
Parses an XML section from a string constant. This function can be used to
embed "XML literals" in Python code. {text} is a string containing XML
data. {parser} is an optional parser instance. If not given, the standard
XMLParser parser is used. Returns an Element instance.
XMLID(text, parser=None)~
Parses an XML section from a string constant, and also returns a dictionary
   which maps from element ids to elements. {text} is a string containing XML
data. {parser} is an optional parser instance. If not given, the standard
XMLParser parser is used. Returns a tuple containing an
Element instance and a dictionary.
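The string-parsing functions can be compared side by side. This is a sketch
with made-up markup; the dictionary returned by XMLID is keyed by the value of
each element's ``id`` attribute.

```python
import xml.etree.ElementTree as ET

text = '<doc><sec id="intro">Hi</sec><sec id="end">Bye</sec></doc>'

root = ET.XML(text)
same = ET.fromstring(text)               # same behaviour as XML()
pieces = ET.fromstringlist(["<doc>", "<sec/>", "</doc>"])

# XMLID returns the root element together with an id-to-element mapping.
tree_root, ids = ET.XMLID(text)
intro_text = ids["intro"].text
```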
Element Objects
---------------
Element(tag, attrib={}, **extra)~
Element class. This class defines the Element interface, and provides a
reference implementation of this interface.
The element name, attribute names, and attribute values can be either
bytestrings or Unicode strings. {tag} is the element name. {attrib} is
an optional dictionary, containing element attributes. {extra} contains
additional attributes, given as keyword arguments.
tag~
A string identifying what kind of data this element represents (the
element type, in other words).
text~
The {text} attribute can be used to hold additional data associated with
the element. As the name implies this attribute is usually a string but
may be any application-specific object. If the element is created from
an XML file the attribute will contain any text found between the element
tags.
tail~
The {tail} attribute can be used to hold additional data associated with
the element. This attribute is usually a string but may be any
application-specific object. If the element is created from an XML file
the attribute will contain any text found after the element's end tag and
before the next tag.
attrib~
A dictionary containing the element's attributes. Note that while the
{attrib} value is always a real mutable Python dictionary, an ElementTree
implementation may choose to use another internal representation, and
create the dictionary only if someone asks for it. To take advantage of
such implementations, use the dictionary methods below whenever possible.
The following dictionary-like methods work on the element attributes.
clear()~
Resets an element. This function removes all subelements, clears all
attributes, and sets the text and tail attributes to None.
get(key, default=None)~
Gets the element attribute named {key}.
Returns the attribute value, or {default} if the attribute was not found.
items()~
Returns the element attributes as a sequence of (name, value) pairs. The
attributes are returned in an arbitrary order.
keys()~
Returns the element's attribute names as a list. The names are returned
in an arbitrary order.
set(key, value)~
Set the attribute {key} on the element to {value}.
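The dictionary-like methods above might be used as in this small sketch. The
tag and attribute names are illustrative only. The results of keys() and
items() are sorted before use because their order is arbitrary.

```python
import xml.etree.ElementTree as ET

# attrib is passed as a dictionary, title as an extra keyword argument
link = ET.Element("a", {"href": "http://example.org/"}, title="Example")

link.set("target", "blank")          # add or overwrite an attribute
href = link.get("href")              # existing attribute
missing = link.get("rel", "none")    # absent attribute, so the default

names = sorted(link.keys())
pairs = sorted(link.items())
```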
The following methods work on the element's children (subelements).
append(subelement)~
Adds the element {subelement} to the end of this element's internal list
of subelements.
extend(subelements)~
Appends {subelements} from a sequence object with zero or more elements.
Raises AssertionError if a subelement is not a valid object.
.. versionadded:: 2.7
find(match)~
Finds the first subelement matching {match}. {match} may be a tag name
or path. Returns an element instance or ``None``.
findall(match)~
Finds all matching subelements, by tag name or path. Returns a list
containing all matching elements in document order.
findtext(match, default=None)~
Finds text for the first subelement matching {match}. {match} may be
a tag name or path. Returns the text content of the first matching
element, or {default} if no element was found. Note that if the matching
element has no text content an empty string is returned.
getchildren()~
.. deprecated:: 2.7
Use ``list(elem)`` or iteration.
getiterator(tag=None)~
.. deprecated:: 2.7
Use method Element.iter instead.
insert(index, element)~
Inserts a subelement at the given position in this element.
iter(tag=None)~
Creates a tree iterator with the current element as the root.
The iterator iterates over this element and all elements below it, in
document (depth first) order. If {tag} is not ``None`` or ``'*'``, only
elements whose tag equals {tag} are returned from the iterator. If the
tree structure is modified during iteration, the result is undefined.
iterfind(match)~
Finds all matching subelements, by tag name or path. Returns an iterable
yielding all matching elements in document order.
.. versionadded:: 2.7
itertext()~
Creates a text iterator. The iterator loops over this element and all
subelements, in document order, and returns all inner text.
.. versionadded:: 2.7
makeelement(tag, attrib)~
Creates a new element object of the same type as this element. Do not
call this method; use the SubElement factory function instead.
remove(subelement)~
Removes {subelement} from the element. Unlike the find* methods this
method compares elements based on the instance identity, not on tag value
or contents.
Element objects also support the following sequence type methods
for working with subelements: __delitem__, __getitem__,
__setitem__, __len__.
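A short sketch of the child-related methods, using an invented document.
It contrasts find/findall, which look only at the given path, with iter,
which walks the whole subtree, and shows that remove works on identity.

```python
import xml.etree.ElementTree as ET

root = ET.XML("<body><p>one</p><div><p>two</p></div><p/></body>")

first = root.find("p")                 # first direct child named p
direct = root.findall("p")             # direct children only
nested = root.findall("div/p")         # simple path expression
every_p = list(root.iter("p"))         # all p elements at any depth

text = root.findtext("p")              # text of the first match
absent = root.findtext("h1", "n/a")    # no match, so the default

all_text = "".join(root.itertext())    # inner text in document order

root.remove(first)                     # removes that exact instance
remaining = len(root)                  # sequence protocol: child count
```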
Caution: Elements with no subelements will test as ``False``. This behavior
will change in future versions. Use a specific ``len(elem)`` or ``elem is
None`` test instead. :: >
element = root.find('foo')
if not element:  # careful!
    print "element not found, or element has no subelements"
if element is None:
    print "element not found"
<
ElementTree Objects
-------------------
ElementTree(element=None, file=None)~
ElementTree wrapper class. This class represents an entire element
hierarchy, and adds some extra support for serialization to and from
standard XML.
{element} is the root element. The tree is initialized with the contents
of the XML {file} if given.
_setroot(element)~
Replaces the root element for this tree. This discards the current
contents of the tree, and replaces it with the given element. Use with
care. {element} is an element instance.
find(match)~
Finds the first toplevel element matching {match}. {match} may be a tag
name or path. Same as getroot().find(match). Returns the first matching
element, or ``None`` if no element was found.
findall(match)~
Finds all matching subelements, by tag name or path. Same as
getroot().findall(match). {match} may be a tag name or path. Returns a
list containing all matching elements, in document order.
findtext(match, default=None)~
Finds the element text for the first toplevel element with given tag.
Same as getroot().findtext(match). {match} may be a tag name or path.
{default} is the value to return if the element was not found. Returns
the text content of the first matching element, or the default value if no
element was found. Note that if the element is found, but has no text
content, this method returns an empty string.
getiterator(tag=None)~
.. deprecated:: 2.7
Use method ElementTree.iter instead.
getroot()~
Returns the root element for this tree.
iter(tag=None)~
Creates and returns a tree iterator for the root element. The iterator
loops over all elements in this tree, in document order. {tag} is the tag
to look for (default is to return all elements).
iterfind(match)~
Finds all matching subelements, by tag name or path. Same as
getroot().iterfind(match). Returns an iterable yielding all matching
elements in document order.
.. versionadded:: 2.7
parse(source, parser=None)~
Loads an external XML section into this element tree. {source} is a file
name or file object. {parser} is an optional parser instance. If not
given, the standard XMLParser parser is used. Returns the section
root element.
write(file, encoding="us-ascii", xml_declaration=None, method="xml")~
Writes the element tree to a file, as XML. {file} is a file name, or a
file object opened for writing. {encoding} [1]_ is the output encoding
(default is US-ASCII). {xml_declaration} controls if an XML declaration
should be added to the file. Use False for never, True for always, None
for only if not US-ASCII or UTF-8 (default is None). {method} is either
``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``).
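The effect of {encoding} and {xml_declaration} can be observed with a sketch
like the following. It writes to an in-memory io.BytesIO object instead of a
real file, which is an assumption made only to keep the example
self-contained. The element names are invented.

```python
import io
import xml.etree.ElementTree as ET

root = ET.Element("root")
ET.SubElement(root, "item").text = "x"
tree = ET.ElementTree(root)

# Default settings: US-ASCII output, so no XML declaration is written.
plain = io.BytesIO()
tree.write(plain)

# Explicit encoding plus a forced XML declaration.
declared = io.BytesIO()
tree.write(declared, encoding="utf-8", xml_declaration=True)

plain_bytes = plain.getvalue()
declared_bytes = declared.getvalue()
```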
This is the XML file that is going to be manipulated:: >
<html>
    <head>
        <title>Example page</title>
    </head>
    <body>
        <p>Moved to <a href="http://example.org/">example.org</a>
        or <a href="http://example.com/">example.com</a>.</p>
    </body>
</html>
<
Example of changing the attribute "target" of every link in the first paragraph:: >
>>> from xml.etree.ElementTree import ElementTree
>>> tree = ElementTree()
>>> tree.parse("index.xhtml")
<Element 'html' at 0xb77e6fac>
>>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
>>> p
<Element 'p' at 0xb77ec26c>
>>> links = list(p.iter("a"))   # Returns list of all links
>>> links
[<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
>>> for i in links:             # Iterates through all found links
...     i.attrib["target"] = "blank"
>>> tree.write("output.xhtml")
<
QName Objects
-------------
QName(text_or_uri, tag=None)~
QName wrapper. This can be used to wrap a QName attribute value, in order
to get proper namespace handling on output. {text_or_uri} is a string
containing the QName value, in the form {uri}local, or, if the tag argument
is given, the URI part of a QName. If {tag} is given, the first argument is
interpreted as a URI, and this argument is interpreted as a local name.
QName instances are opaque.
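The two calling conventions of QName can be compared with a minimal sketch.
The namespace URI is hypothetical, and str() is used here only to compare
the two wrapped values.

```python
import xml.etree.ElementTree as ET

NS = "http://example.org/ns"        # hypothetical namespace URI

q1 = ET.QName("{%s}item" % NS)      # one argument: the {uri}local form
q2 = ET.QName(NS, "item")           # two arguments: URI, then local name

# A QName can be used as an element tag; the serializer then emits a
# namespace declaration for the URI.
elem = ET.Element(q2)
serialized = ET.tostring(elem)
```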
TreeBuilder Objects
-------------------
TreeBuilder(element_factory=None)~
Generic element structure builder. This builder converts a sequence of
start, data, and end method calls to a well-formed element structure. You
can use this class to build an element structure using a custom XML parser,
or a parser for some other XML-like format. The {element_factory} is called
to create new Element instances when given.
close()~
Flushes the builder buffers, and returns the toplevel document
element. Returns an Element instance.
data(data)~
Adds text to the current element. {data} is a string. This should be
either a bytestring, or a Unicode string.
end(tag)~
Closes the current element. {tag} is the element name. Returns the
closed element.
start(tag, attrs)~
Opens a new element. {tag} is the element name. {attrs} is a dictionary
containing element attributes. Returns the opened element.
In addition, a custom TreeBuilder object can provide the
following method:
doctype(name, pubid, system)~
Handles a doctype declaration. {name} is the doctype name. {pubid} is
the public identifier. {system} is the system identifier. This method
does not exist on the default TreeBuilder class.
.. versionadded:: 2.7
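The start, data, end and close calls can also be issued by hand, which is
roughly what a custom parser would do. The sketch below uses invented tag
names and the default element factory.

```python
import xml.etree.ElementTree as ET

builder = ET.TreeBuilder()

# Equivalent to parsing: <list kind="demo"><item>first</item></list>
builder.start("list", {"kind": "demo"})
builder.start("item", {})
builder.data("first")
builder.end("item")
builder.end("list")

root = builder.close()    # the toplevel Element
```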
XMLParser Objects
-----------------
XMLParser(html=0, target=None, encoding=None)~
Element structure builder for XML source data, based on the expat
parser. {html} are predefined HTML entities. This flag is not supported by
the current implementation. {target} is the target object. If omitted, the
builder uses an instance of the standard TreeBuilder class. {encoding} [1]_
is optional. If given, the value overrides the encoding specified in the
XML file.
close()~
Finishes feeding data to the parser. Returns an element structure.
doctype(name, pubid, system)~
.. deprecated:: 2.7
Define the TreeBuilder.doctype method on a custom TreeBuilder
target.
feed(data)~
Feeds data to the parser. {data} is encoded data.
XMLParser.feed calls {target}'s start method
for each opening tag, its end method for each closing tag,
and passes character data to its data method. XMLParser.close
calls {target}'s close method.
XMLParser can be used not only for building a tree structure.
This is an example of counting the maximum depth of an XML file:: >
>>> from xml.etree.ElementTree import XMLParser
>>> class MaxDepth:                     # The target object of the parser
...     maxDepth = 0
...     depth = 0
...     def start(self, tag, attrib):   # Called for each opening tag.
...         self.depth += 1
...         if self.depth > self.maxDepth:
...             self.maxDepth = self.depth
...     def end(self, tag):             # Called for each closing tag.
...         self.depth -= 1
...     def data(self, data):
...         pass            # We do not need to do anything with data.
...     def close(self):    # Called when all data has been parsed.
...         return self.maxDepth
...
>>> target = MaxDepth()
>>> parser = XMLParser(target=target)
>>> exampleXml = """
... <a>
...   <b>
...   </b>
...   <b>
...     <c>
...       <d>
...       </d>
...     </c>
...   </b>
... </a>"""
>>> parser.feed(exampleXml)
>>> parser.close()
4
<
.. rubric:: Footnotes
.. [1] The encoding string included in XML output should conform to the
appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
and http://www.iana.org/assignments/character-sets.
==============================================================================
*py2stdlib-xml.sax.handler*
xml.sax.handler~
:synopsis: Base classes for SAX event handlers.
.. versionadded:: 2.0
The SAX API defines four kinds of handlers: content handlers, DTD handlers,
error handlers, and entity resolvers. Applications normally only need to
implement those interfaces whose events they are interested in; they can
implement the interfaces in a single object or in multiple objects. Handler
implementations should inherit from the base classes provided in the module
xml.sax.handler (|py2stdlib-xml.sax.handler|), so that all methods get default implementations.
ContentHandler~
This is the main callback interface in SAX, and the one most important to
applications. The order of events in this interface mirrors the order of the
information in the document.
DTDHandler~
Handle DTD events.
This interface specifies only those DTD events required for basic parsing
(unparsed entities and attributes).
EntityResolver~
Basic interface for resolving entities. If you create an object implementing
this interface and register it with your parser, the parser will call
the method in your object to resolve all external entities.
ErrorHandler~
Interface used by the parser to present error and warning messages to the
application. The methods of this object control whether errors are immediately
converted to exceptions or are handled in some other way.
In addition to these classes, xml.sax.handler (|py2stdlib-xml.sax.handler|) provides symbolic constants
for the feature and property names.
feature_namespaces~
Value: ``"http://xml.org/sax/features/namespaces"``
true: Perform Namespace processing.
false: Optionally do not perform Namespace processing (implies
  namespace-prefixes; default).
access: (parsing) read-only; (not parsing) read/write
feature_namespace_prefixes~
Value: ``"http://xml.org/sax/features/namespace-prefixes"``
true: Report the original prefixed names and attributes used for Namespace
  declarations.
false: Do not report attributes used for Namespace declarations, and
  optionally do not report original prefixed names (default).
access: (parsing) read-only; (not parsing) read/write
feature_string_interning~
Value: ``"http://xml.org/sax/features/string-interning"``
true: All element names, prefixes, attribute names, Namespace URIs, and local
  names are interned using the built-in intern function.
false: Names are not necessarily interned, although they may be (default).
access: (parsing) read-only; (not parsing) read/write
feature_validation~
Value: ``"http://xml.org/sax/features/validation"``
true: Report all validation errors (implies external-general-entities and
  external-parameter-entities).
false: Do not report validation errors.
access: (parsing) read-only; (not parsing) read/write
feature_external_ges~
Value: ``"http://xml.org/sax/features/external-general-entities"``
true: Include all external general (text) entities.
false: Do not include external general entities.
access: (parsing) read-only; (not parsing) read/write
feature_external_pes~
Value: ``"http://xml.org/sax/features/external-parameter-entities"``
true: Include all external parameter entities, including the external DTD
  subset.
false: Do not include any external parameter entities, even the external DTD
  subset.
access: (parsing) read-only; (not parsing) read/write
all_features~
List of all features.
property_lexical_handler~
Value: ``"http://xml.org/sax/properties/lexical-handler"``
data type: xml.sax.sax2lib.LexicalHandler (not supported in Python 2)
description: An optional extension handler for lexical events like comments.
access: read/write
property_declaration_handler~
Value: ``"http://xml.org/sax/properties/declaration-handler"``
data type: xml.sax.sax2lib.DeclHandler (not supported in Python 2)
description: An optional extension handler for DTD-related events other than
  notations and unparsed entities.
access: read/write
property_dom_node~
Value: ``"http://xml.org/sax/properties/dom-node"``
data type: org.w3c.dom.Node (not supported in Python 2)
description: When parsing, the current DOM node being visited if this is a
  DOM iterator; when not parsing, the root DOM node for iteration.
access: (parsing) read-only; (not parsing) read/write
property_xml_string~
Value: ``"http://xml.org/sax/properties/xml-string"``
data type: String
description: The literal string of characters that was the source for the
  current event.
access: read-only
all_properties~
List of all known property names.
ContentHandler Objects
----------------------
Users are expected to subclass ContentHandler to support their
application. The following methods are called by the parser on the appropriate
events in the input document:
ContentHandler.setDocumentLocator(locator)~
Called by the parser to give the application a locator for locating the origin
of document events.
SAX parsers are strongly encouraged (though not absolutely required) to supply a
locator. A parser that does so must supply the locator to the application by
invoking this method before invoking any of the other methods in the
ContentHandler interface.
The locator allows the application to determine the end position of any
document-related event, even if the parser is not reporting an error. Typically,
the application will use this information for reporting its own errors (such as
character content that does not match an application's business rules). The
information returned by the locator is probably not sufficient for use with a
search engine.
Note that the locator will return correct information only during the invocation
of the events in this interface. The application should not attempt to use it at
any other time.
ContentHandler.startDocument()~
Receive notification of the beginning of a document.
The SAX parser will invoke this method only once, before any other methods in
this interface or in DTDHandler (except for setDocumentLocator).
ContentHandler.endDocument()~
Receive notification of the end of a document.
The SAX parser will invoke this method only once, and it will be the last method
invoked during the parse. The parser shall not invoke this method until it has
either abandoned parsing (because of an unrecoverable error) or reached the end
of input.
ContentHandler.startPrefixMapping(prefix, uri)~
Begin the scope of a prefix-URI Namespace mapping.
The information from this event is not necessary for normal Namespace
processing: the SAX XML reader will automatically replace prefixes for element
and attribute names when the ``feature_namespaces`` feature is enabled.
There are cases, however, when applications need to use prefixes in character
data or in attribute values, where they cannot safely be expanded automatically;
the startPrefixMapping and endPrefixMapping events supply the
information to the application to expand prefixes in those contexts itself, if
necessary.
Note that startPrefixMapping and endPrefixMapping events are not
guaranteed to be properly nested relative to each other: all
startPrefixMapping events will occur before the corresponding
startElement event, and all endPrefixMapping events will occur
after the corresponding endElement event, but their order is not
guaranteed.
ContentHandler.endPrefixMapping(prefix)~
End the scope of a prefix-URI mapping.
See startPrefixMapping for details. This event will always occur after
the corresponding endElement event, but the order of
endPrefixMapping events is not otherwise guaranteed.
ContentHandler.startElement(name, attrs)~
Signals the start of an element in non-namespace mode.
The {name} parameter contains the raw XML 1.0 name of the element type as a
string and the {attrs} parameter holds an object of the Attributes
interface (see attributes-objects) containing the attributes of
the element. The object passed as {attrs} may be re-used by the parser; holding
on to a reference to it is not a reliable way to keep a copy of the attributes.
To keep a copy of the attributes, use the copy method of the {attrs}
object.
ContentHandler.endElement(name)~
Signals the end of an element in non-namespace mode.
The {name} parameter contains the name of the element type, just as with the
startElement event.
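A minimal sketch of a ContentHandler subclass for non-namespace mode
follows. The class name TagCounter and the sample document are invented
for illustration. Attribute values are read out immediately rather than by
keeping a reference to {attrs}, since that object may be re-used.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TagCounter(ContentHandler):
    """Counts start tags and collects href attribute values."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.counts = {}
        self.hrefs = []

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1
        href = attrs.get("href")      # None when the attribute is absent
        if href is not None:
            self.hrefs.append(href)   # keep the value, not the attrs object


handler = TagCounter()
xml.sax.parseString(
    b'<p><a href="http://example.org/">x</a><a>y</a></p>', handler)
```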
ContentHandler.startElementNS(name, qname, attrs)~
Signals the start of an element in namespace mode.
The {name} parameter contains the name of the element type as a ``(uri,
localname)`` tuple, the {qname} parameter contains the raw XML 1.0 name used in
the source document, and the {attrs} parameter holds an instance of the
AttributesNS interface (see attributes-ns-objects)
containing the attributes of the element. If no namespace is associated with
the element, the {uri} component of {name} will be ``None``. The object passed
as {attrs} may be re-used by the parser; holding on to a reference to it is not
a reliable way to keep a copy of the attributes. To keep a copy of the
attributes, use the copy method of the {attrs} object.
Parsers may set the {qname} parameter to ``None``, unless the
``feature_namespace_prefixes`` feature is activated.
ContentHandler.endElementNS(name, qname)~
Signals the end of an element in namespace mode.
The {name} parameter contains the name of the element type, just as with the
startElementNS method, likewise the {qname} parameter.
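In namespace mode the element name arrives as a ``(uri, localname)`` tuple.
The sketch below enables ``feature_namespaces`` on a parser obtained from
xml.sax.make_parser and records those tuples. The handler class, the
namespace URI and the document are all hypothetical, and the input is
supplied through an in-memory io.BytesIO object.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NSNames(ContentHandler):
    """Records the (uri, localname) tuple of every start tag."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElementNS(self, name, qname, attrs):
        self.names.append(name)


parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)   # switch to namespace mode
handler = NSNames()
parser.setContentHandler(handler)

doc = b'<x:doc xmlns:x="http://example.org/ns"><plain/></x:doc>'
parser.parse(io.BytesIO(doc))
```

An element without a namespace, such as ``plain`` here, is reported with
``None`` as the uri component.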
ContentHandler.characters(content)~
Receive notification of character data.
The Parser will call this method to report each chunk of character data. SAX
parsers may return all contiguous character data in a single chunk, or they may
split it into several chunks; however, all of the characters in any single event
must come from the same external entity so that the Locator provides useful
information.
{content} may be a Unicode string or a byte string; the ``expat`` reader module
always produces Unicode strings.
.. note:: >
The earlier SAX 1 interface provided by the Python XML Special Interest Group
used a more Java-like interface for this method. Since most parsers used from
Python did not take advantage of the older interface, the simpler signature was
chosen to replace it. To convert old code to the new interface, use {content}
instead of slicing content with the old {offset} and {length} parameters.
<
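Because a parser may split contiguous character data into several chunks, a
handler usually has to accumulate them. The following is a sketch of one
way to do that; the class name and the ``title`` element are assumptions
made for the example.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TitleCollector(ContentHandler):
    """Joins all character chunks seen inside each title element."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.in_title = False
        self.chunks = []
        self.titles = []

    def startElement(self, name, attrs):
        if name == "title":
            self.in_title = True
            self.chunks = []

    def characters(self, content):
        if self.in_title:
            self.chunks.append(content)   # may be called several times

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self.chunks))
            self.in_title = False


collector = TitleCollector()
xml.sax.parseString(
    b"<doc><title>Tom &amp; Jerry</title><p>ignored</p></doc>", collector)
```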
ContentHandler.ignorableWhitespace(whitespace)~
Receive notification of ignorable whitespace in element content.
Validating Parsers must use this method to report each chunk of ignorable
whitespace (see the W3C XML 1.0 recommendation, section 2.10): non-validating
parsers may also use this method if they are capable of parsing and using
content models.
SAX parsers may return all contiguous whitespace in a single chunk, or they may
split it into several chunks; however, all of the characters in any single event
must come from the same external entity, so that the Locator provides useful
information.
ContentHandler.processingInstruction(target, data)~
Receive notification of a processing instruction.
The Parser will invoke this method once for each processing instruction found:
note that processing instructions may occur before or after the main document
element.
A SAX parser should never report an XML declaration (XML 1.0, section 2.8) or a
text declaration (XML 1.0, section 4.3.1) using this method.
ContentHandler.skippedEntity(name)~
Receive notification of a skipped entity.
The Parser will invoke this method once for each entity skipped. Non-validating
processors may skip entities if they have not seen the declarations (because,
for example, the entity was declared in an external DTD subset). All processors
may skip external entities, depending on the values of the
``feature_external_ges`` and the ``feature_external_pes`` features.
DTDHandler Objects
------------------
DTDHandler instances provide the following methods:
DTDHandler.notationDecl(name, publicId, systemId)~
Handle a notation declaration event.
DTDHandler.unparsedEntityDecl(name, publicId, systemId, ndata)~
Handle an unparsed entity declaration event.
EntityResolver Objects
----------------------
EntityResolver.resolveEntity(publicId, systemId)~
Resolve the system identifier of an entity and return either the system
identifier to read from as a string, or an InputSource to read from. The default
implementation returns {systemId}.
ErrorHandler Objects
--------------------
Objects with this interface are used to receive error and warning information
from the XMLReader. If you create an object that implements this
interface, then register the object with your XMLReader, the parser
will call the methods in your object to report all warnings and errors. There
are three levels of errors available: warnings, (possibly) recoverable errors,
and unrecoverable errors. All methods take a SAXParseException as the
only parameter. Errors and warnings may be converted to an exception by raising
the passed-in exception object.
ErrorHandler.error(exception)~
Called when the parser encounters a recoverable error. If this method does not
raise an exception, parsing may continue, but further document information
should not be expected by the application. Allowing the parser to continue may
allow additional errors to be discovered in the input document.
ErrorHandler.fatalError(exception)~
Called when the parser encounters an error it cannot recover from; parsing is
expected to terminate when this method returns.
ErrorHandler.warning(exception)~
Called when the parser presents minor warning information to the application.
Parsing is expected to continue when this method returns, and document
information will continue to be passed to the application. Raising an exception
in this method will cause parsing to end.
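A sketch of a custom ErrorHandler that records what the parser reports.
The class name Recorder is invented, and the input is deliberately
malformed. In this sketch fatalError re-raises the exception it is given,
which converts the error into an exception seen by the caller.

```python
import xml.sax
from xml.sax.handler import ContentHandler, ErrorHandler


class Recorder(ErrorHandler):
    """Remembers the severity of every reported problem."""

    def __init__(self):
        self.seen = []

    def warning(self, exception):
        self.seen.append("warning")       # parsing continues

    def error(self, exception):
        self.seen.append("error")         # not raised, so parsing may go on

    def fatalError(self, exception):
        self.seen.append("fatal")
        raise exception                   # convert into an exception


recorder = Recorder()
try:
    # The end tag does not match the open <b> element.
    xml.sax.parseString(b"<a>\n<b></a>", ContentHandler(), recorder)
    parsed = True
except xml.sax.SAXParseException:
    parsed = False
```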
==============================================================================
*py2stdlib-xml.sax.xmlreader*
xml.sax.xmlreader~
:synopsis: Interface which SAX-compliant XML parsers must implement.
.. versionadded:: 2.0
SAX parsers implement the XMLReader interface. They are implemented in
a Python module, which must provide a function create_parser. This
function is invoked by xml.sax.make_parser with no arguments to create
a new parser object.
XMLReader()~
Base class which can be inherited by SAX parsers.
IncrementalParser()~
In some cases, it is desirable not to parse an input source all at once, but to
feed chunks of the document as they become available. Note that the reader will
normally not read the entire file, but read it in chunks as well; still, parse
won't return until the entire document is processed. So these interfaces should
be used if the blocking behaviour of parse is not desirable.
When the parser is instantiated it is ready to begin accepting data from the
feed method immediately. After parsing has been finished with a call to close,
the reset method must be called to make the parser ready to accept new data,
either from feed or using the parse method.
Note that these methods must {not} be called during parsing, that is, after
parse has been called and before it returns.
By default, the class also implements the parse method of the XMLReader
interface using the feed, close and reset methods of the IncrementalParser
interface as a convenience to SAX 2.0 driver writers.
Locator()~
Interface for associating a SAX event with a document location. A locator object
will return valid results only during calls to ContentHandler methods; at any
other time, the results are unpredictable. If information is not available,
methods may return ``None``.
InputSource([systemId])~
Encapsulation of the information needed by the XMLReader to read
entities.
This class may include information about the public identifier, system
identifier, byte stream (possibly with character encoding information) and/or
the character stream of an entity.
Applications will create objects of this class for use in the
XMLReader.parse method and for returning from
EntityResolver.resolveEntity.
An InputSource belongs to the application, the XMLReader is
not allowed to modify InputSource objects passed to it from the
application, although it may make copies and modify those.
AttributesImpl(attrs)~
This is an implementation of the Attributes interface (see section
attributes-objects). This is a dictionary-like object which
represents the element attributes in a startElement call. In addition
to the most useful dictionary operations, it supports a number of other
methods as described by the interface. Objects of this class should be
instantiated by readers; {attrs} must be a dictionary-like object containing
a mapping from attribute names to attribute values.
AttributesNSImpl(attrs, qnames)~
Namespace-aware variant of AttributesImpl, which will be passed to
startElementNS. It is derived from AttributesImpl, but
understands attribute names as two-tuples of {namespaceURI} and
{localname}. In addition, it provides a number of methods expecting qualified
names as they appear in the original document. This class implements the
AttributesNS interface (see section attributes-ns-objects).
XMLReader Objects
-----------------
The XMLReader interface supports the following methods:
XMLReader.parse(source)~
Process an input source, producing SAX events. The {source} object can be a
system identifier (a string identifying the input source -- typically a file
name or a URL), a file-like object, or an InputSource object. When
parse returns, the input is completely processed, and the parser object
can be discarded or reset. As a limitation, the current implementation only
accepts byte streams; processing of character streams is for further study.
XMLReader.getContentHandler()~
Return the current ContentHandler.
XMLReader.setContentHandler(handler)~
Set the current ContentHandler. If no ContentHandler is set,
content events will be discarded.
XMLReader.getDTDHandler()~
Return the current DTDHandler.
XMLReader.setDTDHandler(handler)~
Set the current DTDHandler. If no DTDHandler is set, DTD
events will be discarded.
XMLReader.getEntityResolver()~
Return the current EntityResolver.
XMLReader.setEntityResolver(handler)~
Set the current EntityResolver. If no EntityResolver is set,
attempts to resolve an external entity will result in opening the system
identifier for the entity, and fail if it is not available.
XMLReader.getErrorHandler()~
Return the current ErrorHandler.
XMLReader.setErrorHandler(handler)~
Set the current error handler. If no ErrorHandler is set, errors will
be raised as exceptions, and warnings will be printed.
XMLReader.setLocale(locale)~
Allow an application to set the locale for errors and warnings.
SAX parsers are not required to provide localization for errors and warnings; if
they cannot support the requested locale, however, they must raise a SAX
exception. Applications may request a locale change in the middle of a parse.
XMLReader.getFeature(featurename)~
Return the current setting for feature {featurename}. If the feature is not
recognized, SAXNotRecognizedException is raised. The well-known
featurenames are listed in the module xml.sax.handler (|py2stdlib-xml.sax.handler|).
XMLReader.setFeature(featurename, value)~
Set the {featurename} to {value}. If the feature is not recognized,
SAXNotRecognizedException is raised. If the feature or its setting is not
supported by the parser, SAXNotSupportedException is raised.
XMLReader.getProperty(propertyname)~
Return the current setting for property {propertyname}. If the property is not
recognized, a SAXNotRecognizedException is raised. The well-known
propertynames are listed in the module xml.sax.handler (|py2stdlib-xml.sax.handler|).
XMLReader.setProperty(propertyname, value)~
Set the {propertyname} to {value}. If the property is not recognized,
SAXNotRecognizedException is raised. If the property or its setting is
not supported by the parser, SAXNotSupportedException is raised.
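Reading and changing a feature might look like the sketch below. It assumes
the parser returned by xml.sax.make_parser is the bundled expat-based
reader, and the unrecognized feature name is made up to trigger the
exception.

```python
import xml.sax
from xml.sax.handler import feature_namespaces

parser = xml.sax.make_parser()

# Query, change, and query again (allowed because no parse is running).
before = bool(parser.getFeature(feature_namespaces))
parser.setFeature(feature_namespaces, True)
after = bool(parser.getFeature(feature_namespaces))

try:
    # Hypothetical feature name that no reader knows about.
    parser.getFeature("http://example.org/features/no-such-feature")
    recognized = True
except xml.sax.SAXNotRecognizedException:
    recognized = False
```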
IncrementalParser Objects
-------------------------
Instances of IncrementalParser offer the following additional methods:
IncrementalParser.feed(data)~
Process a chunk of {data}.
IncrementalParser.close()~
Assume the end of the document. That will check well-formedness conditions that
can be checked only at the end, invoke handlers, and may clean up resources
allocated during parsing.
IncrementalParser.reset()~
This method is called after close has been called to reset the parser so that it
is ready to parse new documents. The results of calling parse or feed after
close without calling reset are undefined.
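The feed, close and reset cycle can be sketched as follows, assuming the
parser returned by xml.sax.make_parser implements IncrementalParser (the
bundled expat reader does). The handler class and the chunk boundaries are
arbitrary choices for the example.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class Starts(ContentHandler):
    """Records the name of every start tag."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


parser = xml.sax.make_parser()
handler = Starts()
parser.setContentHandler(handler)

# First document, delivered in pieces that split a tag in the middle.
for chunk in (b"<a><b", b"/><c/>", b"</a>"):
    parser.feed(chunk)
parser.close()          # end of document: final well-formedness checks

parser.reset()          # required before the parser is used again
parser.feed(b"<second/>")
parser.close()
```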
Locator Objects
---------------
Instances of Locator provide these methods:
Locator.getColumnNumber()~
Return the column number where the current event ends.
Locator.getLineNumber()~
Return the line number where the current event ends.
Locator.getPublicId()~
Return the public identifier for the current event.
Locator.getSystemId()~
Return the system identifier for the current event.
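A handler normally receives its Locator through the ContentHandler method
setDocumentLocator. The sketch below, which assumes the default expat-based
reader, stores the locator and records the line reported for each start tag.
PositionRecorder is a hypothetical name.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PositionRecorder(ContentHandler):
    """Hypothetical handler that notes the line of each start tag."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.locator = None
        self.positions = []

    def setDocumentLocator(self, locator):
        # Called by the reader before any other event is delivered.
        self.locator = locator

    def startElement(self, name, attrs):
        self.positions.append((name, self.locator.getLineNumber()))


handler = PositionRecorder()
xml.sax.parseString(b"<a>\n<b/>\n</a>", handler)
```

The locator is only meaningful while a handler callback is running, so copy
the values out instead of keeping the locator for later use.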
InputSource Objects
-------------------
InputSource.setPublicId(id)~
Sets the public identifier of this InputSource.
InputSource.getPublicId()~
Returns the public identifier of this InputSource.
InputSource.setSystemId(id)~
Sets the system identifier of this InputSource.
InputSource.getSystemId()~
Returns the system identifier of this InputSource.
InputSource.setEncoding(encoding)~
Sets the character encoding of this InputSource.
The encoding must be a string acceptable for an XML encoding declaration (see
section 4.3.3 of the XML recommendation).
The encoding attribute of the InputSource is ignored if the
InputSource also contains a character stream.
InputSource.getEncoding()~
Get the character encoding of this InputSource.
InputSource.setByteStream(bytefile)~
Set the byte stream (a Python file-like object which does not perform
byte-to-character conversion) for this input source.
The SAX parser will ignore this if there is also a character stream specified,
but it will use a byte stream in preference to opening a URI connection itself.
If the application knows the character encoding of the byte stream, it should
set it with the setEncoding method.
InputSource.getByteStream()~
Get the byte stream for this input source.
The getEncoding method will return the character encoding for this byte stream,
or None if unknown.
InputSource.setCharacterStream(charfile)~
Set the character stream for this input source. (The stream must be a Python 1.6
Unicode-wrapped file-like object that performs conversion to Unicode strings.)
If there is a character stream specified, the SAX parser will ignore any byte
stream and will not attempt to open a URI connection to the system identifier.
InputSource.getCharacterStream()~
Get the character stream for this input source.
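As an illustration of the methods above, the following sketch builds an
InputSource around an in-memory byte stream, declares its encoding, and hands
it to a parser. The TextCollector handler is a hypothetical helper.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class TextCollector(ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


source = InputSource()
source.setByteStream(io.BytesIO(b"<greeting>hello</greeting>"))
source.setEncoding("utf-8")     # the application knows the encoding

handler = TextCollector()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)
parser.parse(source)            # the byte stream is used, no URI is opened

text = "".join(handler.chunks)
```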
The Attributes Interface
------------------------
Attributes objects implement a portion of the mapping protocol,
including the methods copy (|py2stdlib-copy|), get, has_key, items,
keys, and values. The following methods are also provided:
Attributes.getLength()~
Return the number of attributes.
Attributes.getNames()~
Return the names of the attributes.
Attributes.getType(name)~
Returns the type of the attribute {name}, which is normally ``'CDATA'``.
Attributes.getValue(name)~
Return the value of attribute {name}.
.. getValueByQName, getNameByQName, getQNameByName, getQNames available
.. here already, but documented only for derived class.
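The Attributes methods can be tried out from a startElement callback, as in
this minimal sketch. AttrDump is a hypothetical handler; the values are copied
out of the attributes object because a parser is free to reuse it.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class AttrDump(ContentHandler):
    """Hypothetical handler that copies attribute data of each element."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.values = {}
        self.types = {}
        self.count = 0

    def startElement(self, name, attrs):
        self.count = attrs.getLength()
        for attr_name in attrs.getNames():
            self.values[attr_name] = attrs.getValue(attr_name)
            self.types[attr_name] = attrs.getType(attr_name)


handler = AttrDump()
xml.sax.parseString(b'<img src="a.png" alt="A"/>', handler)
```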
The AttributesNS Interface
--------------------------
This interface is a subtype of the Attributes interface (see the section
The Attributes Interface above). All methods supported by that interface are
also available on AttributesNS objects.
The following methods are also available:
AttributesNS.getValueByQName(name)~
Return the value for a qualified name.
AttributesNS.getNameByQName(name)~
Return the ``(namespace, localname)`` pair for a qualified {name}.
AttributesNS.getQNameByName(name)~
Return the qualified name for a ``(namespace, localname)`` pair.
AttributesNS.getQNames()~
Return the qualified names of all attributes.
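AttributesNS objects are delivered to startElementNS once namespace processing
is switched on. The sketch below assumes the default expat-based reader; the
namespace URI "urn:example" and the NSAttrs handler are made up for the
example.

```python
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NSAttrs(ContentHandler):
    """Hypothetical handler that exercises the qualified-name lookups."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.result = {}

    def startElementNS(self, name, qname, attrs):
        self.result["qnames"] = list(attrs.getQNames())
        self.result["name"] = attrs.getNameByQName("x:id")
        self.result["value"] = attrs.getValueByQName("x:id")
        self.result["qname"] = attrs.getQNameByName(("urn:example", "id"))


handler = NSAttrs()
parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)   # deliver *NS events
parser.setContentHandler(handler)
parser.feed('<r xmlns:x="urn:example" x:id="7"/>')
parser.close()
```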
==============================================================================
*py2stdlib-xml.sax*
xml.sax~
:synopsis: Package containing SAX2 base classes and convenience functions.
.. versionadded:: 2.0
The xml.sax (|py2stdlib-xml.sax|) package provides a number of modules which implement the
Simple API for XML (SAX) interface for Python. The package itself provides the
SAX exceptions and the convenience functions which will be most used by users of
the SAX API.
The convenience functions are:
make_parser([parser_list])~
Create and return a SAX XMLReader object. The first parser found will
be used. If {parser_list} is provided, it must be a sequence of strings which
name modules that have a function named create_parser. Modules listed
in {parser_list} will be used before modules in the default list of parsers.
parse(filename_or_stream, handler[, error_handler])~
Create a SAX parser and use it to parse a document. The document, passed in as
{filename_or_stream}, can be a filename or a file object. The {handler}
parameter needs to be a SAX ContentHandler instance. If
{error_handler} is given, it must be a SAX ErrorHandler instance; if
omitted, SAXParseException will be raised on all errors. There is no
return value; all work must be done by the {handler} passed in.
parseString(string, handler[, error_handler])~
Similar to parse, but parses from a buffer {string} received as a
parameter.
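A minimal sketch of the two convenience functions, using a hypothetical
TitleCollector handler: the same document is parsed once from a string with
parseString and once from a file-like object with parse.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class TitleCollector(ContentHandler):
    """Hypothetical handler that collects the text of <title> elements."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.titles = []
        self._buffer = None

    def startElement(self, name, attrs):
        if name == "title":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several pieces, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._buffer = None


document = b"<books><title>SAX</title><title>DOM</title></books>"

from_string = TitleCollector()
xml.sax.parseString(document, from_string)

from_stream = TitleCollector()
xml.sax.parse(io.BytesIO(document), from_stream)
```

Neither function returns anything, so results have to be read back from the
handler afterwards.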
A typical SAX application uses three kinds of objects: readers, handlers and
input sources. "Reader" in this context is another term for parser, i.e. some
piece of code that reads the bytes or characters from the input source, and
produces a sequence of events. The events then get distributed to the handler
objects, i.e. the reader invokes a method on the handler. A SAX application
must therefore obtain a reader object, create or open the input sources, create
the handlers, and connect these objects all together. As the final step of
preparation, the reader is called to parse the input. During parsing, methods on
the handler objects are called based on structural and syntactic events from the
input data.
For these objects, only the interfaces are relevant; they are normally not
instantiated by the application itself. Since Python does not have an explicit
notion of interface, they are formally introduced as classes, but applications
may use implementations which do not inherit from the provided classes. The
InputSource, Locator, Attributes,
AttributesNS, and XMLReader interfaces are defined in the
module xml.sax.xmlreader (|py2stdlib-xml.sax.xmlreader|). The handler interfaces are defined in
xml.sax.handler (|py2stdlib-xml.sax.handler|). For convenience, InputSource (which is often
instantiated directly) and the handler classes are also available from
xml.sax (|py2stdlib-xml.sax|). These interfaces are described below.
In addition to these classes, xml.sax (|py2stdlib-xml.sax|) provides the following exception
classes.
SAXException(msg[, exception])~
Encapsulate an XML error or warning. This class can contain basic error or
warning information from either the XML parser or the application: it can be
subclassed to provide additional functionality or to add localization. Note
that although the handlers defined in the ErrorHandler interface
receive instances of this exception, it is not required to actually raise the
exception --- it is also useful as a container for information.
When instantiated, {msg} should be a human-readable description of the error.
The optional {exception} parameter, if given, should be ``None`` or an exception
that was caught by the parsing code and is being passed along as information.
This is the base class for the other SAX exception classes.
SAXParseException(msg, exception, locator)~
Subclass of SAXException raised on parse errors. Instances of this class
are passed to the methods of the SAX ErrorHandler interface to provide
information about the parse error. This class supports the SAX Locator
interface as well as the SAXException interface.
SAXNotRecognizedException(msg[, exception])~
Subclass of SAXException raised when a SAX XMLReader is
confronted with an unrecognized feature or property. SAX applications and
extensions may use this class for similar purposes.
SAXNotSupportedException(msg[, exception])~
Subclass of SAXException raised when a SAX XMLReader is asked to
enable a feature that is not supported, or to set a property to a value that the
implementation does not support. SAX applications and extensions may use this
class for similar purposes.
.. seealso::
`SAX: The Simple API for XML <http://www.saxproject.org/>`_
This site is the focal point for the definition of the SAX API. It provides a
Java implementation and online documentation. Links to implementations and
historical information are also available.
Module xml.sax.handler (|py2stdlib-xml.sax.handler|)
Definitions of the interfaces for application-provided objects.
Module xml.sax.saxutils (|py2stdlib-xml.sax.saxutils|)
Convenience functions for use in SAX applications.
Module xml.sax.xmlreader (|py2stdlib-xml.sax.xmlreader|)
Definitions of the interfaces for parser-provided objects.
SAXException Objects
--------------------
The SAXException exception class supports the following methods:
SAXException.getMessage()~
Return a human-readable message describing the error condition.
SAXException.getException()~
Return an encapsulated exception object, or ``None``.
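The sketch below shows both sides of the exception classes: catching the
SAXParseException raised for a malformed document, and using SAXException as a
plain container for an application error. The error texts and the ValueError
are illustrative only.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# 1. A parse error raised by the default error handling.
try:
    xml.sax.parseString(b"<a><b></a>", ContentHandler())
    parse_error = None
except xml.sax.SAXParseException as exc:
    parse_error = exc

message = parse_error.getMessage()      # human-readable description
line = parse_error.getLineNumber()      # Locator interface of the exception

# 2. Wrapping an application-level exception for later inspection.
cause = ValueError("bad value")
wrapped = xml.sax.SAXException("could not load settings", cause)
```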
==============================================================================
*py2stdlib-xml.sax.saxutils*
xml.sax.saxutils~
:synopsis: Convenience functions and classes for use with SAX.
.. versionadded:: 2.0
The module xml.sax.saxutils (|py2stdlib-xml.sax.saxutils|) contains a number of classes and functions
that are commonly useful when creating SAX applications, either in direct use,
or as base classes.
escape(data[, entities])~
Escape ``'&'``, ``'<'``, and ``'>'`` in a string of data.
You can escape other strings of data by passing a dictionary as the optional
{entities} parameter. The keys and values must all be strings; each key will be
replaced with its corresponding value. The characters ``'&'``, ``'<'`` and
``'>'`` are always escaped, even if {entities} is provided.
unescape(data[, entities])~
Unescape ``'&amp;'``, ``'&lt;'``, and ``'&gt;'`` in a string of data.
You can unescape other strings of data by passing a dictionary as the optional
{entities} parameter. The keys and values must all be strings; each key will be
replaced with its corresponding value. ``'&amp;'``, ``'&lt;'``, and ``'&gt;'``
are always unescaped, even if {entities} is provided.
.. versionadded:: 2.3
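A short round trip through escape and unescape, using the optional {entities}
dictionary to handle double quotes as well. The sample string is arbitrary.

```python
from xml.sax.saxutils import escape, unescape

original = 'a < b & "c"'

# '&', '<' and '>' are always handled; the dictionary adds '"'.
escaped = escape(original, {'"': "&quot;"})

# For unescape the mapping runs the other way: entity text to character.
restored = unescape(escaped, {"&quot;": '"'})
```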
quoteattr(data[, entities])~
Similar to escape, but also prepares {data} to be used as an
attribute value. The return value is a quoted version of {data} with any
additional required replacements. quoteattr will select a quote
character based on the content of {data}, attempting to avoid encoding any
quote characters in the string. If both single- and double-quote characters
are already in {data}, the double-quote characters will be encoded and {data}
will be wrapped in double-quotes. The resulting string can be used directly
as an attribute value:: >
>>> print "<element attr=%s>" % quoteattr("ab ' cd \" ef")
<element attr="ab ' cd &quot; ef">
<
This function is useful when generating attribute values for HTML or any SGML
using the reference concrete syntax.
.. versionadded:: 2.2
XMLGenerator([out[, encoding]])~
This class implements the ContentHandler interface by writing SAX
events back into an XML document. In other words, using an XMLGenerator
as the content handler will reproduce the original document being parsed. {out}
should be a file-like object which will default to {sys.stdout}. {encoding} is
the encoding of the output stream which defaults to ``'iso-8859-1'``.
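As a small illustration, the sketch below parses a document and lets an
XMLGenerator write the events back out into an in-memory byte buffer. The use
of io.BytesIO and of 'utf-8' instead of the default encoding are choices made
for this example.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator

out = io.BytesIO()
generator = XMLGenerator(out, "utf-8")

# Every SAX event produced by the parser is serialized again.
xml.sax.parseString(b'<doc><p id="1">hi &amp; bye</p></doc>', generator)

result = out.getvalue()
```

The output starts with an XML declaration naming the chosen encoding, followed
by the regenerated document.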
XMLFilterBase(base)~
This class is designed to sit between an XMLReader and the client
application's event handlers. By default, it does nothing but pass requests up
to the reader and events on to the handlers unmodified, but subclasses can
override specific methods to modify the event stream or the configuration
requests as they pass through.
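A minimal sketch of a filter, assuming a hypothetical UpperCaseFilter that
rewrites element names on their way from the reader to the application's
handler. Everything that is not overridden is passed through unchanged.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase


class UpperCaseFilter(XMLFilterBase):
    """Hypothetical filter that upper-cases element names."""

    def startElement(self, name, attrs):
        XMLFilterBase.startElement(self, name.upper(), attrs)

    def endElement(self, name):
        XMLFilterBase.endElement(self, name.upper())


class NameRecorder(ContentHandler):
    """Hypothetical downstream handler."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


recorder = NameRecorder()
reader = UpperCaseFilter(xml.sax.make_parser())   # the filter wraps the reader
reader.setContentHandler(recorder)
reader.parse(io.BytesIO(b"<a><b/></a>"))
```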
prepare_input_source(source[, base])~
This function takes an input source and an optional base URL and returns a fully
resolved InputSource object ready for reading. The input source can be
given as a string, a file-like object, or an InputSource object;
parsers will use this function to implement the polymorphic {source} argument to
their parse method.
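The following sketch shows two of the accepted argument forms: a file-like
object, which is wrapped in a new InputSource, and an InputSource that already
carries a stream, which is handed back as it is.

```python
import io
from xml.sax.saxutils import prepare_input_source
from xml.sax.xmlreader import InputSource

# A file-like object is wrapped in a fresh InputSource.
wrapped = prepare_input_source(io.BytesIO(b"<doc/>"))
data = wrapped.getByteStream().read()

# An InputSource with a stream attached is returned unchanged.
existing = InputSource()
existing.setByteStream(io.BytesIO(b"<doc/>"))
same = prepare_input_source(existing)
```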
==============================================================================
*py2stdlib-xmllib*
xmllib~
:synopsis: A parser for XML documents.
:deprecated:
Deprecated since version 2.0: Use xml.sax (|py2stdlib-xml.sax|) instead. The
newer XML package includes full support for XML 1.0.
.. versionchanged:: 1.5.2
Added namespace support.
This module defines a class XMLParser which serves as the basis for
parsing text files formatted in XML (Extensible Markup Language).
XMLParser()~
The XMLParser class must be instantiated without arguments. [#]_
This class provides the following interface methods and instance variables:
attributes~
A mapping of element names to mappings. The latter mapping maps attribute
names that are valid for the element to the default value of the
attribute, or if there is no default to ``None``. The default value is
the empty dictionary. This variable is meant to be overridden, not
extended since the default is shared by all instances of
XMLParser.
elements~
A mapping of element names to tuples. The tuples contain a function for
handling the start and end tag respectively of the element, or ``None`` if
the method unknown_starttag or unknown_endtag is to be
called. The default value is the empty dictionary. This variable is
meant to be overridden, not extended since the default is shared by all
instances of XMLParser.
entitydefs~
A mapping of entitynames to their values. The default value contains
definitions for ``'lt'``, ``'gt'``, ``'amp'``, ``'quot'``, and ``'apos'``.
reset()~
Reset the instance. Loses all unprocessed data. This is called
implicitly at the instantiation time.
setnomoretags()~
Stop processing tags. Treat all following input as literal input (CDATA).
setliteral()~
Enter literal mode (CDATA mode). This mode is automatically exited when
the close tag matching the last unclosed open tag is encountered.
feed(data)~
Feed some text to the parser. It is processed insofar as it consists of
complete tags; incomplete data is buffered until more data is fed or
close is called.
close()~
Force processing of all buffered data as if it were followed by an
end-of-file mark. This method may be redefined by a derived class to
define additional processing at the end of the input, but the redefined
version should always call the base class close.
translate_references(data)~
Translate all entity and character references in {data} and return the
translated string.
getnamespace()~
Return a mapping of namespace abbreviations to namespace URIs that are
currently in effect.
handle_xml(encoding, standalone)~
This method is called when the ``<?xml ...?>`` tag is processed. The
arguments are the values of the encoding and standalone attributes in the
tag. Both encoding and standalone are optional. The values passed to
handle_xml default to ``None`` and the string ``'no'``
respectively.
handle_doctype(tag, pubid, syslit, data)~
This method is called when the ``<!DOCTYPE...>`` declaration is processed.
The arguments are the tag name of the root element, the Formal Public
Identifier (or ``None`` if not specified), the system identifier, and the
uninterpreted contents of the internal DTD subset as a string (or ``None``
if not present).
handle_starttag(tag, method, attributes)~
This method is called to handle start tags for which a start tag handler
is defined in the instance variable elements. The {tag} argument
is the name of the tag, and the {method} argument is the function (method)
which should be used to support semantic interpretation of the start tag.
The {attributes} argument is a dictionary of attributes, the key being the
{name} and the value being the {value} of the attribute found inside the
tag's ``<>`` brackets. Character and entity references in the {value}
have been interpreted. For instance, for the start tag ``<A
HREF="http://www.cwi.nl/">``, this method would be called as
``handle_starttag('A', self.elements['A'][0], {'HREF':
'http://www.cwi.nl/'})``. The base implementation simply calls {method}
with {attributes} as the only argument.
handle_endtag(tag, method)~
This method is called to handle endtags for which an end tag handler is
defined in the instance variable elements. The {tag} argument is
the name of the tag, and the {method} argument is the function (method)
which should be used to support semantic interpretation of the end tag.
For instance, for the endtag ``</A>``, this method would be called as
``handle_endtag('A', self.elements['A'][1])``. The base implementation
simply calls {method}.
handle_data(data)~
This method is called to process arbitrary data. It is intended to be
overridden by a derived class; the base class implementation does nothing.
handle_charref(ref)~
This method is called to process a character reference of the form
``&#ref;``. {ref} can either be a decimal number, or a hexadecimal number
when preceded by an ``'x'``. In the base implementation, {ref} must be a
number in the range 0-255. It translates the character to ASCII and calls
the method handle_data with the character as argument. If {ref}
is invalid or out of range, the method ``unknown_charref(ref)`` is called
to handle the error. A subclass must override this method to provide
support for character references outside of the ASCII range.
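The decoding rule described for handle_charref can be sketched as a standalone
function. This is not the xmllib source, only an illustration of the stated
behaviour; decode_charref is a hypothetical name, and returning None stands in
for the call to unknown_charref.

```python
def decode_charref(ref):
    """Return the character for ``ref``, or None if it is invalid
    or outside the range 0-255."""
    try:
        if ref[:1] == "x":
            number = int(ref[1:], 16)   # hexadecimal form, e.g. 'x41'
        else:
            number = int(ref)           # decimal form, e.g. '65'
    except ValueError:
        return None
    if not 0 <= number <= 255:
        return None
    return chr(number)
```

For example, both decode_charref("65") and decode_charref("x41") give the
letter A, while decode_charref("x100") is out of range.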
handle_comment(comment)~
This method is called when a comment is encountered. The {comment}
argument is a string containing the text between the ``<!--`` and ``-->``
delimiters, but not the delimiters themselves. For example, the comment
``<!--text-->`` will cause this method to be called with the argument
``'text'``. The default method does nothing.
handle_cdata(data)~
This method is called when a CDATA element is encountered. The {data}
argument is a string containing the text between the ``<![CDATA[`` and
``]]>`` delimiters, but not the delimiters themselves. For example, the
entity ``<![CDATA[text]]>`` will cause this method to be called with the
argument ``'text'``. The default method does nothing, and is intended to
be overridden.
handle_proc(name, data)~
This method is called when a processing instruction (PI) is encountered.
The {name} is the PI target, and the {data} argument is a string
containing the text between the PI target and the closing delimiter, but
not the delimiter itself. For example, the instruction ``<?XML text?>``
will cause this method to be called with the arguments ``'XML'`` and
``'text'``. The default method does nothing. Note that if a document
starts with ``<?xml ..?>``, handle_xml is called to handle it.
handle_special(data)~
This method is called when a declaration is encountered. The {data}
argument is a string containing the text between the ``<!`` and ``>``
delimiters, but not the delimiters themselves. For example, the entity
declaration ``<!ENTITY text>`` will cause this method to be called with
the argument ``'ENTITY text'``. The default method does nothing. Note
that ``<!DOCTYPE ...>`` is handled separately if it is located at the
start of the document.
syntax_error(message)~
This method is called when a syntax error is encountered. The {message}
is a description of what was wrong. The default method raises a
RuntimeError exception. If this method is overridden, it is
permissible for it to return. This method is only called when the error
can be recovered from. Unrecoverable errors raise a RuntimeError
without first calling syntax_error.
unknown_starttag(tag, attributes)~
This method is called to process an unknown start tag. It is intended to
be overridden by a derived class; the base class implementation does nothing.
unknown_endtag(tag)~
This method is called to process an unknown end tag. It is intended to be
overridden by a derived class; the base class implementation does nothing.
unknown_charref(ref)~
This method is called to process unresolvable numeric character
references. It is intended to be overridden by a derived class; the base
class implementation does nothing.
unknown_entityref(ref)~
This method is called to process an unknown entity reference. It is
intended to be overridden by a derived class; the base class
implementation calls syntax_error to signal an error.
.. seealso::
`Extensible Markup Language (XML) 1.0 <http://www.w3.org/TR/REC-xml>`_
The XML specification, published by the World Wide Web Consortium (W3C), defines
the syntax and processor requirements for XML. References to additional
material on XML, including translations of the specification, are available at
http://www.w3.org/XML/.
`Python and XML Processing <http://www.python.org/topics/xml/>`_
The Python XML Topic Guide provides a great deal of information on using XML
from Python and links to other sources of information on XML.
`SIG for XML Processing in Python <http://www.python.org/sigs/xml-sig/>`_
The Python XML Special Interest Group is developing substantial support for
processing XML from Python.
XML Namespaces
--------------
This module has support for XML namespaces as defined in the XML Namespaces
proposed recommendation.
Tag and attribute names that are defined in an XML namespace are handled as if
the name of the tag or attribute consisted of the namespace (the URL that defines
the namespace) followed by a space and the name of the tag or attribute. For
instance, the tag ``<html xmlns='http://www.w3.org/TR/REC-html40'>`` is treated
as if the tag name were ``'http://www.w3.org/TR/REC-html40 html'``, and the tag
``<html:a href='http://frob.com'>`` inside the above mentioned element is
treated as if the tag name were ``'http://www.w3.org/TR/REC-html40 a'`` and the
attribute name as if it were ``'http://www.w3.org/TR/REC-html40 href'``.
An older draft of the XML Namespaces proposal is also recognized, but triggers a
warning.
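Handlers that receive such combined names often need the two parts separately.
A minimal sketch, assuming a hypothetical helper called split_expanded_name
that is not part of the module:

```python
def split_expanded_name(name):
    """Split 'namespace localname' into (namespace, localname).

    Names without a namespace come back as (None, name).
    """
    if " " in name:
        namespace, local = name.split(" ", 1)
        return namespace, local
    return None, name


pair = split_expanded_name("http://www.w3.org/TR/REC-html40 a")
plain = split_expanded_name("title")
```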
.. seealso::
`Namespaces in XML <http://www.w3.org/TR/REC-xml-names/>`_
This World Wide Web Consortium recommendation describes the proper syntax and
processing requirements for namespaces in XML.
.. rubric:: Footnotes
.. [#] Actually, a number of keyword arguments are recognized which influence the
parser to accept certain non-standard constructs. The following keyword
arguments are currently recognized. The default for all of these is ``0``
(false) except for the last one, for which the default is ``1`` (true).
{accept_unquoted_attributes} (accept certain attribute values without requiring
quotes), {accept_missing_endtag_name} (accept end tags that look like ``</>``),
{map_case} (map upper case to lower case in tags and attributes), {accept_utf8}
(allow UTF-8 characters in input; this is required according to the XML
standard, but Python does not as yet deal properly with these characters, so
this is not the default), {translate_attribute_references} (don't attempt to
translate character and entity references in attribute values).
==============================================================================
*py2stdlib-xmlrpclib*
xmlrpclib~
:synopsis: XML-RPC client access.
.. note::
The xmlrpclib (|py2stdlib-xmlrpclib|) module has been renamed to xmlrpc.client in
Python 3.0. The 2to3 tool will automatically adapt imports when
converting your sources to 3.0.
.. XXX Not everything is documented yet. It might be good to describe
Marshaller, Unmarshaller, getparser, dumps, loads, and Transport.
.. versionadded:: 2.2
XML-RPC is a Remote Procedure Call method that uses XML passed via HTTP as a
transport. With it, a client can call methods with parameters on a remote
server (the server is named by a URI) and get back structured data. This module
supports writing XML-RPC client code; it handles all the details of translating
between conformable Python objects and XML on the wire.
ServerProxy(uri[, transport[, encoding[, verbose[, allow_none[, use_datetime]]]]])~
A ServerProxy instance is an object that manages communication with a
remote XML-RPC server. The required first argument is a URI (Uniform Resource
Identifier), and will normally be the URL of the server. The optional second
argument is a transport factory instance; by default it is an internal
SafeTransport instance for https: URLs and an internal HTTP
Transport instance otherwise. The optional third argument is an
encoding, by default UTF-8. The optional fourth argument is a debugging flag.
If {allow_none} is true, the Python constant ``None`` will be translated into
XML; the default behaviour is for ``None`` to raise a TypeError. This is
a commonly-used extension to the XML-RPC specification, but isn't supported by
all clients and servers; see http://ontosys.com/xml-rpc/extensions.php for a
description. The {use_datetime} flag can be used to cause date/time values to
be presented as datetime.datetime objects; this is false by default.
datetime.datetime objects may be passed to calls.
Both the HTTP and HTTPS transports support the URL syntax extension for HTTP
Basic Authentication: ``http://user:pass@host:port/path``. The ``user:pass``
portion will be base64-encoded as an HTTP 'Authorization' header, and sent to
the remote server as part of the connection process when invoking an XML-RPC
method. You only need to use this if the remote server requires a Basic
Authentication user and password.
The returned instance is a proxy object with methods that can be used to invoke
corresponding RPC calls on the remote server. If the remote server supports the
introspection API, the proxy can also be used to query the remote server for the
methods it supports (service discovery) and fetch other server-associated
metadata.
ServerProxy instance methods take Python basic types and objects as
arguments and return Python basic types and classes. Types that are conformable
(i.e. that can be marshalled through XML) include the following (and except
where noted, they are unmarshalled as the same Python type):
+---------------------------------+---------------------------------------------+
| Name | Meaning |
+=================================+=============================================+
| boolean | The True and False |
| | constants |
+---------------------------------+---------------------------------------------+
| integers | Pass in directly |
+---------------------------------+---------------------------------------------+
| floating-point numbers | Pass in directly |
+---------------------------------+---------------------------------------------+
| strings | Pass in directly |
+---------------------------------+---------------------------------------------+
| arrays | Any Python sequence type containing |
| | conformable elements. Arrays are returned |
| | as lists |
+---------------------------------+---------------------------------------------+
| structures | A Python dictionary. Keys must be strings, |
| | values may be any conformable type. Objects |
| | of user-defined classes can be passed in; |
| | only their {__dict__} attribute is |
| | transmitted. |
+---------------------------------+---------------------------------------------+
| dates | in seconds since the epoch (pass in an |
| | instance of the DateTime class) or |
| | a datetime.datetime instance. |
+---------------------------------+---------------------------------------------+
| binary data | pass in an instance of the Binary |
| | wrapper class |
+---------------------------------+---------------------------------------------+
This is the full set of data types supported by XML-RPC. Method calls may also
raise a special Fault instance, used to signal XML-RPC server errors, or
ProtocolError used to signal an error in the HTTP/HTTPS transport layer.
Both Fault and ProtocolError derive from a base class called
Error. Note that even though starting with Python 2.2 you can subclass
built-in types, the xmlrpclib module currently does not marshal instances of such
subclasses.
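The type mapping in the table above can be tried without a server by using the
module-level dumps and loads functions, which are not described in this
section. The method name "demo.method" is made up, and the import fallback is
only there so the sketch also runs under the Python 3 module name.

```python
try:
    import xmlrpclib                        # Python 2
except ImportError:
    import xmlrpc.client as xmlrpclib       # renamed in Python 3

params = (7, 2.5, "text", True, (1, 2), {"key": "value"})

# Marshal a call to XML and read it back again.
request = xmlrpclib.dumps(params, "demo.method")   # hypothetical method name
decoded, method = xmlrpclib.loads(request)

# None is rejected unless allow_none is requested.
try:
    xmlrpclib.dumps((None,))
    none_accepted = True
except TypeError:
    none_accepted = False

with_none = xmlrpclib.loads(xmlrpclib.dumps((None,), allow_none=True))[0]
```

Note that the tuple (1, 2) comes back as the list [1, 2], since XML-RPC arrays
are always unmarshalled as lists.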
When passing strings, characters special to XML such as ``<``, ``>``, and ``&``
will be automatically escaped. However, it's the caller's responsibility to
ensure that the string is free of characters that aren't allowed in XML, such as
the control characters with ASCII values between 0 and 31 (except, of course,
tab, newline and carriage return); failing to do this will result in an XML-RPC
request that isn't well-formed XML. If you have to pass arbitrary strings via
XML-RPC, use the Binary wrapper class described below.
Server is retained as an alias for ServerProxy for backwards
compatibility. New code should use ServerProxy.
.. versionchanged:: 2.5
The {use_datetime} flag was added.
.. versionchanged:: 2.6
Instances of new-style classes can be passed in if they have an
{__dict__} attribute and don't have a base class that is marshalled in a
special way.
.. seealso::
`XML-RPC HOWTO <http://www.tldp.org/HOWTO/XML-RPC-HOWTO/index.html>`_
A good description of XML-RPC operation and client software in several languages.
Contains pretty much everything an XML-RPC client developer needs to know.
`XML-RPC Introspection <http://xmlrpc-c.sourceforge.net/introspection.html>`_
Describes the XML-RPC protocol extension for introspection.
`XML-RPC Specification <http://www.xmlrpc.com/spec>`_
The official specification.
`Unofficial XML-RPC Errata <http://effbot.org/zone/xmlrpc-errata.htm>`_
Fredrik Lundh's "unofficial errata, intended to clarify certain
details in the XML-RPC specification, as well as hint at
'best practices' to use when designing your own XML-RPC
implementations."
ServerProxy Objects
-------------------
A ServerProxy instance has a method corresponding to each remote
procedure call accepted by the XML-RPC server. Calling the method performs an
RPC, dispatched by both name and argument signature (i.e. the same method name
can be overloaded with multiple argument signatures). The RPC finishes by
returning a value, which may be either returned data in a conformant type or a
Fault or ProtocolError object indicating an error.
Servers that support the XML introspection API support some common methods
grouped under the reserved system member:
ServerProxy.system.listMethods()~
This method returns a list of strings, one for each (non-system) method
supported by the XML-RPC server.
ServerProxy.system.methodSignature(name)~
This method takes one parameter, the name of a method implemented by the XML-RPC
server. It returns an array of possible signatures for this method. A signature
is an array of types. The first of these types is the return type of the method,
the rest are parameters.
Because multiple signatures (i.e. overloading) are permitted, this method returns
a list of signatures rather than a singleton.
Signatures themselves are restricted to the top level parameters expected by a
method. For instance if a method expects one array of structs as a parameter,
and it returns a string, its signature is simply "string, array". If it expects
three integers and returns a string, its signature is "string, int, int, int".
If no signature is defined for the method, a non-array value is returned. In
Python this means that the type of the returned value will be something other
than list.
ServerProxy.system.methodHelp(name)~
This method takes one parameter, the name of a method implemented by the XML-RPC
server. It returns a documentation string describing the use of that method. If
no such string is available, an empty string is returned. The documentation
string may contain HTML markup.
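The introspection calls can be tried against a throwaway local server, as in
the sketch below. It assumes that loopback connections are allowed, uses the
register_introspection_functions method of SimpleXMLRPCServer to enable the
system methods, and lets the operating system choose the port. The import
fallback only exists so the sketch also runs under the Python 3 module names.

```python
import threading

try:
    import xmlrpclib                                    # Python 2
    from SimpleXMLRPCServer import SimpleXMLRPCServer
except ImportError:
    import xmlrpc.client as xmlrpclib                   # Python 3 names
    from xmlrpc.server import SimpleXMLRPCServer


def is_even(n):
    """Return True if n is even."""
    return n % 2 == 0


# Port 0 asks the operating system for any free port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(is_even, "is_even")
server.register_introspection_functions()

worker = threading.Thread(target=server.serve_forever)
worker.daemon = True
worker.start()

host, port = server.server_address[:2]
proxy = xmlrpclib.ServerProxy("http://%s:%d/" % (host, port))
try:
    methods = proxy.system.listMethods()
    help_text = proxy.system.methodHelp("is_even")
    signature = proxy.system.methodSignature("is_even")
finally:
    server.shutdown()
    server.server_close()
```

This server does not define signatures, so methodSignature returns a
non-list value here, as described above.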
Boolean Objects
---------------
This class may be initialized from any Python value; the instance returned
depends only on its truth value. It supports various Python operators through
__cmp__, __repr__, __int__, and __nonzero__
methods, all implemented in the obvious ways.
It also has the following method, supported mainly for internal use by the
unmarshalling code:
Boolean.encode(out)~
Write the XML-RPC encoding of this Boolean item to the {out} stream object.
A working example follows. The server code:: >
    import xmlrpclib
    from SimpleXMLRPCServer import SimpleXMLRPCServer

    def is_even(n):
        return n % 2 == 0

    server = SimpleXMLRPCServer(("localhost", 8000))
    print "Listening on port 8000..."
    server.register_function(is_even, "is_even")
    server.serve_forever()
<
The client code for the preceding server:: >
    import xmlrpclib

    proxy = xmlrpclib.ServerProxy("http://localhost:8000/")
    print "3 is even: %s" % str(proxy.is_even(3))
    print "100 is even: %s" % str(proxy.is_even(100))
<
DateTime Objects
----------------
This class may be initialized with seconds since the epoch, a time
tuple, an ISO 8601 time/date string, or a datetime.datetime
instance. It has the following methods, supported mainly for internal
use by the marshalling/unmarshalling code:
DateTime.decode(string)~
Accept a string as the instance's new time value.
DateTime.encode(out)~
Write the XML-RPC encoding of this DateTime item to the {out} stream
object.
It also supports certain of Python's built-in operators through __cmp__
and __repr__ methods.
A working example follows. The server code:: >
    import datetime
    from SimpleXMLRPCServer import SimpleXMLRPCServer
    import xmlrpclib

    def today():
        today = datetime.datetime.today()
        return xmlrpclib.DateTime(today)

    server = SimpleXMLRPCServer(("localhost", 8000))
    print "Listening on port 8000..."
    server.register_function(today, "today")
    server.serve_forever()
<
The client code for the preceding server:: >
    import xmlrpclib
    import datetime

    proxy = xmlrpclib.ServerProxy("http://localhost:8000/")
    today = proxy.today()
    # convert the ISO8601 string to a datetime object
    converted = datetime.datetime.strptime(today.value, "%Y%m%dT%H:%M:%S")
    print "Today: %s" % converted.strftime("%d.%m.%Y, %H:%M")
<
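The constructor forms described for DateTime can also be tried without a
server. The following is a minimal sketch, not part of the original
documentation: the sample timestamp is arbitrary, and the fallback import of
xmlrpc.client is an assumption added only so that the snippet also runs on
Python 3.

```python
import datetime

try:
    import xmlrpclib                      # Python 2 name
except ImportError:
    import xmlrpc.client as xmlrpclib     # assumption: Python 3 equivalent

# Build a DateTime from a datetime.datetime instance (arbitrary sample value).
stamp = xmlrpclib.DateTime(datetime.datetime(2010, 9, 1, 12, 30, 0))

# The ISO 8601 text sent on the wire is kept in the .value attribute.
print(stamp.value)

# An ISO 8601 string is accepted as well and is stored unchanged.
same = xmlrpclib.DateTime("20100901T12:30:00")
print(same.value == stamp.value)
```

Reading .value, as the client above does, is the simplest way to get back to
a datetime.datetime object with strptime.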
Binary Objects
--------------
This class may be initialized from string data (which may include NULs). The
primary access to the content of a Binary object is provided by an
attribute:
Binary.data~
The binary data encapsulated by the Binary instance. The data is
provided as an 8-bit string.
Binary objects have the following methods, supported mainly for
internal use by the marshalling/unmarshalling code:
Binary.decode(string)~
Accept a base64 string and decode it as the instance's new data.
Binary.encode(out)~
Write the XML-RPC base 64 encoding of this binary item to the {out} stream
object. The encoded data will have newlines every 76 characters as per
RFC 2045 section 6.8 (http://tools.ietf.org/html/rfc2045#section-6.8),
which was the de facto standard base64 specification when the
XML-RPC spec was written.
It also supports certain of Python's built-in operators through a
__cmp__ method.
Example usage of the binary objects. We're going to transfer an image over
XMLRPC:: >
    from SimpleXMLRPCServer import SimpleXMLRPCServer
    import xmlrpclib

    def python_logo():
        with open("python_logo.jpg", "rb") as handle:
            return xmlrpclib.Binary(handle.read())

    server = SimpleXMLRPCServer(("localhost", 8000))
    print "Listening on port 8000..."
    server.register_function(python_logo, 'python_logo')
    server.serve_forever()
<
The client gets the image and saves it to a file:: >
    import xmlrpclib

    proxy = xmlrpclib.ServerProxy("http://localhost:8000/")
    with open("fetched_python_logo.jpg", "wb") as handle:
        handle.write(proxy.python_logo().data)
<
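The base64 round trip of a Binary object can be observed without a network
connection by passing it through dumps and loads. This is only a sketch: the
payload bytes are arbitrary, and the fallback import of xmlrpc.client is an
assumption so that the snippet also runs on Python 3.

```python
try:
    import xmlrpclib                      # Python 2 name
except ImportError:
    import xmlrpc.client as xmlrpclib     # assumption: Python 3 equivalent

payload = b"\x00\x01binary\xff"           # arbitrary bytes, including a NUL

# Marshal the wrapper; the data travels base64-encoded in a <base64> element.
packet = xmlrpclib.dumps((xmlrpclib.Binary(payload),))

# Unmarshal it again; the single parameter comes back as a Binary instance.
params, methodname = xmlrpclib.loads(packet)
restored = params[0]
print(restored.data == payload)
```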
Fault Objects
-------------
A Fault object encapsulates the content of an XML-RPC fault tag. Fault
objects have the following members:
Fault.faultCode~
An integer indicating the fault type.
Fault.faultString~
A string containing a diagnostic message associated with the fault.
In the following example we're going to intentionally cause a Fault by
returning a complex type object. The server code:: >
    from SimpleXMLRPCServer import SimpleXMLRPCServer

    # A marshalling error is going to occur because we're returning a
    # complex number
    def add(x,y):
        return x+y+0j

    server = SimpleXMLRPCServer(("localhost", 8000))
    print "Listening on port 8000..."
    server.register_function(add, 'add')
    server.serve_forever()
<
The client code for the preceding server:: >
    import xmlrpclib

    proxy = xmlrpclib.ServerProxy("http://localhost:8000/")
    try:
        proxy.add(2, 5)
    except xmlrpclib.Fault, err:
        print "A fault occurred"
        print "Fault code: %d" % err.faultCode
        print "Fault string: %s" % err.faultString
<
ProtocolError Objects
---------------------
A ProtocolError object describes a protocol error in the underlying
transport layer (such as a 404 'not found' error if the server named by the URI
does not exist). It has the following members:
ProtocolError.url~
The URI or URL that triggered the error.
ProtocolError.errcode~
The error code.
ProtocolError.errmsg~
The error message or diagnostic string.
ProtocolError.headers~
A string containing the headers of the HTTP/HTTPS request that triggered the
error.
In the following example we're going to intentionally cause a ProtocolError
by providing a URI that doesn't point to an XMLRPC server:: >
    import xmlrpclib

    # create a ServerProxy with a URI that doesn't respond to XMLRPC requests
    proxy = xmlrpclib.ServerProxy("http://www.google.com/")

    try:
        proxy.some_method()
    except xmlrpclib.ProtocolError, err:
        print "A protocol error occurred"
        print "URL: %s" % err.url
        print "HTTP/HTTPS headers: %s" % err.headers
        print "Error code: %d" % err.errcode
        print "Error message: %s" % err.errmsg
<
MultiCall Objects
-----------------
.. versionadded:: 2.4
In http://www.xmlrpc.com/discuss/msgReader%241208, an approach is presented to
encapsulate multiple calls to a remote server into a single request.
MultiCall(server)~
Create an object used to boxcar method calls. {server} is the eventual target of
the call. Calls can be made to the result object, but they will immediately
return ``None``, and only store the call name and parameters in the
MultiCall object. Calling the object itself causes all stored calls to
be transmitted as a single ``system.multicall`` request. The result of this call
is a generator; iterating over this generator yields the individual
results.
A usage example of this class follows. The server code :: >
    from SimpleXMLRPCServer import SimpleXMLRPCServer

    def add(x,y):
        return x+y

    def subtract(x, y):
        return x-y

    def multiply(x, y):
        return x*y

    def divide(x, y):
        return x/y

    # A simple server with simple arithmetic functions
    server = SimpleXMLRPCServer(("localhost", 8000))
    print "Listening on port 8000..."
    server.register_multicall_functions()
    server.register_function(add, 'add')
    server.register_function(subtract, 'subtract')
    server.register_function(multiply, 'multiply')
    server.register_function(divide, 'divide')
    server.serve_forever()
<
The client code for the preceding server:: >
    import xmlrpclib

    proxy = xmlrpclib.ServerProxy("http://localhost:8000/")
    multicall = xmlrpclib.MultiCall(proxy)
    multicall.add(7,3)
    multicall.subtract(7,3)
    multicall.multiply(7,3)
    multicall.divide(7,3)
    result = multicall()

    print "7+3=%d, 7-3=%d, 7*3=%d, 7/3=%d" % tuple(result)
<
Convenience Functions
---------------------
boolean(value)~
Convert any Python value to one of the XML-RPC Boolean constants, ``True`` or
``False``.
dumps(params[, methodname[, methodresponse[, encoding[, allow_none]]]])~
Convert {params} into an XML-RPC request, or into a response if {methodresponse}
is true. {params} can be either a tuple of arguments or an instance of the
Fault exception class. If {methodresponse} is true, only a single value
can be returned, meaning that {params} must be of length 1. {encoding}, if
supplied, is the encoding to use in the generated XML; the default is UTF-8.
Python's None value cannot be used in standard XML-RPC; to allow using
it via an extension, provide a true value for {allow_none}.
loads(data[, use_datetime])~
Convert an XML-RPC request or response into Python objects, a ``(params,
methodname)`` tuple. {params} is a tuple of arguments; {methodname} is a string, or
``None`` if no method name is present in the packet. If the XML-RPC packet
represents a fault condition, this function will raise a Fault exception.
The {use_datetime} flag can be used to cause date/time values to be presented as
datetime.datetime objects; this is false by default.
.. versionchanged:: 2.5
The {use_datetime} flag was added.
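Because dumps and loads work on plain strings, they can be exercised without
any server. The sketch below is an illustration only: the method name
"example.echo", the fault code 42 and the other sample values are made up,
and the fallback import of xmlrpc.client is an assumption so that the snippet
also runs on Python 3.

```python
try:
    import xmlrpclib                      # Python 2 name
except ImportError:
    import xmlrpc.client as xmlrpclib     # assumption: Python 3 equivalent

# A request: a tuple of arguments plus a (made-up) method name.
request = xmlrpclib.dumps((7, "spam"), methodname="example.echo")
params, methodname = xmlrpclib.loads(request)

# A response: exactly one value, and no method name in the packet.
response = xmlrpclib.dumps(("ok",), methodresponse=True)
result, no_name = xmlrpclib.loads(response)

# None is only marshalled when the allow_none extension is requested.
none_packet = xmlrpclib.dumps((None,), allow_none=True)

# A Fault instance is marshalled as a fault response; loads re-raises it.
fault_packet = xmlrpclib.dumps(xmlrpclib.Fault(42, "demo failure"))
try:
    xmlrpclib.loads(fault_packet)
    fault = None
except xmlrpclib.Fault as err:
    fault = err
```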
Example of Client Usage
-----------------------
:: >
    # simple test program (from the XML-RPC specification)
    from xmlrpclib import ServerProxy, Error

    # server = ServerProxy("http://localhost:8000") # local server
    server = ServerProxy("http://betty.userland.com")

    print server

    try:
        print server.examples.getStateName(41)
    except Error, v:
        print "ERROR", v
<
To access an XML-RPC server through a proxy, you need to define a custom
transport. The following example shows how:
.. Example taken from http://lowlife.jp/nobonobo/wiki/xmlrpcwithproxy.html
:: >
    import xmlrpclib, httplib

    class ProxiedTransport(xmlrpclib.Transport):
        def set_proxy(self, proxy):
            self.proxy = proxy
        def make_connection(self, host):
            self.realhost = host
            h = httplib.HTTP(self.proxy)
            return h
        def send_request(self, connection, handler, request_body):
            connection.putrequest("POST", 'http://%s%s' % (self.realhost, handler))
        def send_host(self, connection, host):
            connection.putheader('Host', self.realhost)

    p = ProxiedTransport()
    p.set_proxy('proxy-server:8080')
    server = xmlrpclib.Server('http://time.xmlrpc.com/RPC2', transport=p)
    print server.currentTime.getCurrentTime()
<
Example of Client and Server Usage
----------------------------------
See simplexmlrpcserver-example.
==============================================================================
*py2stdlib-zipfile*
zipfile~
:synopsis: Read and write ZIP-format archive files.
.. versionadded:: 1.6
The ZIP file format is a common archive and compression standard. This module
provides tools to create, read, write, append, and list a ZIP file. Any
advanced use of this module will require an understanding of the format, as
defined in the PKZIP Application Note
(http://www.pkware.com/documents/casestudies/APPNOTE.TXT).
This module does not currently handle multi-disk ZIP files, or ZIP files
which have appended comments (although it correctly handles comments
added to individual archive members---for which see the zipinfo-objects
documentation). It can handle ZIP files that use the ZIP64 extensions
(that is ZIP files that are more than 4 GByte in size). It supports
decryption of encrypted files in ZIP archives, but it currently cannot
create an encrypted file. Decryption is extremely slow as it is
implemented in native Python rather than C.
For other archive formats, see the bz2 (|py2stdlib-bz2|), gzip (|py2stdlib-gzip|), and
tarfile (|py2stdlib-tarfile|) modules.
The module defines the following items:
BadZipfile~
The error raised for bad ZIP files (old name: ``zipfile.error``).
LargeZipFile~
The error raised when a ZIP file would require ZIP64 functionality but that has
not been enabled.
ZipFile~
The class for reading and writing ZIP files. See section
zipfile-objects for constructor details.
PyZipFile~
Class for creating ZIP archives containing Python libraries.
ZipInfo([filename[, date_time]])~
Class used to represent information about a member of an archive. Instances
of this class are returned by the getinfo and infolist
methods of ZipFile objects. Most users of the zipfile (|py2stdlib-zipfile|) module
will not need to create these, but only use those created by this
module. {filename} should be the full name of the archive member, and
{date_time} should be a tuple containing six fields which describe the time
of the last modification to the file; the fields are described in section
zipinfo-objects.
is_zipfile(filename)~
Returns ``True`` if {filename} is a valid ZIP file based on its magic number,
otherwise returns ``False``. {filename} may be a file or file-like object too.
This module does not currently handle ZIP files which have appended comments.
.. versionchanged:: 2.7
Support for file and file-like objects.
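As a small illustration of the file-like object support, the check can be run
against in-memory buffers. This is a sketch under assumptions: the member name
and contents are arbitrary, and io.BytesIO stands in for a real file.

```python
import io
import zipfile

# Build a tiny archive in memory (arbitrary member name and text).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr("hello.txt", "hello")

# A buffer that holds a ZIP archive versus one that holds unrelated bytes.
looks_like_zip = zipfile.is_zipfile(buf)
looks_like_junk = zipfile.is_zipfile(io.BytesIO(b"not a zip"))
print(looks_like_zip)
print(looks_like_junk)
```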
ZIP_STORED~
The numeric constant for an uncompressed archive member.
ZIP_DEFLATED~
The numeric constant for the usual ZIP compression method. This requires the
zlib module. No other compression methods are currently supported.
.. seealso::
PKZIP Application Note (http://www.pkware.com/documents/casestudies/APPNOTE.TXT)
Documentation on the ZIP file format by Phil Katz, the creator of the format and
algorithms used.
Info-ZIP Home Page (http://www.info-zip.org/)
Information about the Info-ZIP project's ZIP archive programs and development
libraries.
ZipFile Objects
---------------
ZipFile(file[, mode[, compression[, allowZip64]]])~
Open a ZIP file, where {file} can be either a path to a file (a string) or a
file-like object. The {mode} parameter should be ``'r'`` to read an existing
file, ``'w'`` to truncate and write a new file, or ``'a'`` to append to an
existing file. If {mode} is ``'a'`` and {file} refers to an existing ZIP
file, then additional files are added to it. If {file} does not refer to a
ZIP file, then a new ZIP archive is appended to the file. This is meant for
adding a ZIP archive to another file (such as python.exe).
.. versionchanged:: 2.6
If {mode} is ``a`` and the file does not exist at all, it is created.
{compression} is the ZIP compression method to use when writing the archive,
and should be ZIP_STORED or ZIP_DEFLATED; unrecognized
values will cause RuntimeError to be raised. If ZIP_DEFLATED
is specified but the zlib (|py2stdlib-zlib|) module is not available, RuntimeError
is also raised. The default is ZIP_STORED. If {allowZip64} is
``True`` zipfile will create ZIP files that use the ZIP64 extensions when
the zipfile is larger than 2 GB. If it is false (the default) zipfile (|py2stdlib-zipfile|)
will raise an exception when the ZIP file would require ZIP64 extensions.
ZIP64 extensions are disabled by default because the default zip
and unzip commands on Unix (the InfoZIP utilities) don't support
these extensions.
ZipFile is also a context manager and therefore supports the
with statement. In the example, {myzip} is closed after the
with statement's suite is finished---even if an exception occurs:: >
    with ZipFile('spam.zip', 'w') as myzip:
        myzip.write('eggs.txt')
<
.. versionadded:: 2.7
Added the ability to use ZipFile as a context manager.
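The three modes can be seen side by side with an in-memory file object. This
is a minimal sketch, assuming the zlib module is available for ZIP_DEFLATED;
the member names and contents are arbitrary.

```python
import io
import zipfile

buf = io.BytesIO()

# 'w' truncates and starts a new archive.
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("first.txt", "one")

# 'a' adds further members to the existing archive.
with zipfile.ZipFile(buf, "a", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("second.txt", "two")

# 'r' opens the archive for reading only.
with zipfile.ZipFile(buf, "r") as archive:
    names = archive.namelist()
    second = archive.read("second.txt")

print(names)
```

Each with statement closes the archive on exit, which is what writes the
central directory records.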
ZipFile.close()~
Close the archive file. You must call close before exiting your program
or essential records will not be written.
ZipFile.getinfo(name)~
Return a ZipInfo object with information about the archive member
{name}. Calling getinfo for a name not currently contained in the
archive will raise a KeyError.
ZipFile.infolist()~
Return a list containing a ZipInfo object for each member of the
archive. The objects are in the same order as their entries in the actual ZIP
file on disk if an existing archive was opened.
ZipFile.namelist()~
Return a list of archive members by name.
ZipFile.open(name[, mode[, pwd]])~
Extract a member from the archive as a file-like object (ZipExtFile). {name} is
the name of the file in the archive, or a ZipInfo object. The {mode}
parameter, if included, must be one of the following: ``'r'`` (the default),
``'U'``, or ``'rU'``. Choosing ``'U'`` or ``'rU'`` will enable universal newline
support in the read-only object. {pwd} is the password used for encrypted files.
Calling open on a closed ZipFile will raise a RuntimeError.
.. note:: >
The file-like object is read-only and provides the following methods:
read, readline, readlines, __iter__,
next.
<
.. note::
If the ZipFile was created by passing in a file-like object as the first
argument to the constructor, then the object returned by .open shares the
ZipFile's file pointer. Under these circumstances, the object returned by
.open should not be used after any additional operations are performed
on the ZipFile object. If the ZipFile was created by passing in a string (the
filename) as the first argument to the constructor, then .open will
create a new file object that will be held by the ZipExtFile, allowing it to
operate independently of the ZipFile.
.. note:: >
The open, read and extract methods can take a filename
or a ZipInfo object. You will appreciate this when trying to read a
ZIP file that contains members with duplicate names.
<
.. versionadded:: 2.6
ZipFile.extract(member[, path[, pwd]])~
Extract a member from the archive to the current working directory; {member}
must be its full name or a ZipInfo object. Its file information is
extracted as accurately as possible. {path} specifies a different directory
to extract to. {pwd} is the password used for encrypted files.
.. versionadded:: 2.6
ZipFile.extractall([path[, members[, pwd]]])~
Extract all members from the archive to the current working directory. {path}
specifies a different directory to extract to. {members} is optional and must
be a subset of the list returned by namelist. {pwd} is the password
used for encrypted files.
.. warning:: >
Never extract archives from untrusted sources without prior inspection.
It is possible that files are created outside of {path}, e.g. members
that have absolute filenames starting with ``"/"`` or filenames with two
dots ``".."``.
<
.. versionadded:: 2.6
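One way to carry out the inspection that the warning asks for is to look at
every member name before calling extractall. The helper below is hypothetical
(suspicious_members is not part of the zipfile module) and deliberately
simple: it only flags absolute paths, drive letters and ".." components.

```python
import io
import zipfile

def suspicious_members(archive):
    """Return member names that could end up outside the target directory."""
    flagged = []
    for name in archive.namelist():
        normalized = name.replace("\\", "/")
        has_drive = len(normalized) > 1 and normalized[1] == ":"
        is_absolute = normalized.startswith("/") or has_drive
        if is_absolute or ".." in normalized.split("/"):
            flagged.append(name)
    return flagged

# Demonstration archive built in memory; both member names are made up.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr("docs/readme.txt", "fine")
    archive.writestr("../outside.txt", "would land outside the target")

with zipfile.ZipFile(buf) as archive:
    flagged = suspicious_members(archive)

print(flagged)
```

An archive for which the list is non-empty should not be passed to extractall
as is; the safe names could be supplied through the {members} argument
instead.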
ZipFile.printdir()~
Print a table of contents for the archive to ``sys.stdout``.
ZipFile.setpassword(pwd)~
Set {pwd} as default password to extract encrypted files.
.. versionadded:: 2.6
ZipFile.read(name[, pwd])~
Return the bytes of the file {name} in the archive. {name} is the name of the
file in the archive, or a ZipInfo object. The archive must be open for
read or append. {pwd} is the password used for encrypted files and, if specified,
it will override the default password set with setpassword. Calling
read on a closed ZipFile will raise a RuntimeError.
.. versionchanged:: 2.6
{pwd} was added, and {name} can now be a ZipInfo object.
ZipFile.testzip()~
Read all the files in the archive and check their CRCs and file headers.
Return the name of the first bad file, or else return ``None``. Calling
testzip on a closed ZipFile will raise a RuntimeError.
ZipFile.write(filename[, arcname[, compress_type]])~
Write the file named {filename} to the archive, giving it the archive name
{arcname} (by default, this will be the same as {filename}, but without a drive
letter and with leading path separators removed). If given, {compress_type}
overrides the value given for the {compression} parameter to the constructor for
the new entry. The archive must be open with mode ``'w'`` or ``'a'`` -- calling
write on a ZipFile created with mode ``'r'`` will raise a
RuntimeError. Calling write on a closed ZipFile will raise a
RuntimeError.
.. note:: >
There is no official file name encoding for ZIP files. If you have unicode file
names, you must convert them to byte strings in your desired encoding before
passing them to write. WinZip interprets all file names as encoded in
CP437, also known as DOS Latin.
<
.. note::
Archive names should be relative to the archive root, that is, they should not
start with a path separator.
.. note:: >
If ``arcname`` (or ``filename``, if ``arcname`` is not given) contains a null
byte, the name of the file in the archive will be truncated at the null byte.
<
ZipFile.writestr(zinfo_or_arcname, bytes[, compress_type])~
Write the string {bytes} to the archive; {zinfo_or_arcname} is either the file
name it will be given in the archive, or a ZipInfo instance. If it's
an instance, at least the filename, date, and time must be given. If it's a
name, the date and time are set to the current date and time. The archive must be
opened with mode ``'w'`` or ``'a'`` -- calling writestr on a ZipFile
created with mode ``'r'`` will raise a RuntimeError. Calling
writestr on a closed ZipFile will raise a RuntimeError.
If given, {compress_type} overrides the value given for the {compression}
parameter to the constructor for the new entry, or in the {zinfo_or_arcname}
(if that is a ZipInfo instance).
.. note:: >
When passing a ZipInfo instance as the {zinfo_or_arcname} parameter,
the compression method used will be that specified in the {compress_type}
member of the given ZipInfo instance. By default, the
ZipInfo constructor sets this member to ZIP_STORED.
<
.. versionchanged:: 2.7
The {compress_type} argument was added.
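The difference between passing a name and passing a ZipInfo instance can be
sketched as follows. The member names, the timestamp and the 1000-byte
payload are arbitrary, and ZIP_DEFLATED assumes that zlib is available.

```python
import io
import zipfile

# A ZipInfo carries its own name, timestamp and compression method.
info = zipfile.ZipInfo("notes/today.txt", date_time=(2010, 9, 1, 12, 30, 0))
info.compress_type = zipfile.ZIP_DEFLATED    # constructor default is ZIP_STORED

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:   # archive default: ZIP_STORED
    archive.writestr(info, "x" * 1000)         # uses the ZipInfo's settings
    archive.writestr("plain.txt", "x" * 1000)  # name only: archive default

with zipfile.ZipFile(buf) as archive:
    stored = archive.getinfo("notes/today.txt")
    plain = archive.getinfo("plain.txt")

print(stored.compress_type == zipfile.ZIP_DEFLATED)
print(plain.compress_type == zipfile.ZIP_STORED)
```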
The following data attributes are also available:
ZipFile.debug~
The level of debug output to use. This may be set from ``0`` (the default, no
output) to ``3`` (the most output). Debugging information is written to
``sys.stdout``.
ZipFile.comment~
The comment text associated with the ZIP file. If assigning a comment to a
ZipFile instance created with mode 'a' or 'w', this should be a
string no longer than 65535 bytes. Comments longer than this will be
truncated in the written archive when ZipFile.close is called.
PyZipFile Objects
-----------------
The PyZipFile constructor takes the same parameters as the
ZipFile constructor. Instances have one method in addition to those of
ZipFile objects.
PyZipFile.writepy(pathname[, basename])~
Search for files \*.py and add the corresponding file to the archive.
The corresponding file is a \*.pyo file if available, else a
\*.pyc file, compiling if necessary. If the pathname is a file, the
filename must end with .py, and just the (corresponding
\*.py[co]) file is added at the top level (no path information). If the
pathname is a file that does not end with .py, a RuntimeError
will be raised. If it is a directory, and the directory is not a package
directory, then all the files \*.py[co] are added at the top level. If
the directory is a package directory, then all \*.py[co] are added under
the package name as a file path, and if any subdirectories are package
directories, all of these are added recursively. {basename} is intended for
internal use only. The writepy method makes archives with file names
like this:: >
string.pyc # Top level name
test/__init__.pyc # Package directory
test/test_support.pyc # Module test.test_support
test/bogus/__init__.pyc # Subpackage directory
test/bogus/myfile.pyc # Submodule test.bogus.myfile
<
ZipInfo Objects
---------------
Instances of the ZipInfo class are returned by the getinfo and
infolist methods of ZipFile objects. Each object stores
information about a single member of the ZIP archive.
Instances have the following attributes:
ZipInfo.filename~
Name of the file in the archive.
ZipInfo.date_time~
The time and date of the last modification to the archive member. This is a
tuple of six values:
+-------+--------------------------+
| Index | Value |
+=======+==========================+
| ``0`` | Year |
+-------+--------------------------+
| ``1`` | Month (one-based) |
+-------+--------------------------+
| ``2`` | Day of month (one-based) |
+-------+--------------------------+
| ``3`` | Hours (zero-based) |
+-------+--------------------------+
| ``4`` | Minutes (zero-based) |
+-------+--------------------------+
| ``5`` | Seconds (zero-based) |
+-------+--------------------------+
ZipInfo.compress_type~
Type of compression for the archive member.
ZipInfo.comment~
Comment for the individual archive member.
ZipInfo.extra~
Expansion field data. The PKZIP Application Note
(http://www.pkware.com/documents/casestudies/APPNOTE.TXT) contains
some comments on the internal structure of the data contained in this string.
ZipInfo.create_system~
System which created ZIP archive.
ZipInfo.create_version~
PKZIP version which created ZIP archive.
ZipInfo.extract_version~
PKZIP version needed to extract archive.
ZipInfo.reserved~
Must be zero.
ZipInfo.flag_bits~
ZIP flag bits.
ZipInfo.volume~
Volume number of file header.
ZipInfo.internal_attr~
Internal attributes.
ZipInfo.external_attr~
External file attributes.
ZipInfo.header_offset~
Byte offset to the file header.
ZipInfo.CRC~
CRC-32 of the uncompressed file.
ZipInfo.compress_size~
Size of the compressed data.
ZipInfo.file_size~
Size of the uncompressed file.
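A few of these attributes can be read back from an archive as in the sketch
below. It is an illustration under assumptions: the member name, timestamp and
data are arbitrary, ZIP_DEFLATED assumes that zlib is available, and the CRC
comparison relies on masking zlib.crc32 to an unsigned 32-bit value.

```python
import datetime
import io
import zipfile
import zlib

data = b"spam " * 200                        # 1000 bytes of sample data
member = zipfile.ZipInfo("spam.txt", date_time=(2010, 9, 1, 21, 37, 50))
member.compress_type = zipfile.ZIP_DEFLATED

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr(member, data)

with zipfile.ZipFile(buf) as archive:
    info = archive.infolist()[0]

# date_time is a six-field tuple, so it unpacks straight into a datetime.
modified = datetime.datetime(*info.date_time)

# CRC is the CRC-32 of the uncompressed data.
crc_matches = info.CRC == (zlib.crc32(data) & 0xffffffff)

print("%s modified %s" % (info.filename, modified.isoformat()))
print("%d bytes stored as %d bytes" % (info.file_size, info.compress_size))
```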
==============================================================================
*py2stdlib-zipimport*
zipimport~
:synopsis: support for importing Python modules from ZIP archives.
.. versionadded:: 2.3
This module adds the ability to import Python modules (\*.py,
\*.py[co]) and packages from ZIP-format archives. It is usually not
needed to use the zipimport (|py2stdlib-zipimport|) module explicitly; it is automatically used
by the built-in import mechanism for ``sys.path`` items that are paths
to ZIP archives.
Typically, ``sys.path`` is a list of directory names as strings. This module
also allows an item of ``sys.path`` to be a string naming a ZIP file archive.
The ZIP archive can contain a subdirectory structure to support package imports,
and a path within the archive can be specified to only import from a
subdirectory. For example, the path /tmp/example.zip/lib/ would only
import from the lib/ subdirectory within the archive.
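The subdirectory form of a ``sys.path`` entry can be tried with a throwaway
archive. This is a minimal sketch: the archive is created in a temporary
directory (which is not cleaned up here), and the module name zipdemo_mod is
made up for the demonstration.

```python
import os
import sys
import tempfile
import zipfile

# Create example.zip containing lib/zipdemo_mod.py (hypothetical module).
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "example.zip")
with zipfile.ZipFile(archive_path, "w") as archive:
    archive.writestr("lib/zipdemo_mod.py", "ANSWER = 42\n")

# Only the lib/ subdirectory of the archive is put on the import path.
sys.path.insert(0, os.path.join(archive_path, "lib"))

import zipdemo_mod

print(zipdemo_mod.ANSWER)
print(zipdemo_mod.__file__)
```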
Any files may be present in the ZIP archive, but only files .py and
.py[co] are available for import. ZIP import of dynamic modules
(.pyd, .so) is disallowed. Note that if an archive only contains
.py files, Python will not attempt to modify the archive by adding the
corresponding .pyc or .pyo file, meaning that if a ZIP archive
doesn't contain .pyc files, importing may be rather slow.
Using the built-in reload function will fail if called on a module
loaded from a ZIP archive; it is unlikely that reload would be needed,
since this would imply that the ZIP has been altered during runtime.
ZIP archives with an archive comment are currently not supported.
.. seealso::
PKZIP Application Note (http://www.pkware.com/documents/casestudies/APPNOTE.TXT)
Documentation on the ZIP file format by Phil Katz, the creator of the format and
algorithms used.
PEP 273 - Import Modules from Zip Archives
Written by James C. Ahlstrom, who also provided an implementation. Python 2.3
follows the specification in PEP 273, but uses an implementation written by Just
van Rossum that uses the import hooks described in PEP 302.
PEP 302 - New Import Hooks
The PEP to add the import hooks that help this module work.
This module defines an exception:
ZipImportError~
Exception raised by zipimporter objects. It's a subclass of ImportError,
so it can be caught as ImportError, too.
zipimporter Objects
-------------------
zipimporter is the class for importing ZIP files.
zipimporter(archivepath)~
Create a new zipimporter instance. {archivepath} must be a path to a ZIP
file, or to a specific path within a ZIP file. For example, an {archivepath}
of foo/bar.zip/lib will look for modules in the lib directory
inside the ZIP file foo/bar.zip (provided that it exists).
ZipImportError is raised if {archivepath} doesn't point to a valid ZIP
archive.
find_module(fullname[, path])~
Search for a module specified by {fullname}. {fullname} must be the fully
qualified (dotted) module name. It returns the zipimporter instance itself
if the module was found, or None if it wasn't. The optional
{path} argument is ignored---it's there for compatibility with the
importer protocol.
get_code(fullname)~
Return the code object for the specified module. Raise
ZipImportError if the module couldn't be found.
get_data(pathname)~
Return the data associated with {pathname}. Raise IOError if the
file wasn't found.
get_filename(fullname)~
Return the value ``__file__`` would be set to if the specified module
was imported. Raise ZipImportError if the module couldn't be
found.
.. versionadded:: 2.7
get_source(fullname)~
Return the source code for the specified module. Raise
ZipImportError if the module couldn't be found, return
None if the archive does contain the module, but has no source
for it.
is_package(fullname)~
Return True if the module specified by {fullname} is a package. Raise
ZipImportError if the module couldn't be found.
load_module(fullname)~
Load the module specified by {fullname}. {fullname} must be the fully
qualified (dotted) module name. It returns the imported module, or raises
ZipImportError if it wasn't found.
archive~
The file name of the importer's associated ZIP file, without a possible
subpath.
prefix~
The subpath within the ZIP file where modules are searched. This is the
empty string for zipimporter objects which point to the root of the ZIP
file.
The archive and prefix attributes, when combined with a
slash, equal the original {archivepath} argument given to the
zipimporter constructor.
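The relationship between {archivepath}, archive and prefix can be checked
directly. The sketch below makes some assumptions: the archive is created in a
temporary directory that is not cleaned up, and the module name
zipdemo_helper is made up.

```python
import os
import tempfile
import zipfile
import zipimport

# Create bar.zip containing lib/zipdemo_helper.py (hypothetical module).
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "bar.zip")
with zipfile.ZipFile(archive_path, "w") as archive:
    archive.writestr("lib/zipdemo_helper.py", "VALUE = 'from zip'\n")

# Point the importer at the lib directory inside the archive.
importer = zipimport.zipimporter(os.path.join(archive_path, "lib"))

print(importer.archive)    # the ZIP file itself
print(importer.prefix)     # the subpath searched for modules

source = importer.get_source("zipdemo_helper")
is_pkg = importer.is_package("zipdemo_helper")
```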
Examples
--------
Here is an example that imports a module from a ZIP archive - note that the
zipimport (|py2stdlib-zipimport|) module is not explicitly used. :: >
$ unzip -l /tmp/example.zip
Archive: /tmp/example.zip
Length Date Time Name
-------- ---- ---- ----
8467 11-26-02 22:30 jwzthreading.py
-------- -------
8467 1 file
$ ./python
Python 2.3 (#1, Aug 1 2003, 19:54:32)
>>> import sys
>>> sys.path.insert(0, '/tmp/example.zip') # Add .zip file to front of path
>>> import jwzthreading
>>> jwzthreading.__file__
'/tmp/example.zip/jwzthreading.py'
==============================================================================
*py2stdlib-zlib*
zlib~
:synopsis: Low-level interface to compression and decompression routines compatible with
gzip.
For applications that require data compression, the functions in this module
allow compression and decompression, using the zlib library. The zlib library
has its own home page at http://www.zlib.net. There are known
incompatibilities between the Python module and versions of the zlib library
earlier than 1.1.3; 1.1.3 has a security vulnerability, so we recommend using
1.1.4 or later.
zlib's functions have many options and often need to be used in a particular
order. This documentation doesn't attempt to cover all of the permutations;
consult the zlib manual at http://www.zlib.net/manual.html for authoritative
information.
For reading and writing ``.gz`` files see the gzip (|py2stdlib-gzip|) module. For
other archive formats, see the bz2 (|py2stdlib-bz2|), zipfile (|py2stdlib-zipfile|), and
tarfile (|py2stdlib-tarfile|) modules.
The available exception and functions in this module are:
error~
Exception raised on compression and decompression errors.
adler32(data[, value])~
Computes an Adler-32 checksum of {data}. (An Adler-32 checksum is almost as
reliable as a CRC32 but can be computed much more quickly.) If {value} is
present, it is used as the starting value of the checksum; otherwise, a fixed
default value is used. This allows computing a running checksum over the
concatenation of several inputs. The algorithm is not cryptographically
strong, and should not be used for authentication or digital signatures. Since
the algorithm is designed for use as a checksum algorithm, it is not suitable
for use as a general hash algorithm.
This function always returns an integer object.
.. note::
To generate the same numeric value across all Python versions and
platforms use adler32(data) & 0xffffffff. If you are only using
the checksum in packed binary format this is not necessary as the
return value is the correct 32bit binary representation
regardless of sign.
.. versionchanged:: 2.6
The return value is in the range [-2**31, 2**31-1]
regardless of platform. In older versions the value is
signed on some platforms and unsigned on others.
.. versionchanged:: 3.0
The return value is unsigned and in the range [0, 2**32-1]
regardless of platform.
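The masking recommended in the note looks like this in practice. A minimal
sketch; the input bytes are arbitrary.

```python
import zlib

data = b"hello world"

# Mask to 32 bits so the number is the same on every version and platform.
checksum = zlib.adler32(data) & 0xffffffff
print("%08x" % checksum)
```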
compress(string[, level])~
Compresses the data in {string}, returning a string containing the compressed data.
{level} is an integer from ``1`` to ``9`` controlling the level of compression;
``1`` is fastest and produces the least compression, ``9`` is slowest and
produces the most. The default value is ``6``. Raises the error
exception if any error occurs.
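A short round trip shows the effect of {level}. This is only a sketch; the
input is an arbitrary, highly repetitive byte string, so the sizes printed
are not representative of real data.

```python
import zlib

original = b"spam and eggs " * 100

fast = zlib.compress(original, 1)    # fastest, least compression
best = zlib.compress(original, 9)    # slowest, most compression
restored = zlib.decompress(best)

print(len(original))
print(len(fast))
print(len(best))
```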
compressobj([level])~
Returns a compression object, to be used for compressing data streams that won't
fit into memory at once. {level} is an integer from ``1`` to ``9`` controlling
the level of compression; ``1`` is fastest and produces the least compression,
``9`` is slowest and produces the most. The default value is ``6``.
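A compression object is typically fed one chunk at a time and flushed at the
end. The helper compress_chunks below is hypothetical, written only for this
sketch, and the chunks are arbitrary.

```python
import zlib

def compress_chunks(chunks, level=6):
    """Compress an iterable of byte strings without joining them first."""
    compressor = zlib.compressobj(level)
    pieces = []
    for chunk in chunks:
        pieces.append(compressor.compress(chunk))
    # flush() emits whatever the compressor is still holding back.
    pieces.append(compressor.flush())
    return b"".join(pieces)

chunks = [b"first chunk, ", b"second chunk, ", b"third chunk"]
packed = compress_chunks(chunks)
print(zlib.decompress(packed) == b"".join(chunks))
```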
crc32(data[, value])~
Computes a CRC (Cyclic Redundancy Check) checksum of {data}. If {value} is
present, it is used as the starting value of the checksum; otherwise, a fixed
default value is used. This allows computing a running checksum over the
concatenation of several inputs. The algorithm is not cryptographically
strong, and should not be used for authentication or digital signatures. Since
the algorithm is designed for use as a checksum algorithm, it is not suitable
for use as a general hash algorithm.
This function always returns an integer object.
.. note::
To generate the same numeric value across all Python versions and
platforms use crc32(data) & 0xffffffff. If you are only using
the checksum in packed binary format this is not necessary as the
return value is the correct 32-bit binary representation
regardless of sign.
.. versionchanged:: 2.6
The return value is in the range [-2**31, 2**31-1]
regardless of platform. In older versions the value would be
signed on some platforms and unsigned on others.
.. versionchanged:: 3.0
The return value is unsigned and in the range [0, 2**32-1]
regardless of platform.
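The running checksum over several inputs might look like the sketch below. The chunk contents are arbitrary, and the final mask follows the note above.

```python
import zlib

chunks = [b"first part, ", b"second part, ", b"third part"]

# Start with the first chunk, then pass each previous result as the
# starting value for the next chunk.
running = zlib.crc32(chunks[0])
for chunk in chunks[1:]:
    running = zlib.crc32(chunk, running)
running &= 0xffffffff

# A single call over the concatenated input gives the same value.
whole = zlib.crc32(b"".join(chunks)) & 0xffffffff
```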
decompress(string[, wbits[, bufsize]])~
Decompresses the data in {string}, returning a string containing the
uncompressed data. The {wbits} parameter controls the size of the window
buffer, and is discussed further below.
If {bufsize} is given, it is used as the initial size of the output
buffer. Raises the error exception if any error occurs.
The absolute value of {wbits} is the base two logarithm of the size of the
history buffer (the "window size") used when compressing data. Its absolute
value should be between 8 and 15 for the most recent versions of the zlib
library, larger values resulting in better compression at the expense of greater
memory usage. When decompressing a stream, {wbits} must not be smaller
than the size originally used to compress the stream; using a too-small
value will result in an exception. The default value is therefore the
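highest value, 15. When {wbits} is negative, the standard zlib header and
trailer are suppressed and the data is handled as a raw deflate stream (this is
how the gzip (|py2stdlib-gzip|) module uses zlib).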
{bufsize} is the initial size of the buffer used to hold decompressed data. If
more space is required, the buffer size will be increased as needed, so you
don't have to get this value exactly right; tuning it will only save a few calls
to malloc. The default size is 16384.
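The effect of {wbits} and {bufsize} can be illustrated with the sketch below. The sample data is arbitrary. The slice that produces a header-less stream assumes the layout written by compress: a 2-byte header, the compressed data, and a 4-byte checksum trailer.

```python
import zlib

data = b"window size example " * 200
compressed = zlib.compress(data)

# Explicit wbits and bufsize. 15 is the default (largest) window and
# the output buffer grows beyond 1024 bytes as needed.
restored = zlib.decompress(compressed, 15, 1024)

# A window smaller than the one used for compression is rejected.
try:
    zlib.decompress(compressed, 9)
    too_small_failed = False
except zlib.error:
    too_small_failed = True

# Negative wbits: no header is expected. The raw stream is obtained
# here by dropping the 2-byte header and the 4-byte trailer (an
# assumption about the stream layout, see the lead-in).
raw = compressed[2:-4]
restored_raw = zlib.decompress(raw, -15)
```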
decompressobj([wbits])~
Returns a decompression object, to be used for decompressing data streams that
won't fit into memory at once. The {wbits} parameter controls the size of the
window buffer.
Compression objects support the following methods:
Compress.compress(string)~
Compress {string}, returning a string containing compressed data for at least
part of the data in {string}. This data should be concatenated to the output
produced by any preceding calls to the compress method. Some input may
be kept in internal buffers for later processing.
Compress.flush([mode])~
All pending input is processed, and a string containing the remaining compressed
output is returned. {mode} can be selected from the constants
Z_SYNC_FLUSH, Z_FULL_FLUSH, or Z_FINISH,
defaulting to Z_FINISH. Z_SYNC_FLUSH and
Z_FULL_FLUSH allow compressing further strings of data, while
Z_FINISH finishes the compressed stream and prevents compressing any
more data. After calling flush with {mode} set to Z_FINISH,
the compress method cannot be called again; the only realistic action is
to delete the object.
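A minimal sketch of piecewise compression with these two methods. The helper name compress_stream and the sample chunks are hypothetical and only serve the illustration.

```python
import zlib

def compress_stream(chunks, level=6):
    """Hypothetical helper: compress an iterable of byte strings piecewise."""
    compressor = zlib.compressobj(level)
    pieces = []
    for chunk in chunks:
        # May be empty while input is still held in internal buffers.
        pieces.append(compressor.compress(chunk))
    # Default mode is Z_FINISH: the stream is closed after this call.
    pieces.append(compressor.flush())
    return b"".join(pieces)

chunks = [b"alpha " * 100, b"beta " * 100, b"gamma " * 100]
streamed = compress_stream(chunks)

# Z_SYNC_FLUSH makes everything fed so far decodable, and the
# compressor can still accept more data afterwards.
c = zlib.compressobj()
d = zlib.decompressobj()
part1 = c.compress(b"first message") + c.flush(zlib.Z_SYNC_FLUSH)
seen_so_far = d.decompress(part1)
part2 = c.compress(b", second message") + c.flush()
seen_later = d.decompress(part2)
```

Flushing with Z_SYNC_FLUSH after every message usually costs some compression ratio, so it tends to be reserved for cases where the receiver must see the data promptly.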
Compress.copy()~
Returns a copy of the compression object. This can be used to efficiently
compress a set of data that share a common initial prefix.
.. versionadded:: 2.5
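One way to use copy for data sharing a common prefix is sketched below. The prefix and record contents are invented for the example.

```python
import zlib

prefix = b"common header shared by every record\n" * 50
bodies = [b"record one", b"record two"]

# Compress the shared prefix only once.
base = zlib.compressobj()
head = base.compress(prefix)

outputs = []
for body in bodies:
    # Each copy continues from the state reached after the prefix,
    # leaving the base object untouched for the next record.
    branch = base.copy()
    outputs.append(head + branch.compress(body) + branch.flush())
```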
Decompression objects support the following methods, and two attributes:
Decompress.unused_data~
A string which contains any bytes past the end of the compressed data. That is,
this remains ``""`` until the last byte that contains compression data is
available. If the whole string turned out to contain compressed data, this is
``""``, the empty string.
The only way to determine where a string of compressed data ends is by actually
decompressing it. This means that when compressed data is contained as part of a
larger file, you can only find the end of it by reading data and feeding it
followed by some non-empty string into a decompression object's
decompress method until the unused_data attribute is no longer
the empty string.
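The procedure described above can be sketched as follows. The helper name split_embedded, the chunk size and the sample data are hypothetical.

```python
import zlib

def split_embedded(blob, chunk_size=16):
    """Hypothetical helper: return (decompressed data, bytes after the stream)."""
    d = zlib.decompressobj()
    pieces = []
    pos = 0
    # Feed the input in small chunks until bytes past the end of the
    # compressed stream show up in unused_data.
    while pos < len(blob) and not d.unused_data:
        pieces.append(d.decompress(blob[pos:pos + chunk_size]))
        pos += chunk_size
    # unused_data holds the tail of the last chunk fed; the rest of
    # the blob was never handed to the decompressor.
    return b"".join(pieces), d.unused_data + blob[pos:]

payload = b"compressed section " * 20
trailer = b"TRAILING BYTES THAT ARE NOT COMPRESSED"
restored, leftover = split_embedded(zlib.compress(payload) + trailer)
```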
Decompress.unconsumed_tail~
A string that contains any data that was not consumed by the last
decompress call because it exceeded the limit for the uncompressed data
buffer. This data has not yet been seen by the zlib machinery, so you must feed
it (possibly with further data concatenated to it) back to a subsequent
decompress method call in order to get correct output.
Decompress.decompress(string[, max_length])~
Decompress {string}, returning a string containing the uncompressed data
corresponding to at least part of the data in {string}. This data should be
concatenated to the output produced by any preceding calls to the
decompress method. Some of the input data may be preserved in internal
buffers for later processing.
If the optional parameter {max_length} is supplied then the return value will be
no longer than {max_length}. This may mean that not all of the compressed input
can be processed, and any unconsumed data will be stored in the attribute
unconsumed_tail. This string must be passed to a subsequent call to
decompress if decompression is to continue. If {max_length} is not
supplied then the whole input is decompressed, and unconsumed_tail is an
empty string.
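A sketch of bounded decompression using {max_length} and unconsumed_tail. The helper name bounded_decompress and the limit of 1024 are illustrative assumptions.

```python
import zlib

def bounded_decompress(compressed, max_length=1024):
    """Hypothetical helper: decompress in pieces of at most max_length bytes."""
    d = zlib.decompressobj()
    pieces = []
    pending = compressed
    while pending:
        # Each call returns at most max_length bytes of output.
        pieces.append(d.decompress(pending, max_length))
        # Input that could not be processed yet must be fed back in.
        pending = d.unconsumed_tail
    # flush returns whatever output remains; it is not limited by
    # max_length.
    pieces.append(d.flush())
    return pieces

data = b"0123456789" * 5000
pieces = bounded_decompress(zlib.compress(data))
```

Limiting the output size this way is a common guard against small inputs that expand to very large outputs.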
Decompress.flush([length])~
All pending input is processed, and a string containing the remaining
uncompressed output is returned. After calling flush, the
decompress method cannot be called again; the only realistic action is
to delete the object.
The optional parameter {length} sets the initial size of the output buffer.
Decompress.copy()~
Returns a copy of the decompression object. This can be used to save the state
of the decompressor midway through the data stream in order to speed up random
seeks into the stream at a future point.
.. versionadded:: 2.5
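Saving the decompressor state midway might look like the sketch below. The split point and sample data are arbitrary choices for the example.

```python
import zlib

data = b"".join(("line %d\n" % i).encode("ascii") for i in range(2000))
compressed = zlib.compress(data)
half = len(compressed) // 2

d = zlib.decompressobj()
first = d.decompress(compressed[:half])

# Snapshot the state midway through the stream.
checkpoint = d.copy()

# Both objects can continue independently from the same point.
rest_a = d.decompress(compressed[half:]) + d.flush()
rest_b = checkpoint.decompress(compressed[half:]) + checkpoint.flush()
```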
.. seealso::
Module gzip (|py2stdlib-gzip|)
Reading and writing gzip (|py2stdlib-gzip|)-format files.
http://www.zlib.net
The zlib library home page.
http://www.zlib.net/manual.html
The zlib manual explains the semantics and usage of the library's many
functions.
vim:tw=78:wrap:linebreak:nolist:ts=4:ft=help:norl: