def scan(self, package, categories=None, onerror=None, ignore=None): """ Scan a Python package and any of its subpackages. All top-level objects will be considered; those marked with venusian callback attributes related to ``category`` will be processed. The ``package`` argument should be a reference to a Python package or module object. The ``categories`` argument should be sequence of Venusian callback categories (each category usually a string) or the special value ``None`` which means all Venusian callback categories. The default is ``None``. The ``onerror`` argument should either be ``None`` or a callback function which behaves the same way as the ``onerror`` callback function described in http://docs.python.org/library/pkgutil.html#pkgutil.walk_packages . By default, during a scan, Venusian will propagate all errors that happen during its code importing process, including :exc:`ImportError`. If you use a custom ``onerror`` callback, you can change this behavior. Here's an example ``onerror`` callback that ignores :exc:`ImportError`:: import sys def onerror(name): if not issubclass(sys.exc_info()[0], ImportError): raise # reraise the last exception The ``name`` passed to ``onerror`` is the module or package dotted name that could not be imported due to an exception. .. versionadded:: 1.0 the ``onerror`` callback The ``ignore`` argument allows you to ignore certain modules, packages, or global objects during a scan. It should be a sequence containing strings and/or callables that will be used to match against the full dotted name of each object encountered during a scan. The sequence can contain any of these three types of objects: - A string representing a full dotted name. To name an object by dotted name, use a string representing the full dotted name. For example, if you want to ignore the ``my.package`` package *and any of its subobjects or subpackages* during the scan, pass ``ignore=['my.package']``. - A string representing a relative dotted name. To name an object relative to the ``package`` passed to this method, use a string beginning with a dot. For example, if the ``package`` you've passed is imported as ``my.package``, and you pass ``ignore=['.mymodule']``, the ``my.package.mymodule`` mymodule *and any of its subobjects or subpackages* will be omitted during scan processing. - A callable that accepts a full dotted name string of an object as its single positional argument and returns ``True`` or ``False``. For example, if you want to skip all packages, modules, and global objects with a full dotted path that ends with the word "tests", you can use ``ignore=[re.compile('tests$').search]``. If the callable returns ``True`` (or anything else truthy), the object is ignored, if it returns ``False`` (or anything else falsy) the object is not ignored. *Note that unlike string matches, ignores that use a callable don't cause submodules and subobjects of a module or package represented by a dotted name to also be ignored, they match individual objects found during a scan, including packages, modules, and global objects*. You can mix and match the three types of strings in the list. For example, if the package being scanned is ``my``, ``ignore=['my.package', '.someothermodule', re.compile('tests$').search]`` would cause ``my.package`` (and all its submodules and subobjects) to be ignored, ``my.someothermodule`` to be ignored, and any modules, packages, or global objects found during the scan that have a full dotted name that ends with the word ``tests`` to be ignored. Note that packages and modules matched by any ignore in the list will not be imported, and their top-level code will not be run as a result. A string or callable alone can also be passed as ``ignore`` without a surrounding list. .. versionadded:: 1.0a3 the ``ignore`` argument """ pkg_name = package.__name__ if ignore is not None and not is_nonstr_iter(ignore): ignore = [ignore] def _ignore(fullname): if ignore is not None: for ign in ignore: if isinstance(ign, str): if ign.startswith('.'): # leading dotted name relative to scanned package if fullname.startswith(pkg_name + ign): return True else: # non-leading-dotted name absolute object name if fullname.startswith(ign): return True else: # function if ign(fullname): return True return False def invoke(mod_name, name, ob): fullname = mod_name + '.' + name if _ignore(fullname): return category_keys = categories try: # Some metaclasses do insane things when asked for an # ``ATTACH_ATTR``, like not raising an AttributeError but # some other arbitary exception. Some even shittier # introspected code lets us access ``ATTACH_ATTR`` far but # barfs on a second attribute access for ``attached_to`` # (still not raising an AttributeError, but some other # arbitrary exception). Finally, the shittiest code of all # allows the attribute access of the ``ATTACH_ATTR`` *and* # ``attached_to``, (say, both ``ob.__getattr__`` and # ``attached_categories.__getattr__`` returning a proxy for # any attribute access), which either a) isn't callable or b) # is callable, but, when called, shits its pants in an # potentially arbitrary way (although for b, only TypeError # has been seen in the wild, from PyMongo). Thus the # catchall except: return here, which in any other case would # be high treason. attached_categories = getattr(ob, ATTACH_ATTR) if not attached_categories.attached_to(mod_name, name, ob): return except: return if category_keys is None: category_keys = list(attached_categories.keys()) category_keys.sort() for category in category_keys: callbacks = attached_categories.get(category, []) for callback, cb_mod_name, liftid, scope in callbacks: if cb_mod_name != mod_name: # avoid processing objects that were imported into this # module but were not actually defined there continue callback(self, name, ob) for name, ob in getmembers(package): # whether it's a module or a package, we need to scan its # members; walk_packages only iterates over submodules and # subpackages invoke(pkg_name, name, ob) if hasattr(package, '__path__'): # package, not module results = walk_packages(package.__path__, package.__name__ + '.', onerror=onerror, ignore=_ignore) for importer, modname, ispkg in results: loader = importer.find_module(modname) if loader is not None: # happens on pypy with orphaned pyc try: if hasattr(loader, 'etc'): # python < py3.3 module_type = loader.etc[2] else: # pragma: no cover # py3.3b2+ (importlib-using) module_type = imp.PY_SOURCE fn = loader.get_filename() if fn.endswith(('.pyc', '.pyo', '$py.class')): module_type = imp.PY_COMPILED # only scrape members from non-orphaned source files # and package directories if module_type in (imp.PY_SOURCE, imp.PKG_DIRECTORY): # NB: use __import__(modname) rather than # loader.load_module(modname) to prevent # inappropriate double-execution of module code try: __import__(modname) except Exception: if onerror is not None: onerror(modname) else: raise module = sys.modules.get(modname) if module is not None: for name, ob in getmembers(module, None): invoke(modname, name, ob) finally: if (hasattr(loader, 'file') and hasattr(loader.file, 'close')): loader.file.close()
def scan(self, package, categories=None, onerror=None, ignore=None): """ Scan a Python package and any of its subpackages. All top-level objects will be considered; those marked with venusian callback attributes related to ``category`` will be processed. The ``package`` argument should be a reference to a Python package or module object. The ``categories`` argument should be sequence of Venusian callback categories (each category usually a string) or the special value ``None`` which means all Venusian callback categories. The default is ``None``. The ``onerror`` argument should either be ``None`` or a callback function which behaves the same way as the ``onerror`` callback function described in http://docs.python.org/library/pkgutil.html#pkgutil.walk_packages . By default, during a scan, Venusian will propagate all errors that happen during its code importing process, including :exc:`ImportError`. If you use a custom ``onerror`` callback, you can change this behavior. Here's an example ``onerror`` callback that ignores :exc:`ImportError`:: import sys def onerror(name): if not issubclass(sys.exc_info()[0], ImportError): raise # reraise the last exception The ``name`` passed to ``onerror`` is the module or package dotted name that could not be imported due to an exception. .. note:: the ``onerror`` callback is new as of Venusian 1.0. The ``ignore`` argument allows you to ignore certain modules, packages, or global objects during a scan. It should be a sequence containing strings and/or callables that will be used to match against the full dotted name of each object encountered during a scan. The sequence can contain any of these three types of objects: - A string representing a full dotted name. To name an object by dotted name, use a string representing the full dotted name. For example, if you want to ignore the ``my.package`` package *and any of its subobjects or subpackages* during the scan, pass ``ignore=['my.package']``. - A string representing a relative dotted name. To name an object relative to the ``package`` passed to this method, use a string beginning with a dot. For example, if the ``package`` you've passed is imported as ``my.package``, and you pass ``ignore=['.mymodule']``, the ``my.package.mymodule`` mymodule *and any of its subobjects or subpackages* will be omitted during scan processing. - A callable that accepts a full dotted name string of an object as its single positional argument and returns ``True`` or ``False``. For example, if you want to skip all packages, modules, and global objects with a full dotted path that ends with the word "tests", you can use ``ignore=[re.compile('tests$').search]``. If the callable returns ``True`` (or anything else truthy), the object is ignored, if it returns ``False`` (or anything else falsy) the object is not ignored. *Note that unlike string matches, ignores that use a callable don't cause submodules and subobjects of a module or package represented by a dotted name to also be ignored, they match individual objects found during a scan, including packages, modules, and global objects*. You can mix and match the three types of strings in the list. For example, if the package being scanned is ``my``, ``ignore=['my.package', '.someothermodule', re.compile('tests$').search]`` would cause ``my.package`` (and all its submodules and subobjects) to be ignored, ``my.someothermodule`` to be ignored, and any modules, packages, or global objects found during the scan that have a full dotted name that ends with the word ``tests`` to be ignored. Note that packages and modules matched by any ignore in the list will not be imported, and their top-level code will not be run as a result. A string or callable alone can also be passed as ``ignore`` without a surrounding list. .. note:: the ``ignore`` argument is new as of Venusian 1.0a3. """ pkg_name = package.__name__ if ignore is not None and not is_nonstr_iter(ignore): ignore = [ignore] def _ignore(fullname): if ignore is not None: for ign in ignore: if isinstance(ign, str): if ign.startswith('.'): # leading dotted name relative to scanned package if fullname.startswith(pkg_name + ign): return True else: # non-leading-dotted name absolute object name if fullname.startswith(ign): return True else: # function if ign(fullname): return True return False def invoke(mod_name, name, ob): fullname = mod_name + '.' + name if _ignore(fullname): return category_keys = categories try: # Some metaclasses do insane things when asked for an # ``ATTACH_ATTR``, like not raising an AttributeError but # some other arbitary exception. Some even shittier # introspected code lets us access ``ATTACH_ATTR`` far but # barfs on a second attribute access for ``attached_to`` # (still not raising an AttributeError, but some other # arbitrary exception). Finally, the shittiest code of all # allows the attribute access of the ``ATTACH_ATTR`` *and* # ``attached_to``, (say, both ``ob.__getattr__`` and # ``attached_categories.__getattr__`` returning a proxy for # any attribute access), which either a) isn't callable or b) # is callable, but, when called, shits its pants in an # potentially arbitrary way (although for b, only TypeError # has been seen in the wild, from PyMongo). Thus the # catchall except: return here, which in any other case would # be high treason. attached_categories = getattr(ob, ATTACH_ATTR) if not attached_categories.attached_to(ob): return except: return if category_keys is None: category_keys = list(attached_categories.keys()) category_keys.sort() for category in category_keys: callbacks = attached_categories.get(category, []) for callback, callback_mod_name in callbacks: if callback_mod_name != mod_name: # avoid processing objects that were imported into this # module but were not actually defined there continue callback(self, name, ob) for name, ob in getmembers(package): # whether it's a module or a package, we need to scan its # members; walk_packages only iterates over submodules and # subpackages invoke(pkg_name, name, ob) if hasattr(package, '__path__'): # package, not module results = walk_packages(package.__path__, package.__name__+'.', onerror=onerror, ignore=_ignore) for importer, modname, ispkg in results: loader = importer.find_module(modname) if loader is not None: # happens on pypy with orphaned pyc try: if hasattr(loader, 'etc'): # python < py3.3 module_type = loader.etc[2] else: # pragma: no cover # py3.3b2+ (importlib-using) module_type = imp.PY_SOURCE fn = loader.get_filename() if fn.endswith(('.pyc', '.pyo', '$py.class')): module_type = imp.PY_COMPILED # only scrape members from non-orphaned source files # and package directories if module_type in (imp.PY_SOURCE, imp.PKG_DIRECTORY): # NB: use __import__(modname) rather than # loader.load_module(modname) to prevent # inappropriate double-execution of module code try: __import__(modname) except Exception: if onerror is not None: onerror(modname) else: raise module = sys.modules.get(modname) if module is not None: for name, ob in getmembers(module, None): invoke(modname, name, ob) finally: if ( hasattr(loader, 'file') and hasattr(loader.file,'close') ): loader.file.close()
def scan(self, package, categories=None, onerror=None, ignore=None, recursive=True): """Scan a Python package and any of its subpackages. All top-level objects will be considered; those marked with venusian callback attributes related to ``category`` will be processed. The ``package`` argument should be a reference to a Python package or module object. The ``categories`` argument should be sequence of Venusian callback categories (each category usually a string) or the special value ``None`` which means all Venusian callback categories. The default is ``None``. The ``onerror`` argument should either be ``None`` or a callback function which behaves the same way as the ``onerror`` callback function described in http://docs.python.org/library/pkgutil.html#pkgutil.walk_packages . By default, during a scan, Venusian will propagate all errors that happen during its code importing process, including :exc:`ImportError`. If you use a custom ``onerror`` callback, you can change this behavior. Here's an example ``onerror`` callback that ignores :exc:`ImportError`:: import sys def onerror(name): if not issubclass(sys.exc_info()[0], ImportError): raise # reraise the last exception The ``name`` passed to ``onerror`` is the module or package dotted name that could not be imported due to an exception. .. versionadded:: 1.0 the ``onerror`` callback The ``ignore`` argument allows you to ignore certain modules, packages, or global objects during a scan. It should be a sequence containing strings and/or callables that will be used to match against the full dotted name of each object encountered during a scan. The sequence can contain any of these three types of objects: - A string representing a full dotted name. To name an object by dotted name, use a string representing the full dotted name. For example, if you want to ignore the ``my.package`` package *and any of its subobjects or subpackages* during the scan, pass ``ignore=['my.package']``. - A string representing a relative dotted name. To name an object relative to the ``package`` passed to this method, use a string beginning with a dot. For example, if the ``package`` you've passed is imported as ``my.package``, and you pass ``ignore=['.mymodule']``, the ``my.package.mymodule`` mymodule *and any of its subobjects or subpackages* will be omitted during scan processing. - A callable that accepts a full dotted name string of an object as its single positional argument and returns ``True`` or ``False``. For example, if you want to skip all packages, modules, and global objects with a full dotted path that ends with the word "tests", you can use ``ignore=[re.compile('tests$').search]``. If the callable returns ``True`` (or anything else truthy), the object is ignored, if it returns ``False`` (or anything else falsy) the object is not ignored. *Note that unlike string matches, ignores that use a callable don't cause submodules and subobjects of a module or package represented by a dotted name to also be ignored, they match individual objects found during a scan, including packages, modules, and global objects*. You can mix and match the three types of strings in the list. For example, if the package being scanned is ``my``, ``ignore=['my.package', '.someothermodule', re.compile('tests$').search]`` would cause ``my.package`` (and all its submodules and subobjects) to be ignored, ``my.someothermodule`` to be ignored, and any modules, packages, or global objects found during the scan that have a full dotted name that ends with the word ``tests`` to be ignored. Note that packages and modules matched by any ignore in the list will not be imported, and their top-level code will not be run as a result. A string or callable alone can also be passed as ``ignore`` without a surrounding list. .. versionadded:: 1.0a3 the ``ignore`` argument The ``recursive`` argument allows you to turn off recursive sub-module scanning. Normally when you supply scan with a Python package, it automatically also scans any sub-modules and sub-packages within the packages recursively. You can turn this off by passing in ``recursive=False``. When ``recursive`` is set to ``False``, only the ``__init__.py`` of a package is scanned, nothing else. The behavior for scanning non-package modules is unaffected by ``recursive``. By default, recursive is set to ``True``. .. versionadded:: 1.1a1 the ``recursive`` argument """ pkg_name = package.__name__ if ignore is not None and not is_nonstr_iter(ignore): ignore = [ignore] def _ignore(fullname): return _is_ignored(fullname, pkg_name, ignore) self._scan_module(pkg_name, package, _ignore, categories) # if not a package or recursive is disabled, we are done now if not recursive or not hasattr(package, '__path__'): return results = walk_packages(package.__path__, package.__name__+'.', onerror=onerror, ignore=_ignore) for importer, modname, ispkg in results: loader = importer.find_module(modname) if loader is not None: # happens on pypy with orphaned pyc try: self._scan_submodule(loader, modname, _ignore, categories, onerror) finally: if ( hasattr(loader, 'file') and hasattr(loader.file,'close') ): loader.file.close()