django-parallelized_querysets

Handle large Django QuerySets by spreading their execution across multiple cores while keeping memory usage low.

Installation

pip install django-parallelized_querysets

Usage

`parallelized_queryset(queryset, processes=None, function=None)`

Process the given queryset and return the result as a list.

processes

Number of processes to create. Defaults to the number returned by multiprocessing.cpu_count().

function

Apply a function to each result. No function is applied by default. The first argument is the Process calling it, and the second is the row.
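As a rough sketch of that two-argument signature, the callback can be a named function rather than a lambda. The name `title_length` and the `title` attribute are hypothetical, and the stand-in objects exist only so the snippet runs without Django or a database:

```python
from types import SimpleNamespace


def title_length(process, row):
    """Callback with the (process, row) signature described above.

    `row.title` is a hypothetical field used for illustration only.
    """
    return len(row.title)


# Stand-ins so the callback can be tried without Django or a database.
fake_process = SimpleNamespace(name="worker-1")
fake_row = SimpleNamespace(title="Hello")
result = title_length(fake_process, fake_row)

# With a real queryset the call would look like:
# parallelized_queryset(Article.objects.all(), function=title_length)
```

Writing the callback as a plain function makes it easy to exercise on its own before handing it to the worker processes.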

You can also pass two hooks (functions that the process executes at defined times):

init_hook

Give it a function taking the Process as its argument; it is executed as soon as the Process is created.

end_hook

Give it a function taking the Process as its argument; it is executed right before the Process exits. If it returns a non-None value, that value is appended to the results queue.
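One way the hooks and the per-row function might work together is a per-process counter. This is a sketch under assumptions: it assumes arbitrary attributes can be stored on the Process object, and `simulate_worker` is a toy stand-in for one worker's lifecycle as described above, not the library's actual code. The names `count_row`, `seen`, and `simulate_worker` are hypothetical.

```python
from types import SimpleNamespace


def init_hook(process):
    # Assumption: arbitrary attributes can be stored on the process object.
    process.seen = 0


def count_row(process, row):
    process.seen += 1
    return None  # None means nothing is added to the results for this row


def end_hook(process):
    # A non-None return value is appended to the results.
    return process.seen


def simulate_worker(rows):
    """Toy stand-in for one worker's lifecycle; not the library's code."""
    process = SimpleNamespace()
    init_hook(process)
    results = []
    for row in rows:
        value = count_row(process, row)
        if value is not None:
            results.append(value)
    final = end_hook(process)
    if final is not None:
        results.append(final)
    return results


worker_results = simulate_worker(["a", "b", "c"])

# With the real library, the hooks would presumably be passed by keyword
# (keyword names assumed from the parameter names above):
# parallelized_queryset(qs, function=count_row,
#                       init_hook=init_hook, end_hook=end_hook)
```

In this simulation a single worker contributes one value, its row count, so with several processes the results would hold one count per process.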

Note

Whenever your function returns None, that value is omitted from the resulting list.

Note

The order of the QuerySet is not preserved in the results!
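If ordering matters, one possible workaround is to have the function return a sortable key along with each value and sort the list afterwards. The sketch below simulates out-of-order results with stand-in rows; `keyed_title` and the `pk`/`title` attributes are hypothetical names used for illustration:

```python
from types import SimpleNamespace


def keyed_title(process, row):
    # Return the primary key with the value so order can be rebuilt later.
    return (row.pk, row.title)


# Simulated rows arriving out of order, as workers may finish in any order.
rows = [
    SimpleNamespace(pk=3, title="c"),
    SimpleNamespace(pk=1, title="a"),
    SimpleNamespace(pk=2, title="b"),
]
unordered = [keyed_title(None, row) for row in rows]

# Sorting the (pk, title) tuples restores primary-key order.
ordered_titles = [title for _, title in sorted(unordered)]
```

Sorting happens in the parent process after all results are collected, so it does not benefit from the parallelism.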

Example

Return all the Article objects:

>>> from parallelized_querysets import parallelized_queryset
>>> qs = Article.objects.all()
>>> parallelized_queryset(qs)

Add all Article objects to a Redis index (assuming Article has an append_to_redis method):

>>> from parallelized_querysets import parallelized_queryset
>>> qs = Article.objects.all()
>>> parallelized_queryset(qs, function=lambda p, x: x.append_to_redis())

Do the same but on 6 processes:

>>> from parallelized_querysets import parallelized_queryset
>>> qs = Article.objects.all()
>>> parallelized_queryset(qs, processes=6,
...                       function=lambda p, x: x.append_to_redis())

`parallelized_multiple_querysets(querysets, processes=None, function=None)`

Same as parallelized_queryset but querysets is a list of QuerySets.

Testing

./tests/sample/manage.py test sample

About `Exception AssertionError: AssertionError()`

You may see the following line (multiple times) on the standard error:

Exception AssertionError: AssertionError() in <Finalize object, dead> ignored

This is a bug in Python's garbage collector (running right after a fork), which has been fixed in Python 3.3.0 alpha4.

See http://bugs.python.org/issue14548 for more information on that bug.

License

MIT (see LICENSE).
