Skip to content

mikpim01/pyexcel

 
 

Repository files navigation

pyexcel - Let you focus on data, instead of file formats

image

image

image

image

image

image

Support the project

If your company has embedded pyexcel and its components into a revenue generating product, please support me on patreon or bounty source to maintain the project and develop it further.

If you are an individual, you are welcome to support me too and for however long you feel like. As my backer, you will receive early access to pyexcel related contents.

And your issues will get prioritized if you would like to become my patreon as pyexcel pro user.

With your financial support, I will be able to invest a little bit more time in coding, documentation and writing interesting posts.

Known constraints

Fonts, colors and charts are not supported.

Introduction

Feature Highlights

  1. One application programming interface(API) to handle multiple data sources:
    • physical file
    • memory file
    • SQLAlchemy table
    • Django Model
    • Python data structures: dictionary, records and array
  2. One API to read and write data in various excel file formats.
  3. For large data sets, data streaming are supported. A genenerator can be returned to you. Checkout iget_records, iget_array, isave_as and isave_book_as.

Installation

You can install pyexcel via pip:

$ pip install pyexcel

or clone it and install it:

$ git clone https://github.com/pyexcel/pyexcel.git
$ cd pyexcel
$ python setup.py install

Usage

Please note that you will have to use '.sortable.html' in order to replicate the example.

image

>>> # pip install pyexcel-text==0.2.7.1
>>> import pyexcel as p
>>> ccs_insight2 = p.Sheet()
>>> ccs_insight2.name = "Worldwide Mobile Phone Shipments (Billions), 2017-2021"
>>> ccs_insight2.ndjson = """
... {"year": ["2017", "2018", "2019", "2020", "2021"]}
... {"smart phones": [1.53, 1.64, 1.74, 1.82, 1.90]}
... {"feature phones": [0.46, 0.38, 0.30, 0.23, 0.17]}
... """.strip()
>>> ccs_insight2
pyexcel sheet:
+----------------+------+------+------+------+------+
| year           | 2017 | 2018 | 2019 | 2020 | 2021 |
+----------------+------+------+------+------+------+
| smart phones   | 1.53 | 1.64 | 1.74 | 1.82 | 1.9  |
+----------------+------+------+------+------+------+
| feature phones | 0.46 | 0.38 | 0.3  | 0.23 | 0.17 |
+----------------+------+------+------+------+------+

Suppose you have the following data in a dictionary:

Name Age
Adam 28
Beatrice 29
Ceri 30
Dean 26

you can easily save it into an excel file using the following code:

>>> import pyexcel
>>> # make sure you had pyexcel-xls installed
>>> a_list_of_dictionaries = [
...     {
...         "Name": 'Adam',
...         "Age": 28
...     },
...     {
...         "Name": 'Beatrice',
...         "Age": 29
...     },
...     {
...         "Name": 'Ceri',
...         "Age": 30
...     },
...     {
...         "Name": 'Dean',
...         "Age": 26
...     }
... ]
>>> pyexcel.save_as(records=a_list_of_dictionaries, dest_file_name="your_file.xls")

And here's how to obtain the records:

>>> import pyexcel as p
>>> records = p.iget_records(file_name="your_file.xls")
>>> for record in records:
...     print("%s is aged at %d" % (record['Name'], record['Age']))
Adam is aged at 28
Beatrice is aged at 29
Ceri is aged at 30
Dean is aged at 26
>>> p.free_resources()

Advanced usage 🔥

If you are dealing with big data, please consider these usages:

>>> def increase_everyones_age(generator): ... for row in generator: ... row['Age'] += 1 ... yield row >>> def duplicate_each_record(generator): ... for row in generator: ... yield row ... yield row >>> records = p.iget_records(file_name="your_file.xls") >>> io=p.isave_as(records=duplicate_each_record(increase_everyones_age(records)), ... dest_file_type='csv', dest_lineterminator='n') >>> print(io.getvalue()) Age,Name 29,Adam 29,Adam 30,Beatrice 30,Beatrice 31,Ceri 31,Ceri 27,Dean 27,Dean <BLANKLINE>

Two advantages of above method:

  1. Add as many wrapping functions as you want.
  2. Constant memory consumption

Available Plugins

A list of file formats supported by external plugins
Package name Supported file formats Dependencies Python versions

pyexcel-io

csv, csvz1, tsv, tsvz2

2.6, 2.7, 3.3, 3.4, 3.5, 3.6 pypy

pyexcel-xls

xls, xlsx(read only), xlsm(read only)

xlrd, xlwt

same as above

pyexcel-xlsx xlsx openpyxl same as above

pyexcel-ods3

ods

pyexcel-ezodf, lxml

2.6, 2.7, 3.3, 3.4 3.5, 3.6

pyexcel-ods ods odfpy same as above
Dedicated file reader and writers
Package name Supported file formats Dependencies Python versions
pyexcel-xlsxw xlsx(write only) XlsxWriter Python 2 and 3
pyexcel-xlsxr xlsx(read only) lxml same as above
pyexcel-odsr read only for ods, fods lxml same as above
pyexcel-htmlr html(read only) lxml,html5lib same as above
Other data renderers
Package name Supported file formats Dependencies Python versions

pyexcel-text

write only:rst, mediawiki, html, latex, grid, pipe, orgtbl, plain simple read only: ndjson r/w: json

tabulate

2.6, 2.7, 3.3, 3.4 3.5, 3.6, pypy

pyexcel-handsontable handsontable in html handsontable same as above

pyexcel-pygal

svg chart

pygal

2.7, 3.3, 3.4, 3.5 3.6, pypy

pyexcel-sortable sortable table in html csvtotable same as above

pyexcel-gantt

gantt chart in html

frappe-gantt

except pypy, same as above

In order to manage the list of plugins installed, you need to use pip to add or remove a plugin. When you use virtualenv, you can have different plugins per virtual environment. In the situation where you have multiple plugins that does the same thing in your environment, you need to tell pyexcel which plugin to use per function call. For example, pyexcel-ods and pyexcel-odsr, and you want to get_array to use pyexcel-odsr. You need to append get_array(..., library='pyexcel-odsr').

Footnotes

Acknowledgement

All great work have been done by odf, ezodf, xlrd, xlwt, tabulate and other individual developers. This library unites only the data access code.

>>> import os >>> os.unlink("your_file.xls")

License

New BSD License


  1. zipped csv file

  2. zipped tsv file

About

Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Python 99.6%
  • Other 0.4%