tychon/mails

This project is NOT in Python 3
This file is NOT in markdown :D

Special prerequisite: lrparsing module:
    https://pypi.python.org/pypi/lrparsing
    http://ace-host.stuart.id.au/russell/files/lrparsing/doc/html/

You can organize your configurations on different systems by creating
branches for every host / user. Then you can pull in changes and propagate
diffs easily by cherry-picking commits.
But be aware that your passwords are in the configs!
You should never commit them :-D (like I did accidentally).
For this reason config_passwd.py is listed in the .gitignore.
Do 'from config_passwd import *' at the end of your config.py
and then put 'couchdb_auth=...' into config_passwd.py
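
A minimal sketch of how the end of config.py could look. The None default
and the try/except guard are illustrative additions and not part of the
project; the format of the couchdb_auth value is not documented here, so it
is left out.

```python
# config.py (sketch; only the last lines matter here)

# Hypothetical placeholder so the name exists even without a password
# file. The real format of this value is not specified in this README.
couchdb_auth = None

# Keep this at the very end, so the untracked file overrides everything
# defined above. The guard is an addition for illustration; the README
# itself only asks for the plain import line.
try:
    from config_passwd import *  # noqa: F401,F403
except ImportError:
    pass

# config_passwd.py (listed in .gitignore, never committed) would then
# hold a single assignment:  couchdb_auth=...
```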

####
TODO

* support adding labels to mail by editing header with Labels: field
* write full text search indexing
* Add GPG signed and encrypted to autolabelling
* Add a license compatible with the lrparsing.py license
* use '#!/usr/bin/env python' as shebang
* nice -h command line args
* Documentation
* Write User guide (mostly for myself :-D)

* Contacts book
  For decoding display names see:
  http://stackoverflow.com/questions/4157899/get-python-getaddresses-to-decode-encoded-word-encoding


##########
Content

* Python Scripts
* Database
* Search Queries
* Labelling
* Python Logging


##############
Python Scripts

For more detailed documentation see the docstring at the top of each script.
Most of the executable scripts are also modules you can include.

## Executable
* upload
  Parse and upload a single mail and its metadata.
* search
  Search for mail hashes by metadata.
* download
  Download mails with given hashes.
* read
  Execute a command for every file in a mbox / Maildir.
* view
  Start mutt for a local mailbox and upload metadata of changed mails when
  mutt closes.
* changes
  Listen to _changes API and execute commands on new mail.
  Or set up continuous replication with --setup-continuous
* delete
  Delete documents with given hashes.
* conflict
  Solve conflicts by selecting one winner based on metadata.
* reprocess
  Reprocess all or a selection of the documents in the database to correct
  labels or the keyword index.
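
To give an idea of what listening to the _changes API involves: with
feed=continuous CouchDB sends one JSON object per line, with empty lines as
heartbeats in between. The sketch below only filters such lines for document
ids. It is not the code of the changes script, and the function name
new_mail_ids is made up for this example.

```python
import json


def new_mail_ids(lines):
    """Yield ids of changed, non-deleted documents from a continuous feed."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # empty lines are heartbeats
        change = json.loads(line)
        if "id" not in change:
            continue  # e.g. the closing {"last_seq": ...} line
        if change.get("deleted"):
            continue  # the document was removed, nothing to process
        if change["id"].startswith("_design/"):
            continue  # design documents are not mails
        yield change["id"]


# Hand-written sample lines in the shape of a continuous feed.
SAMPLE_FEED = [
    '{"seq": 1, "id": "abc123", "changes": [{"rev": "1-a"}]}',
    '',
    '{"seq": 2, "id": "_design/post", "changes": [{"rev": "1-b"}]}',
    '{"seq": 3, "id": "def456", "changes": [{"rev": "2-c"}], "deleted": true}',
    '{"last_seq": 3}',
]
```

list(new_mail_ids(SAMPLE_FEED)) leaves only the first id, which is the point
where a command would be run for the new mail.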

## Modules only
* config
  Configurations for logs, the CouchDB connection and other things.
* common
  Sets up global logging system and offers miscellaneous functions.
* labels
  Parse autolabelling file and add labels to parsed metadata.
* parseaddr
  Parse address lists in From: and To: header fields according to RFC 5322.
* searchqueryparser
  Parse search queries for search.py

## Other Modules

* lrparsing
  Module by Russell Stuart under GPLv2, or any later version.

########
Database

## Document format

doc_id: sha256 hash of whole mail as in 'data' field
type: 'mail'
upload_date: date UTC, when the mail was uploaded in YYYY-MM-DD hh:mm:ss
            `date -u "+%Y-%m-%d %H:%M:%S"` e.g. 2013-12-08 11:12:42
labels: list of unique label names, all lowercase

date: date UTC given in mail, in YYYY-MM-DD hh:mm:ss
from: array of email addresses of senders, all lowercase
to: array of email addresses of receivers, all lowercase
subject: string containing subject

original message as standalone attachment named 'mail'
  with content-type: text/plain
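
A rough sketch of how these fields could be derived from a raw mail with the
standard library only. The function name build_metadata, the sample mail and
the sorting of the labels are assumptions made for this example; the upload
script may do it differently. The mail is taken as text here and encoded as
UTF-8 before hashing, which is also an assumption.

```python
import hashlib
import time
from email import message_from_string
from email.utils import getaddresses, mktime_tz, parsedate_tz

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # YYYY-MM-DD hh:mm:ss


def build_metadata(raw_mail, labels=()):
    """Return (doc_id, metadata) for one raw mail given as text."""
    msg = message_from_string(raw_mail)

    def addresses(header):
        # getaddresses() yields (display name, address spec) pairs.
        pairs = getaddresses(msg.get_all(header, []))
        return [spec.lower() for _name, spec in pairs if spec]

    # mktime_tz() honours the zone offset, so this is a UTC timestamp.
    # Assumes a parseable Date: header is present.
    sent = mktime_tz(parsedate_tz(msg["Date"]))

    doc_id = hashlib.sha256(raw_mail.encode("utf-8")).hexdigest()
    metadata = {
        "type": "mail",
        "upload_date": time.strftime(DATE_FORMAT, time.gmtime()),
        # Unique and lowercase as described; sorted only to be repeatable.
        "labels": sorted(set(label.lower() for label in labels)),
        "date": time.strftime(DATE_FORMAT, time.gmtime(sent)),
        "from": addresses("From"),
        "to": addresses("To"),
        "subject": msg["Subject"],
    }
    return doc_id, metadata


SAMPLE_MAIL = (
    "From: John Doe <John.Doe@Example.com>\n"
    "To: mark.twain@gmx.com\n"
    "Subject: Hello\n"
    "Date: Wed, 30 Jun 2004 02:03:57 +0200\n"
    "\n"
    "Body\n"
)
```

build_metadata(SAMPLE_MAIL, ["Spam", "spam"]) gives a 64 character hex id and
a 'date' converted from +0200 to UTC.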

## Database Views

The design document is saved under /_design-post
Upload it to your DB with curl or in any other way.

  from

Emits one entry for every sender in From: field of a mail.
So you can search with
.../_design/post/_view/from?startkey=%22someone%22&endkey=%22xaver%22
Key: one address spec
Value: doc
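
View keys have to be passed as JSON, which is why the strings in the URL
above are wrapped in %22 (a URL-encoded double quote). A small sketch for
building such a URL; the helper name view_url and the database URL in the
usage line are made up for this example.

```python
import json

try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2


def view_url(db_url, view, startkey=None, endkey=None):
    """Build a query URL for a view of the 'post' design document."""
    params = []
    for name, key in (("startkey", startkey), ("endkey", endkey)):
        if key is not None:
            # Keys are JSON encoded first, so strings keep their quotes.
            encoded = quote(json.dumps(key), safe="")
            params.append("%s=%s" % (name, encoded))
    url = "%s/_design/post/_view/%s" % (db_url.rstrip("/"), view)
    if params:
        url += "?" + "&".join(params)
    return url
```

For example view_url('http://localhost:5984/mails', 'from', 'someone',
'xaver') ends in the same path and query string as the URL shown above.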

  to

Emits one entry for every recipient in To: field of a mail.
Key: one address spec
Value: doc

  labels

Emits one entry for every label listed in doc.labels .
Key: one label string
Value: doc

  labels_list

A unique list of all labels used in all doc.labels .
Functions: Map & Reduce
Key: Null
Value: List of all labels
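
The real view functions live in the design document and run inside CouchDB.
The following is only a Python model of what 'labels' and 'labels_list' are
described to compute, to make the map and reduce steps concrete. The function
names are made up, and the sorting in the reduce step is an addition to keep
the result repeatable.

```python
def map_labels(doc):
    """Model of the 'labels' view: one (key, value) row per label."""
    return [(label, doc) for label in doc.get("labels", [])]


def map_labels_list(doc):
    """Model of the 'labels_list' map step: all labels under key None."""
    return [(None, doc.get("labels", []))]


def reduce_labels_list(rows):
    """Model of the 'labels_list' reduce step: merge into a unique list."""
    unique = set()
    for _key, labels in rows:
        unique.update(labels)
    return sorted(unique)


# Two hand-written documents, reduced to the field that matters here.
SAMPLE_DOCS = [
    {"labels": ["spam", "work"]},
    {"labels": ["spam"]},
]
```

Running every document in SAMPLE_DOCS through map_labels_list and the
collected rows through reduce_labels_list leaves each label exactly once.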

  inbox

All mails labeled "unread".
Functions: Map
Key: date
Value: flags

##############
Search Queries

#TODO write doc for search queries

#########
Labelling

With the configuration
autolabels = 'PATH'
in config.py you can point to a file with rules that add labels to the
metadata while mails are parsed.

The file is made of sections, every section separated by a single newline.
The section header '[spam]' gives the name of the label to be added.
Then the rules follow.
A rule is made up of a keyword FROM, TO, DATE or LABEL followed by a colon and
a regular expression enclosed in slashes. Example:

[spam]
FROM:/update.*@facebook.*/

[spam]
FROM:/info@twitter.com/
DATE:/2014.*/

The first section labels all mails from update.*@facebook.* as spam, the
second one all mails from info@twitter.com dated in the year 2014. (The
second rule is somewhat questionable.)
A label is added only if every rule in a section matches the metadata.
To get an OR combination, create multiple sections for the same label, as
seen above.
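
The AND within a section and the OR between sections can be sketched like
this. It is not the code of labels.py: the names parse_rules and check_rules
are made up, matching with re.match (anchored at the start of the value) is
an assumption, and so is the idea that LABEL rules look at the existing
'labels' list.

```python
import re

HEADER = re.compile(r"^\[(.+)\]$")
RULE = re.compile(r"^(FROM|TO|DATE|LABEL):/(.*)/$")


def parse_rules(text):
    """Return a list of (label, [(field, compiled regex), ...]) sections."""
    sections = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        header = HEADER.match(line)
        if header:
            sections.append((header.group(1), []))
            continue
        rule = RULE.match(line)
        if rule is None or not sections:
            raise ValueError("cannot parse line: %r" % line)
        field = rule.group(1).lower()
        sections[-1][1].append((field, re.compile(rule.group(2))))
    return sections


def _values(metadata, field):
    # 'from' and 'to' are lists, 'date' is a single string.
    key = "labels" if field == "label" else field
    value = metadata.get(key, [])
    return value if isinstance(value, list) else [value]


def check_rules(sections, metadata):
    """Return the labels of all sections whose rules all match."""
    labels = []
    for label, rules in sections:
        matched = all(
            any(regex.match(value) for value in _values(metadata, field))
            for field, regex in rules
        )
        if matched and label not in labels:
            labels.append(label)
    return labels


SAMPLE_RULES = (
    "[spam]\n"
    "FROM:/update.*@facebook.*/\n"
    "\n"
    "[spam]\n"
    "FROM:/info@twitter.com/\n"
    "DATE:/2014.*/\n"
)
```

With SAMPLE_RULES a mail from info@twitter.com only gets the spam label if
its date starts with 2014, while any sender matching the facebook pattern
gets it regardless of the date.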

If you want to check an autolabelling config file for sanity, do:

$ python
>>> import labels
>>> l = labels.Labeller('PATH')
>>> print l.check({'from':['john.doe@example.com'], 'to':['mark.twain@gmx.com'], 'date':'2004-06-30 00:03:57'})

This also lets you check whether your rules work. The result should be an
array containing the matching labels.

##############
Python Logging

By importing the python module common the logging system is set up globally.
A logger 'stderr' is created for output to stderr.
A logger 'none' is created for ignoring any logging made.
Any other logger automatically writes to stderr and the error log defined in
config.errorlog = 'PATH'
The sys.excepthook is set, so that any uncaught exception is written to
stderr and the error log. Then the program exits with code 1.
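
A sketch of such a setup with the standard logging module, to show how the
three kinds of loggers differ. It is not the code of common.py; the function
name setup_logging, the log format and the INFO level are assumptions.

```python
import logging
import sys


def setup_logging(errorlog_path):
    """Configure logging roughly as described above; a sketch only."""
    formatter = logging.Formatter(
        "%(asctime)s %(name)s %(levelname)s %(message)s")

    stderr_handler = logging.StreamHandler(sys.stderr)
    stderr_handler.setFormatter(formatter)
    file_handler = logging.FileHandler(errorlog_path)
    file_handler.setFormatter(formatter)

    # Any other logger propagates to the root logger, which writes to
    # stderr and to the error log.
    root = logging.getLogger()
    root.setLevel(logging.INFO)
    root.addHandler(stderr_handler)
    root.addHandler(file_handler)

    # 'stderr' writes to stderr only, so it must not propagate to root.
    stderr_logger = logging.getLogger("stderr")
    stderr_logger.addHandler(stderr_handler)
    stderr_logger.propagate = False

    # 'none' swallows every record.
    none_logger = logging.getLogger("none")
    none_logger.addHandler(logging.NullHandler())
    none_logger.propagate = False

    def excepthook(exc_type, exc_value, exc_traceback):
        # Goes to stderr and the error log through the root logger.
        # After the hook returns, the interpreter itself exits with
        # status 1 for an uncaught exception in the main thread.
        root.error("uncaught exception",
                   exc_info=(exc_type, exc_value, exc_traceback))

    sys.excepthook = excepthook
```

After setup_logging('PATH') a call like logging.getLogger('upload').error(...)
ends up in both places, while the same call on 'none' goes nowhere.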

About

Store your mails in a CouchDB and view them with mutt. Supports searching by metadata.
