A decentralized web gateway for open academic papers on the Internet Archive
- Names below might not stay consistent as this file is edited and the code is built.
- Please edit this file to match the names in the code as you notice conflicts.
- A lot of this file will be moved into actual code as the skeleton gets built, leaving just summaries here.
- Main README << You are here
- Use Cases
- Classes
- HTTP API
- Extending
- Data for the project - sqlite etc
- Proposal for metadata - first draft - looks like it got deleted :-(
- google doc with IPFS integration comments #TODO: Needs revision to match this.
- google doc with top level overview of Dweb project - best place for links to other resources & docs.
- gateway.dweb.me points at the server, which should be running the "deployed" branch. So for example: curl https://gateway.dweb.me/info
- Gitter chat area
This gateway sits between a decentralized web server running locally (in this case a Go-IPFS server) and the Archive. It will expose a set of services to that server.
The data is stored in a sqlite database that maps DOIs to the hashes of the files we know of, and to the URLs to retrieve them.
Note that the mapping is multivalued, i.e. a DOI represents an academic paper, which may be present in the Archive in various forms and formats (e.g. PDF, Doc; Final; Preprint).
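As a rough illustration of that multivalued mapping, here is a minimal sketch using Python's built-in sqlite3 module. The table name `doi_files`, its columns, the placeholder hashes and the example.org URLs are all assumptions made up for this example; they are not the project's actual schema or data.

```python
import sqlite3

# Hypothetical schema: the table and column names are assumptions for
# illustration only, not the gateway's real database layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE doi_files ("
    "doi TEXT NOT NULL, "     # the paper's DOI
    "hash TEXT NOT NULL, "    # hash of one known file for that paper
    "url TEXT NOT NULL, "     # where that file can be retrieved
    "format TEXT)"            # e.g. "pdf" or "doc"
)

# One DOI can have several rows: one per known form or format of the paper.
# The hashes and URLs below are placeholders, not real data.
rows = [
    ("10.1001/jama.2009.1064", "hash-of-final", "https://example.org/final.pdf", "pdf"),
    ("10.1001/jama.2009.1064", "hash-of-preprint", "https://example.org/preprint.pdf", "pdf"),
]
conn.executemany("INSERT INTO doi_files VALUES (?, ?, ?, ?)", rows)


def files_for_doi(conn, doi):
    """Return every (hash, url, format) row known for a DOI."""
    cur = conn.execute(
        "SELECT hash, url, format FROM doi_files WHERE doi = ? ORDER BY hash",
        (doi,),
    )
    return cur.fetchall()


matches = files_for_doi(conn, "10.1001/jama.2009.1064")
```

A lookup returns a list rather than a single row, so callers have to be ready to pick among several files for the same paper.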
See [Information flow diagram](./Academic%20Docs%20IPFS%20gateway.pdf)
Those services will be built from a set of microservices which may or may not be exposed.
All calls to the gateway will come through a server that routes to individual services.
Server URLs have a consistent form: /outputformat/namespace/namespace-dependent-string
Where:
- outputformat: the format wanted (extensible), e.g. IPLD or nameresolution
- namespace: an extensible descriptor for namespaces, e.g. "doi"
- namespace-dependent-string: a string that may contain additional "/" characters, depending on the namespace.
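The URL form above could be split into its three parts roughly as follows. This is only a sketch: `GatewayPath` and `parse_gateway_path` are hypothetical names, and the example path combining "content" with "doi" is an assumption for illustration, not necessarily a route the gateway serves.

```python
from typing import NamedTuple


class GatewayPath(NamedTuple):
    outputformat: str
    namespace: str
    name: str  # the namespace-dependent string; may itself contain "/"


def parse_gateway_path(path: str) -> GatewayPath:
    """Split "/outputformat/namespace/rest" into its three parts."""
    # maxsplit=2 keeps any further "/" inside the namespace-dependent string.
    parts = path.lstrip("/").split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected /outputformat/namespace/string, got {path!r}")
    return GatewayPath(*parts)


parsed = parse_gateway_path("/content/doi/10.1001/jama.2009.1064")
```

Splitting at most twice is what lets a DOI such as 10.1001/jama.2009.1064 keep its own "/" intact.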
This is implemented as a pair of steps:
- First, the name is passed to a class representing the namespace, which produces an object.
- Then that object is passed to a class for the outputformat that can interpret it, and a "content" method is called to output something for the client.
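The two steps might look something like the sketch below. The class names `DOI` and `NameResolution`, the two registries and the `handle` function are hypothetical and exist only for this example; see HTTPServer and Classes for how the real code is organised.

```python
class DOI:
    """Hypothetical namespace class: turns a name string into an object."""

    def __init__(self, name: str):
        self.name = name


class NameResolution:
    """Hypothetical outputformat class: interprets a namespace object."""

    def __init__(self, obj):
        self.obj = obj

    def content(self) -> dict:
        # Produce something the server could send back to the client.
        return {"namespace": type(self.obj).__name__.lower(), "name": self.obj.name}


# Registries keyed by the URL components (assumed for this sketch).
# Supporting a new namespace or output format means adding one entry.
NAMESPACES = {"doi": DOI}
OUTPUTFORMATS = {"nameresolution": NameResolution}


def handle(outputformat: str, namespace: str, name: str) -> dict:
    obj = NAMESPACES[namespace](name)             # step 1: namespace class
    formatter = OUTPUTFORMATS[outputformat](obj)  # step 2: outputformat class
    return formatter.content()                    # output for the client


result = handle("nameresolution", "doi", "10.1001/jama.2009.1064")
```

Keeping the namespace and outputformat lookups separate means any output format can, in principle, be paired with any namespace whose objects it knows how to interpret.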
See HTTPServer for how this is processed in an extensible form.
See Use Cases and Classes for expansion of this.
See HTTP API for the API exposed by the URLs.
This should work; someone please confirm on a clean(er) machine and remove this comment.
You'll also need Redis, Supervisor and IPFS.
On a Mac:
brew install redis
brew services start redis
brew install supervisor
<< need install info for go-ipfs running on port 5001>>
ipfs config show #To view ipfs port settings
On Linux, see the Supervisor install details: https://pastebin.com/ctEKvcZt http://supervisord.org/installing.html
git clone http://github.com/ArchiveLabs/dweb_gateway.git
cd dweb_gateway
scripts/install.sh # Should install, get data and run
# Then try
curl http://localhost:4244/info
curl "http://localhost:4244/doi/10.1001/jama.2009.1064?verbose=True"
curl "https://gateway.dweb.me/content/contenthash/5dr1gqVNt1mPzCL2tMRSMnJpWsJ5Qs?verbose=True"