andreipak/python-webpage-inliner

A webpage inliner in Python

This script checks which external resources (CSS, JavaScript, images) a webpage references, downloads them, and replaces each reference with the resource's content. Images are replaced by data: URIs.
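
As a rough illustration of the image step, the sketch below turns raw bytes into a base64 data: URI. It is not taken from this project's source; the function name `to_data_uri` and the fallback MIME type are assumptions made for the example.

```python
import base64
import mimetypes


def to_data_uri(data: bytes, filename: str) -> str:
    """Encode raw bytes as a base64 data: URI (hypothetical helper, not the project's API)."""
    # Guess the MIME type from the file extension, e.g. "smile.png" -> "image/png".
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        # Assumed fallback for unknown extensions.
        mime = "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

The returned string can be used directly as an `src` attribute value or inside a CSS `url(...)`.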

Requirements

Example

Let's say you have a webpage, index.html, like this:

<html>
    <head>
        <link rel="stylesheet" type="text/css" href="style.css">
    </head>
    <body>
        <script type="text/javascript" src="script.js"></script>
        <h1>Hello world! <img src="smile.png"></h1>
    </body>
</html>

Your style.css can be something like:

body { background-image: url(sun.png); }
h1 { font-size: 12px; }

and your script.js:

alert("Welcome!");

Download the page and pass it through the inliner like this:

python-webpage-inliner -u index.html -o index-inlined.html

and you'll get the following result:

<html>
    <head>
        <style>body { background-image: url(data:image/png;base64,...); }
        h1 { font-size:12px; }
        </style>
    </head>
    <body>
        <script>alert("Welcome!");
        </script>
        <h1>Hello world! <img src="data:image/png;base64,..." /></h1>
    </body>
</html>

Now you can stash it somewhere and do neat stuff with it.
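
The substitution of stylesheet links and external scripts shown above can be sketched as follows. This is a simplified, regex-based illustration, not the script's actual implementation. The names `inline_text_resources` and `fetch` are hypothetical, and a real inliner would more likely use an HTML parser and resolve relative URLs.

```python
import re

# Simplified patterns: they assume double-quoted attributes and well-formed tags.
LINK_RE = re.compile(r'<link\b[^>]*\bhref="([^"]+)"[^>]*>', re.IGNORECASE)
SCRIPT_RE = re.compile(
    r'<script\b[^>]*\bsrc="([^"]+)"[^>]*>\s*</script>', re.IGNORECASE
)


def inline_text_resources(html: str, fetch) -> str:
    """Replace stylesheet links and external scripts with inline content.

    `fetch` is a hypothetical callable that maps a reference such as
    "style.css" to the text of that resource.
    """

    def replace_link(match):
        tag = match.group(0)
        # Leave non-stylesheet links (icons, feeds, ...) untouched.
        if "stylesheet" not in tag.lower():
            return tag
        return "<style>" + fetch(match.group(1)) + "</style>"

    def replace_script(match):
        return "<script>" + fetch(match.group(1)) + "</script>"

    html = LINK_RE.sub(replace_link, html)
    html = SCRIPT_RE.sub(replace_script, html)
    return html
```

For a quick check without network access, `fetch` can be the lookup method of a dictionary, for example `{"style.css": "...", "script.js": "..."}.__getitem__`.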

Tests

You can run the tests the usual way:

python setup.py test

This installs nose and mock for the test run.

Note: The test suite is new and does not cover everything yet. Please submit more tests if you find an edge case!
