This tool combines several open source tools to provide insight into accessibility and performance metrics for a list of URLs. Its main parts are:
- This application requires a CSV with a single column header labeled "Address" and one URL per line (other comma-delimited data is ignored).
- A crawl can also be executed (currently using a licenced version of the ScreamingFrog SEO Spider CLI tools: https://www.screamingfrog.co.uk/seo-spider/)
- Runs Deque Axe for all URLs and produces both a detailed and a summary report (including updating the associated Google Sheet). See: https://pypi.org/project/axe-selenium-python/
- Runs the Lighthouse CLI for all URLs and produces both a detailed and a summary report (including updating the associated Google Sheet). See: https://github.com/GoogleChrome/lighthouse
- Runs a PDF audit for all PDF URLs and produces both a detailed and a summary report (including updating the associated Google Sheet) - more on this later...
NOTE: At the moment, no database is used because the initial requirement was for CSV data only. At this point, a database would make more sense, along with adding an "Export to CSV" function, etc.
As mentioned, simply provide a CSV with a list of URLs (column header = "Address") and select the tests to run through the web form.
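The expected input can be parsed with Python's standard `csv` module. A minimal sketch (the function name `load_addresses` is illustrative, not part of the app's API):

```python
import csv
import io

def load_addresses(csv_text):
    """Return the URLs from the "Address" column, ignoring other columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["Address"].strip() for row in reader if row.get("Address")]

sample = (
    "Address,Status Code,Content Type\n"
    "https://example.com/,200,text/html\n"
    "https://example.com/report.pdf,200,application/pdf\n"
)
print(load_addresses(sample))
# ['https://example.com/', 'https://example.com/report.pdf']
```

Extra columns (e.g. from a ScreamingFrog export) are simply ignored, matching the note above.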
Once installed, run python app.py
To get all tests running, the following is required:
sudo apt update
sudo apt install git
sudo apt-get install python3-pip
sudo apt-get install python3-venv
sudo apt-get update
sudo apt-get install software-properties-common
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get install python3.6
git clone https://github.com/soliagha-oc/perception.git
sudo python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python3 app.py
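Since the audits shell out to several external CLI tools, a quick preflight check before running `app.py` can save debugging time. This is a sketch of ours, not part of the repository; the tool names are the ones this README installs:

```python
import shutil

# CLI tools the audits rely on (per the install steps above)
REQUIRED_TOOLS = ["chromedriver", "geckodriver", "lighthouse", "pdftotext"]

def missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of tools that cannot be found on PATH."""
    return [t for t in tools if shutil.which(t) is None]

if __name__ == "__main__":
    gaps = missing_tools()
    if gaps:
        print("Missing from PATH:", ", ".join(gaps))
    else:
        print("All required CLI tools found.")
```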
Install the following CLI tools for your operating system:
Download and install the matching/required chromedriver
https://chromedriver.chromium.org/downloads
Download the latest version from the official website and unzip it (here, for instance, version 2.29 to ~/Downloads)
wget https://chromedriver.storage.googleapis.com/2.29/chromedriver_linux64.zip
Move to /usr/local/share (or any folder) and make it executable
sudo mv -f ~/Downloads/chromedriver /usr/local/share/
sudo chmod +x /usr/local/share/chromedriver
Create symbolic links
sudo ln -s /usr/local/share/chromedriver /usr/local/bin/chromedriver
sudo ln -s /usr/local/share/chromedriver /usr/bin/chromedriver
OR add the directory containing chromedriver to your PATH, either by exporting it in the current shell or by adding the line to your ~/.bashrc:
export PATH=$PATH:/path-to-extracted-file/
- Go to the geckodriver releases page (https://github.com/mozilla/geckodriver/releases), find the latest version of the driver for your platform, and download it. For example:
wget https://github.com/mozilla/geckodriver/releases/download/v0.24.0/geckodriver-v0.24.0-linux64.tar.gz
- Extract the file with:
tar -xvzf geckodriver*
- Make it executable:
chmod +x geckodriver
-
Add the driver to your PATH so other tools can find it:
export PATH=$PATH:/path-to-extracted-file/
OR
add to
.bashrc
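As an alternative to editing ~/.bashrc, the driver location can also be added to PATH for the current process when launching things from Python. This is our illustration, not part of the app:

```python
import os

def add_to_path(directory):
    """Prepend a directory to PATH for this process and its children."""
    os.environ["PATH"] = directory + os.pathsep + os.environ.get("PATH", "")

# e.g. the folder where chromedriver/geckodriver were copied above
add_to_path("/usr/local/share")
print(os.environ["PATH"].split(os.pathsep)[0])
# /usr/local/share
```

Because child processes inherit the environment, Selenium and the CLI tools launched afterwards can find the drivers.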
Install node
curl -sL https://deb.nodesource.com/setup_14.x | sudo -E bash -
sudo apt-get install -y nodejs
Update npm to the latest version (prefix with sudo if needed):
npm install npm@latest -g
Install Lighthouse (prefix with sudo if needed):
npm install -g lighthouse
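The Lighthouse CLI can emit its report as JSON, which is convenient for building the summary report. A hedged sketch: the `--output`, `--output-path`, `--quiet`, and `--chrome-flags` flags are real Lighthouse CLI options, but the helper names and the score extraction (which assumes the modern `categories` JSON layout) are ours:

```python
import json
import subprocess

def lighthouse_cmd(url):
    """Build a Lighthouse CLI invocation that writes a JSON report to stdout."""
    return [
        "lighthouse", url,
        "--output=json", "--output-path=stdout",
        "--quiet", "--chrome-flags=--headless",
    ]

def category_scores(report_json):
    """Extract 0-100 category scores from a Lighthouse JSON report."""
    report = json.loads(report_json)
    return {name: round(cat["score"] * 100)
            for name, cat in report["categories"].items()
            if cat.get("score") is not None}

def audit(url):
    """Run Lighthouse for one URL and return its category scores."""
    out = subprocess.run(lighthouse_cmd(url),
                         capture_output=True, text=True, check=True)
    return category_scores(out.stdout)
```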
Download the Xpdf command line tools from https://www.xpdfreader.com/download.html
To install this binary package:
- Copy the executables (pdfimages, xpdf, pdftotext, etc.) to /usr/local/bin.
- Copy the man pages (*.1 and *.5) to /usr/local/man/man1 and /usr/local/man/man5.
- Copy the sample-xpdfrc file to /usr/local/etc/xpdfrc. You'll probably want to edit its contents (as distributed, everything is commented out) -- see xpdfrc(5) for details.
See: https://www.screamingfrog.co.uk/seo-spider/user-guide/general/#commandlineoptions
ScreamingFrog SEO CLI tools provide the following data sets:
- crawl_overview.csv (used to create report DASHBOARD)
- external_all.csv
- external_html.csv (used to audit external URLs)
- external_pdf.csv (used to audit external PDFs)
- h1_all.csv
- images_missing_alt_text.csv
- internal_all.csv
- internal_flash.csv
- internal_html.csv (used to audit internal URLs)
- internal_other.csv
- internal_pdf.csv (used to audit internal PDFs)
- internal_unknown.csv
- page_titles_all.csv
- page_titles_duplicate.csv
- page_titles_missing.csv
Note: There are spider config files located in the /conf folder. You will require a licence to alter the configurations.
Note: If a licence is not available, simply provide a CSV where at least one column has the header "Address". See RCMP example.
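Since the crawl exports mix content types, the rows need to be routed to the HTML audits (Axe, Lighthouse) or the PDF audit. A sketch of that split; the "Address" and "Content Type" headers match typical ScreamingFrog exports, though exact headers can vary by version:

```python
import csv
import io

def split_by_type(csv_text):
    """Split crawl rows into HTML and PDF URL lists for the two audit paths."""
    html_urls, pdf_urls = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        url = row.get("Address", "").strip()
        ctype = row.get("Content Type", "").lower()
        if not url:
            continue
        if "pdf" in ctype or url.lower().endswith(".pdf"):
            pdf_urls.append(url)
        elif "html" in ctype:
            html_urls.append(url)
    return html_urls, pdf_urls

sample = (
    "Address,Content Type\n"
    "https://example.com/,text/html; charset=UTF-8\n"
    "https://example.com/guide.pdf,application/pdf\n"
)
```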
Installed via pip install -r requirements.txt
See: https://pypi.org/project/axe-selenium-python/ and https://github.com/dequelabs/axe-core
See: https://github.com/GoogleChrome/lighthouse
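The Axe results returned by axe-selenium-python follow the axe-core results JSON (a "violations" list whose entries carry an "impact" level and the affected "nodes"). A sketch of turning that into summary counts; the function name is ours:

```python
from collections import Counter

def summarize_violations(axe_results):
    """Count affected nodes per impact level in an axe-core results dict."""
    counts = Counter()
    for violation in axe_results.get("violations", []):
        counts[violation.get("impact", "unknown")] += len(violation.get("nodes", []))
    return dict(counts)

# Shape follows the axe-core results JSON (violations -> impact/nodes):
sample = {"violations": [
    {"id": "image-alt", "impact": "critical", "nodes": [{}, {}]},
    {"id": "label", "impact": "serious", "nodes": [{}]},
]}
print(summarize_violations(sample))
# {'critical': 2, 'serious': 1}
```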
While there is a /reports/ dashboard, the system can also write to Google Sheets. To do this, set up credentials for Google API authentication at https://console.developers.google.com/apis/credentials to obtain a valid "credentials.json" file.
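With a service-account credentials.json in place, a library such as gspread is one way to push the summary rows (gspread is our suggestion here, not necessarily what the project uses; the sheet name and row layout are illustrative):

```python
def to_sheet_rows(results):
    """Flatten {url: {metric: value}} into header + data rows for a sheet."""
    header = ["URL", "Accessibility", "Performance"]
    rows = [[url, scores.get("accessibility", ""), scores.get("performance", "")]
            for url, scores in sorted(results.items())]
    return [header] + rows

def push_to_sheet(results, sheet_name="Perception Report"):
    """Append the summary to the first worksheet of the named spreadsheet."""
    # Requires a valid credentials.json (see the Google API console link above).
    import gspread  # third-party; pip install gspread
    client = gspread.service_account(filename="credentials.json")
    worksheet = client.open(sheet_name).sheet1
    worksheet.append_rows(to_sheet_rows(results))
```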
To facilitate branding and other report metrics, a "non-coder/sheet formula template" is used. Here is a sample template:
When crawling and scanning sites, it is possible to encounter various security risks. Please be sure to have a virus scanner enabled to protect against JavaScript and other attacks, or disable JavaScript in the configuration.