Skip to content

A collection of tools to make it easier to generate wordlists and collect tripcodes

License

MIT, MIT licenses found

Licenses found

MIT
LICENSE
MIT
LICENSE-IWI
Notifications You must be signed in to change notification settings

crypt3lx2k/Tripcode-Dictionary-Tools

Repository files navigation

Tripcode Dictionary Tools is a collection of software that makes it easy to
collect tripcodes, gather candidates for wordlists and operate on 4chan
boards/pages/threads.

First of all, if the only reason you're reading this is because you want to
crack peoples tripcodes to be disruptive and annoying, fuck off.

If you violate the terms of service of the 4chan API typically with excessive
requests you might get banned.
If you violate the rules of 4chan on 4chan itself you will get banned, this
includes dumping peoples tripcodes.
If you're a douche, fuck off.

Second of all, this isn't a plug and play type of software. You most likely have
to do some work to do what you want to do, that being said this package provides
several programs that are designed to make the job easier for you.

Remember that you can invoke all of these programs with either -h or --help to
see a full list of what they offer individually.

Primarily 6 programs are provided, all of these programs accept input in the
form of 4chan links.

A 'link' can either be a full thread URL like
https://boards.4chan.org/g/thread/36491091
boards.4chan.org/g/thread/36491091
or something simpler like
/g/thread/36491091

For a board you can either write
https://boards.4chan.org/g/
boards.4chan.org/g/
/g/
or
g

If you specify a board as a link the entire board is scraped.

You can also specify specific pages on boards,
https://boards.4chan.org/g/1
boards.4chan.org/g/1
/g/1
or
g/1

If you specify a page as a link every thread on that page is scraped.

crack.py:
This program uses the tripcodes/public.db3 and tripcodes/secure.db3
(optionally others, invoke with --help) tripcode databases and checks them
versus posts, and prints cracked tripcodes to the screen.

The databases does not come with the program, you have to generate them
yourself, there are several programs that are designed to aid you with this
task.

dump_hashes.py:
This program writes every unique tripcode found to a file. You can then use this
file with an external cracking program like John the Ripper (1.8.0+ supports
tripcodes out of the box, but without UTF-8/SJIS support or HTML replacement).

dump_words.py:
This program looks all over for potential words that can be used as tripcodes,
this includes in subjects, filenames and e-mail fields and writes the unique
words found to a file.

The file is not sorted and contains one word per line, the words are potentially
a lot longer than the maximum 8 characters used to create a tripcode.

dump_ngrams.py:
This program looks in comments for ngrams and writes them to a file, the filter
used to generate 'tokens' for this program is a lot stricter than for the
dump_words.py program.

The file is sorted by occurrence and has the format
<space separated list of n words> <number of occurrences>

These 4 programs all potentially use a lot of bandwidth, in accordance with the
4chan API all of them buffer pages and use if-modified where applicable.

So if you have cached a thread by operating on it earlier, you might avoid
downloading the thread again, but a request is still made to 4chans servers.

This is backed by a cache file bin/cache.bin (optionally something else), if
you have downloaded a specific board of 4chan already and want to operate on
that you can invoke all of the above programs with the --offline flag, this
makes them only use the cache and no internet.

To aid this practice,

build_cache.py:
This program just builds the cache and does nothing else, you can afterwards
invoke all of the above programs with --offline. Note that when running a
program in offline mode it will set the number of threads to 1, as multiple
threads in Python only tend to slow things down when major blocking I/O (like
downloading) isn't involved.

So for instance if you want to dump all the hashes on /g/, also dump the words
and a couple of ngrams you can do this.
$ ./build_cache /g/
$ ./dump_hashes --offline tripcodes.txt /g/
$ ./dump_words --offline words.txt /g/
$ ./dump_ngrams --offline bigrams.txt 2 /g/
...

Note that all of the above programs also build the cache, so if you just write
$ ./dump_hashes tripcodes.txt /g/
you will also have a cached version on /g/ on your machine.

prune_cache.py:
This program prunes 404'ed entries from the cache, if you run this sporadically
you'll avoid the cache file growing too big, if you want to build an archive or
something like that you obviously don't want to run this program as old threads
are simply deleted.

You can also run this program in offline mode, take care when using this as it
uses the offline board thread lists to see what threads are unreachable, you
could potentially prune a new thread if you have been working on it without
touching the board thread list.

The end user is also presented with 3 secondary programs,
util/makesql:
This program reads a file with tripcode/solution pairs and makes databases for
use with the ./crack.py program, it optionally accepts a format string in the
form of a regex, invoke with --help for details.

util/johntosql:
This program reads a john.pot file generated by John the Ripper and creates a
database based on that.

util/triptestertosql:
This programs read a file generated by Tripcode-Tester
(https://github.com/crypt3lx2k/Tripcode-Tester) and generates a database based
on that.

A small tripcode list to start off with is located at
http://www.pageoftext.com/PH_plain&nm_page=secure_tripcode_dictionary
it's small for a regular tripcode list but it's the most significant public
collection of secure tripcodes I have found on the internet.

As far as I can tell the list was created by user 'jeb3' on userscripts.org
http://userscripts.org/users/77660
specifically for the script 'Tripcode Breaker'
http://userscripts.org/scripts/review/68857
so all credit goes to 'jeb3' for compiling that list.

I do not host that list so it might go offline whenever and without notice.

If you want to start by using that list you can use util/makesql in the
following manner, we start off by downloading the list
$ wget 'http://www.pageoftext.com/PH_plain&nm_page=secure_tripcode_dictionary' \
-O tripcodes.txt
then we generate a regular and a secure tripcode database based on it
$ util/makesql --regex='\solution!\tripcode!!.{11}' tripcodes.txt \
tripcodes/public.db3
$ util/makesql --regex='\solution!.{10}!!\secure' tripcodes.txt \
tripcodes/secure.db3

If you then do
$ ./crack.py /sp/
you might get some results, but it's not very likely due to the small size of
the database.

You're probably better off creating wordlists and using them with John the
Ripper, you might also want to try it with large public leaks like the rockyou
list.

Last but not least I have to mention that you can of course use tdt from the
Python shell itself as a module. The programs themselves serve as examples how
to do this.

FAQ:
Q:
Do you hate tripcode users?
A:
No, software like this actually helps tripcode users choose better tripcodes as
it makes weak tripcodes very apparent and therefore easy to avoid.

Q:
Some douche leaked my tripcode on 4chan, what do I do now?
A:
First of all you report the post where the leak happened for a rule violation.

Then you recover, you can keep using your tripcode knowing that it is public,
or you can choose a better one, very rarely do people persist in imitating
others for long periods of time.

If you're a high quality contributor e.g. you often contribute in draw threads
or write threads people will know when you're being imitated and when you're
actually posting as your imitators lack the artistic skills to properly imitate
you.

Q:
How do I choose a good tripcode?
A:
The short answer to this is use a secure tripcode. After using it once you then
google the hash and if you get no results on google you're probably in the
clear.

If you want to use a regular tripcode the same rules apply as for passwords with
a few special modifications.

This should be apparent, don't use dictionary words, this includes phrases
commonly posted on 4chan, don't use combinations of dictionary words.

Use all 8 characters, avoid the characters "<>& as they expand to their
corresponding HTML entities and make your tripcode effectively very short.

If you use 8 characters and you choose from alphanumeric characters (A-Za-z0-9)
there are (26+26+10)^8 = 62^8 = 218340105584896 possible combinations, against a
GPU with 130 million tripcodes per second it will take about 19.4 days running
non-stop to exhaust the keyspace.

If you use certain Japanese characters in your tripcode you effectively expand
the number of combinations to a number greater than 2^56, against the same GPU
it will then take 17.5 years to exhaust the keyspace. So while it is possible to
brute force tripcodes it still takes an unreasonable amount of time unless
someone decides to dedicate a lot of resources to it.

The most secure tripcodes are those generated by a program like MTY, as these
are completely random and therefore model very strong passwords. So if you want
to use a regular tripcode you might as well spend a little time generating one
with a specific phrase in it.

About

A collection of tools to make it easier to generate wordlists and collect tripcodes

Resources

License

MIT, MIT licenses found

Licenses found

MIT
LICENSE
MIT
LICENSE-IWI

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published