DistrComp

A distributed computing library.

PREFACE

This library was implemented when I started learning about socket programming and distributed computing (January 2017). Its functionality is decent, but the implementation is naive and contains a serious security hole; for real work, use MPI instead. This project is now an antique that reminds me of what I went through during my education.

Introduction

This library provides 4 main functions: node_id, n_node, send, recv. Using these 4 functions, one can parallelize a program across multiple machines.
You can send/receive many types of data (list, dict, set, numpy array, ...), as long as they are pickle-able.
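
For instance, assuming a two-node setup, sending a dict that holds a NumPy array looks the same as sending a string. This is a sketch in the style of the example below, not code from the repository:

import numpy as np
import message as ms

ms.setup_connection()

if ms.node_id == 0:
    # Any pickle-able object can be passed to send().
    ms.send(1, {"weights": np.arange(4), "step": 7})
else:
    # Blocks until node 0's message arrives.
    data = ms.recv(0)
    print data["weights"], data["step"]

ms.close_connection()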

Syntax example

"""
Node i sends a string to the node (i+1) % number_of_nodes
and receive its string from node (i-1) % number_of_nodes (ie. a circle).
"""
import message as ms

ms.setup_connection()

myID = ms.node_id
nodes = ms.n_node
nxt = (myID+1)%nodes
pre = (myID-1+nodes)%nodes # avoid negative value

print "My ID is", myID
ms.send(nxt, "Next to %i is %i" % (myID, nxt))

msg = ms.recv(pre)
print msg

ms.close_connection()

Output on my cluster:

--------(Node localhost     returns)--------
My ID is 2
Next to 1 is 2

--------(Node 192.168.0.113 returns)--------
My ID is 3
Next to 2 is 3

--------(Node 192.168.0.179 returns)--------
My ID is 1
Next to 0 is 1

--------(Node 192.168.0.169 returns)--------
My ID is 0
Next to 3 is 0

Total time : 1.523.

How to use

Suppose you have N machines: one of them is the master and the other N-1 are workers. You want to run myprogram.py on those machines.

On each worker machine:

  1. Put distrComp.py and worker.py into a folder.
  2. Run python -B worker.py inside that folder.

On the master machine:

  1. Create a file peers.txt containing the IPs of all machines (see the example after this list).
  2. Place peers.txt, distrComp.py, message.py and master.py into the folder that contains myprogram.py.
  3. Tweak the constants in master.py to fit your purpose. Read the comments; don't be afraid to read the code.
  4. When you're ready, run python -B master.py. Your myprogram.py will be executed on all machines automatically, at the same time.
  5. If you're still confused, read on...
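
For illustration, a peers.txt for the four-node cluster shown in the output above could look like this (assuming one address per line; check master.py for the exact format it expects):

localhost
192.168.0.113
192.168.0.179
192.168.0.169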

How it works

When you run worker.py, the machine listens on a predefined port (default: 6969). When you run master.py, the master connects to all workers whose IPs are specified in peers.txt, or more precisely, in the IPs variable inside master.py.

By tweaking master.py, you specify which files (e.g. source code, header files, ...) you want to send to the workers and which terminal commands (e.g. python XXX.py, or g++ main.cpp; ./a.out, ...) you want to execute simultaneously. The master sends those files and commands (encoded as a binary string) to the workers.
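
As a rough illustration, that configuration amounts to three things: which machines, which files, and which command. The constant names below (other than IPs, mentioned above) are hypothetical, not necessarily the ones used in master.py:

# Hypothetical constant names -- read master.py for the real ones.
IPs = ["localhost", "192.168.0.113", "192.168.0.179", "192.168.0.169"]
FILES = ["myprogram.py", "message.py"]   # files to ship to every worker
COMMAND = "python -B myprogram.py"       # command every machine executes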

Each worker receives the string, decodes it, saves the files into a temporary folder, sets that folder as its current working directory, and spawns subprocesses to execute the commands.

Output from STDOUT of those subprocesses is sent back to the master to be printed out.
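
The worker side therefore boils down to a receive-unpack-execute-report loop. The sketch below illustrates that idea with the standard library only; it is not the code in worker.py, and the real library frames its messages and sends the output back over the connection rather than returning it:

import os
import pickle
import socket
import subprocess
import tempfile

def serve_one_job(port=6969):
    # Wait for a single job from the master (6969 is the default port above).
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("", port))
    server.listen(1)
    conn, _ = server.accept()

    # Read the whole payload: here, a pickled ({filename: contents}, command) pair.
    chunks = []
    while True:
        chunk = conn.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    files, command = pickle.loads(b"".join(chunks))

    # Save the files into a temporary folder and use it as the working directory.
    workdir = tempfile.mkdtemp()
    for name, contents in files.items():
        with open(os.path.join(workdir, name), "wb") as f:
            f.write(contents)

    # Run the command there and capture STDOUT for the master.
    proc = subprocess.Popen(command, shell=True, cwd=workdir,
                            stdout=subprocess.PIPE)
    output, _ = proc.communicate()
    conn.close()
    server.close()
    return output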

What these files do

distrComp.py: required by master.py and worker.py.
master.py and worker.py: the scripts to run on the master and the workers.
message.py: handles connections between machines; implements message.send and message.recv.
