Skip to content

kylelin47/linear-regression

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

33 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

linear-regression

A simple linear regression machine learning program, written in Python 3.4

Dependencies

You should not have to install these manually if you follow the instructions under 'Installation'.

Installation

  1. Get the files within this repository. For example: git clone https://github.com/kylelin47/linear-regression.git
  2. In a command prompt, navigate to where you saved this repository on your computer. You should see the file setup.py.
  3. Install setuptools
  4. Run
> python setup.py install
# do not use 'pip install .', this is just a script
  1. You should now be able to run 'regress.py' directly

Training Sets and Testing Sets

regress.py will use the training set to learn a weight matrix that it will apply to the attributes of each entry in the testing set to predict that entry's category. It will compare its prediction to the actual category and see if its prediction is correct. After all entries are processed, it will report its overall accuracy.

The training set must contain at least one example of every category present in the testing set.

Data Format

Delimiter separated numbers (as specified with the --delimiter flag. defaults to a comma), with category being the first number. For example, with 3 categories, you can represent each category as 1 2 or 3. Each entry should be on a new line.

Category must be an integer. Categories should be in sequence, though they do not have to start from 1.

The following data

1,5.1,3.5,1.4,0.2
2,3.1,1.5,2.4,0.4

represents two entries, the first of category 1 and the second of category 2.

Example data are in the data/iris and data/wine folders.

Entries with a different number of attributes than the maximum in the training dataset are ignored. For example,

Training

1,5.1,3.5,1.4
1,5.1,3.5,1.4,0.2
2,3.1,1.5,2.4

Testing

1,5.7,4.4,1.5
1,5.4,3.9,1.3,0.4
1,5.1,3.5,1.4,0.3,0.7

will be interpreted as

Training

1,5.1,3.5,1.4,0.2

Testing

1,5.4,3.9,1.3,0.4

Example Usage

> regress.py TRAINING_SET TESTING_SET
Results:  correct/total
Accuracy: n%

> regress.py TRAINING_SET TESTING_SET --weight
W =
[[W1]
 [W2]
  .
  .
  .
 [Wm]]
Results:  correct/total
Accuracy: n%

> regress.py TRAINING_SET TESTING_SET --verbose
Category 1: correct/total, %
Category 2: correct/total, %
.
.
.
Category N: correct/total, %
Results:  correct/total
Accuracy: n%

> regress.py TRAINING_SET TESTING_SET --delimiter=" : "
Results:  correct/total
Accuracy: n%

--delimiter=" : " means your data looks like

1 : 0.7 : 0.9 : 2.5

--delimiter=, means your data looks like

1,0.7,0.9,2.5

About

A simple linear regression machine learning program

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages