Welcome to the Look-A-Like Python Package

Author: Edward Turner

Introduction

Generally, we want to predict various characteristics, perhaps simultaneously, while ensuring that samples in the testing dataset that "look like" samples in the training dataset receive similar predicted values. Many well-documented predictive methods exist today. However, few of them can ensure that samples from the testing dataset with features similar to those in the training dataset receive similar predicted values.

This Python package implements a methodology that measures how important each feature is for predicting the chosen value, scales the features according to that importance, and then runs a nearest-neighbors algorithm to generate the matches.

A fuller description of the methodology is found under the Methodology section.

Installation

One way to install the Python package, whether in a virtual environment or on your local machine, is to git clone the repo, change into the python-package directory, and run python setup.py install.

Methodology

As mentioned in the introduction, we derive values based on the predictive power of each feature and scale the features by those values. To do that, we fit a Light Gradient Boosting Machine (LGBM) to the training dataset. We tune the LGBM's hyperparameters with Bayesian optimization on a train/validation split of the original training dataset. Once tuned, we refit on the entire training dataset, which yields an importance value for each feature. We then rescale the feature importances so that they are nonzero and sum to one. This is the very first step.
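The rescaling step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `normalize_importances` and the small floor `eps` are hypothetical and not taken from the package, and the text above does not say how the package keeps the weights nonzero. In practice the raw values would come from the fitted LGBM, for example the `feature_importances_` attribute of LightGBM's scikit-learn wrapper.

```python
def normalize_importances(raw_importances, eps=1e-6):
    """Turn raw feature importances into strictly positive weights that sum to 1.

    `eps` is a hypothetical floor (an assumption of this sketch) so that
    features the model never used still receive a small nonzero weight.
    """
    if not raw_importances:
        raise ValueError("need at least one feature importance")
    if any(v < 0 for v in raw_importances):
        raise ValueError("importances must be non-negative")
    shifted = [v + eps for v in raw_importances]
    total = sum(shifted)
    return [v / total for v in shifted]


# Example: raw importances for four features, one of which was never used.
weights = normalize_importances([120, 30, 0, 50])
```

The resulting weights keep the ordering of the raw importances, so the most predictive feature has the largest influence on the distances computed later.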

Once we have the feature importances, we standardize the features and then scale each one by its importance. Several distance measures are available for the matching algorithm, along with different ways to find the closest neighbors. For the distance calculation, we have the p-norm measure, the Mahalanobis measure, and the cosine measure. For the nearest-neighbors algorithm, we have the k-nearest-neighbors algorithm and the Hungarian-matching algorithm. Three distance measures combined with two neighbor algorithms give a total of 6 types of matching algorithms.
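The matching step can be illustrated with a minimal, dependency-free sketch. It is not the package's implementation: all function names here are hypothetical, the Mahalanobis measure is omitted for brevity, and the one-to-one matching uses a brute-force search over permutations as a stand-in for the Hungarian algorithm (it finds the same minimum-cost assignment, but is only practical for tiny inputs).

```python
import math
from itertools import permutations


def fit_scaler(train_rows):
    """Column means and (population) standard deviations of the training rows."""
    n = len(train_rows)
    cols = list(zip(*train_rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / n) for c, m in zip(cols, means)]
    return means, stds


def scale(rows, means, stds, weights):
    """Standardize each feature, then multiply it by its importance weight."""
    return [
        [w * (x - m) / s if s > 0 else 0.0 for x, m, s, w in zip(row, means, stds, weights)]
        for row in rows
    ]


def p_norm(a, b, p=2):
    """Minkowski distance of order p between two vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def cosine_distance(a, b):
    """One minus the cosine similarity; zero vectors are treated as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 1.0
    return 1.0 - dot / (na * nb)


def k_nearest(test_rows, train_rows, k, dist):
    """For each test row, the indices of its k closest training rows."""
    return [
        sorted(range(len(train_rows)), key=lambda j: dist(t, train_rows[j]))[:k]
        for t in test_rows
    ]


def one_to_one(test_rows, train_rows, dist):
    """Minimum-cost assignment of distinct training rows to test rows (brute force)."""
    best = min(
        permutations(range(len(train_rows)), len(test_rows)),
        key=lambda perm: sum(dist(t, train_rows[j]) for t, j in zip(test_rows, perm)),
    )
    return list(best)


train = [[0.0, 0.0], [10.0, 1.0], [0.0, 5.0]]
test = [[9.0, 0.0], [8.0, 1.0]]
weights = [0.9, 0.1]  # hypothetical importances, already summing to one

means, stds = fit_scaler(train)
train_s = scale(train, means, stds, weights)
test_s = scale(test, means, stds, weights)

knn_matches = k_nearest(test_s, train_s, k=1, dist=p_norm)
assigned = one_to_one(test_s, train_s, dist=p_norm)
```

In this toy data both test rows share the same nearest training row, so the k-nearest-neighbors matches repeat it, while the one-to-one assignment is forced to give each test row a different training row. That difference is the main design choice between the two neighbor algorithms.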

Tutorial

To use this model, follow this short example:

from lal import LALGBRegressor

# To use linear sum assignment for the matches, pass "linear_sum" as k;
# to use the cosine measure, pass "cosine" as p.
model_params = {
    "k": "linear_sum",
    "p": "cosine",
}

model = LALGBRegressor(**model_params)

# train_data, train_labels, and test_data are your own features and labels.
model.fit(train_data, train_labels)

test_labels = model.predict(train_data, train_labels, test_data)

As a note, we suggest handling all missing values before using the model.
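One simple way to handle missing values is median imputation. The sketch below is only one possible approach and is not part of the package; the helper name `impute_median` is hypothetical, and it assumes the data is a list of numeric rows where missing entries are `None` or NaN.

```python
import math
from statistics import median


def impute_median(rows):
    """Replace None/NaN entries with the median of the observed values in that column."""

    def missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))

    fills = []
    for col in zip(*rows):
        observed = [v for v in col if not missing(v)]
        if not observed:
            raise ValueError("column has no observed values")
        fills.append(median(observed))
    return [[f if missing(v) else v for v, f in zip(row, fills)] for row in rows]


raw_rows = [[1.0, None], [3.0, 4.0], [float("nan"), 8.0]]
clean_rows = impute_median(raw_rows)
```

When imputing a test set, it is usually safer to reuse the fill values computed on the training set so that both sets are treated consistently.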

Documentation

For code documentation, please go here

Or have a look at the code repository.

License

This work is dual-licensed under Apache 2.0 and GPL 2.0 (or any later version). You may choose either license when using this work.

SPDX-License-Identifier: Apache-2.0 OR GPL-2.0-or-later
