sbha2431/Abstractions-Learning
List of the files:

digraph.py: Generates a graph object for use in the MDP class. Contains graph methods such as computing strongly connected components, sub-graphs, etc.

mdp.py: Includes functions to construct a non-deterministic finite automaton, sample the next transition in an MDP, and compute the best action in a state.

robot_grid.py: Generates the abstract MDP by discretizing the dynamics and implements Q-learning with different shields. The shield we used in the experiments requires installation of a tool named SLUGS, which you can find here: https://github.com/VerifiableRobotics/slugs. This file also contains a similar shield, pre-generated by SLUGS, so that the code runs without installing the tool.

Also included in this repository are the following files for the Nao:

LocalizationModule.cpp: Includes the changes to the beacon positions for the custom field in Figure 1.

LocalizationModule.h: Header for the above.

Field_Q.py: Locates the robot's position in the world and calls the corresponding action in cfgpolicy_q_learned.py. This is the shielded policy with Q-learning.

cfgpolicy_q_learned.py: The policy, represented as a dictionary of state-space tuples, generated by the shielded Q-learning method.

Field.py: Performs a similar behavior to Field_Q.py, but the policy (cfgpolicy.py) comes from the unshielded Q-learning method.

cfgpolicy.py: The policy, represented as a dictionary of state-space tuples, generated by the unshielded Q-learning method.
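To illustrate the kind of graph method digraph.py provides, here is a minimal sketch of strongly-connected-component computation using Kosaraju's algorithm. The function name and the dict-of-successors graph representation are assumptions for this sketch, not the repository's actual API.

```python
def strongly_connected_components(graph):
    """Kosaraju's algorithm. graph: dict mapping node -> list of successors."""
    order, seen = [], set()

    def dfs(g, v, out):
        # Iterative DFS that appends nodes to `out` in postorder.
        seen.add(v)
        stack = [(v, iter(g.get(v, ())))]
        while stack:
            node, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(g.get(w, ()))))
                    break
            else:
                stack.pop()
                out.append(node)

    # Pass 1: record finishing order on the original graph.
    for v in graph:
        if v not in seen:
            dfs(graph, v, order)

    # Pass 2: peel components off the reversed graph in reverse finish order.
    rev = {v: [] for v in graph}
    for v, succs in graph.items():
        for w in succs:
            rev.setdefault(w, []).append(v)
    seen.clear()
    comps = []
    for v in reversed(order):
        if v not in seen:
            comp = []
            dfs(rev, v, comp)
            comps.append(comp)
    return comps
```

Sub-graph extraction then amounts to restricting the successor dict to a chosen node set, e.g. one component at a time.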
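The core idea in robot_grid.py, Q-learning where a shield restricts the action set before the agent chooses, can be sketched as follows. All names (`step`, `shield`, the hyperparameter defaults) are illustrative assumptions, not the repository's actual interface.

```python
import random
from collections import defaultdict

def shielded_q_learning(states, actions, step, shield, episodes=500,
                        alpha=0.1, gamma=0.95, epsilon=0.1, horizon=50):
    """Tabular Q-learning where the shield prunes unsafe actions.

    step(s, a) -> (next_state, reward); shield(s) -> allowed actions in s.
    These names are hypothetical, used only to illustrate the scheme.
    """
    Q = defaultdict(float)
    for _ in range(episodes):
        s = random.choice(states)
        for _ in range(horizon):
            allowed = shield(s) or actions  # fall back if the shield allows nothing
            if random.random() < epsilon:
                a = random.choice(allowed)          # explore among safe actions
            else:
                a = max(allowed, key=lambda a: Q[(s, a)])  # exploit
            s2, r = step(s, a)
            # Bootstrap only over actions the shield would permit next.
            best_next = max(Q[(s2, a2)] for a2 in (shield(s2) or actions))
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q
```

The point of shielding during learning is that unsafe actions are never even explored, so the safety specification holds throughout training, not just for the final policy.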
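The Nao-side scripts consume the learned policy as a plain dictionary keyed by state-space tuples: Field_Q.py discretizes the robot's position and looks up the corresponding action in cfgpolicy_q_learned.py. A minimal sketch of that pattern, with made-up states, actions, and helper names:

```python
# Hypothetical policy dict keyed by state tuples, in the style of
# cfgpolicy_q_learned.py / cfgpolicy.py (the actual states and actions differ).
policy = {
    (0, 0): "forward",
    (0, 1): "turn_left",
    (1, 0): "turn_right",
    (1, 1): "stop",
}

def discretize(x, y, cell=0.5):
    """Map a continuous field position to a grid cell (illustrative)."""
    return (int(x // cell), int(y // cell))

def act(x, y, default="stop"):
    """Look up the action for the robot's current cell, the way Field_Q.py
    dispatches the behavior associated with the current state."""
    return policy.get(discretize(x, y), default)
```

Field.py follows the same lookup pattern, differing only in which policy dictionary it loads.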