Example #1
File: xor.py  Project: zyx061212/Kaggle
def runExp(gamma=0, epsilon=0.1, xor=False, lr=0.02):
    if xor:
        print("Attempting the XOR task")
    else:
        print("Attempting the AND task")

    # XORTask and LinFA_QAgent are assumed to be defined earlier in this file.
    task = XORTask()
    task.and_task = not xor

    # Q-learning with linear function approximation over the task's senses.
    l = Q_LinFA(task.nactions, task.nsenses)
    l.rewardDiscount = gamma
    l.learningRate = lr

    # Epsilon-greedy agent wrapping the learner.
    agent = LinFA_QAgent(l)
    agent.epsilon = epsilon
    exp = Experiment(task, agent)

    # Print the reward accumulated in each block of 100 interactions,
    # decaying the learning rate after every block.
    sofar = 0
    for i in range(30):
        exp.doInteractions(100)
        print(exp.task.cumreward - sofar, end=' ')
        if i % 10 == 9:
            print()
        sofar = exp.task.cumreward
        l._decayLearningRate()
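
A minimal way to exercise this function is to call runExp directly with different discount and exploration settings. The sketch below is illustrative only; the particular gamma and epsilon values are assumptions, not values taken from the original project.

if __name__ == '__main__':
    # Illustrative settings only: modest exploration on the easier AND task ...
    runExp(gamma=0.9, epsilon=0.1, xor=False)
    # ... and more exploration on the harder XOR task.
    runExp(gamma=0.9, epsilon=0.3, xor=True)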
Example #2
"""Using the agent found in the xor example, rather than in linearfa.py.

"""
from pybrain.rl.learners.valuebased.linearfa import Q_LinFA
from pybrain.rl.experiments import EpisodicExperiment

from environment import Environment
from tasks import LinearFATileCoding3456BalanceTask
from training import LinearFATraining
from agents import LinFA_QAgent

task = LinearFATileCoding3456BalanceTask()
learner = Q_LinFA(task.nactions, task.outdim)
task.discount = learner.rewardDiscount
agent = LinFA_QAgent(learner)
# The state has a huge number of dimensions, and the logging causes me to run
# out of memory. We needn't log, since learning is done online.
agent.logging = False
agent.learning = True
# A separate greedy agent (no exploration, no learning) used only to evaluate
# the value function learned so far.
performance_agent = LinFA_QAgent(learner)
performance_agent.logging = False
performance_agent.greedy = True
performance_agent.epsilon = 0.0
performance_agent.learning = False
experiment = EpisodicExperiment(task, agent)

# TODO PyBrain says that the learning rate needs to decay, but I don't see that
# described in Randlov's paper.
# A higher number here means the learning rate decays more slowly.
learner.learningRateDecay = 100000
# NOTE increasing this number above from the default of 100 is what got the