Example #1
File: fqf.py Project: winston-ds/rljax
 def _loss_cum_p(self, params_cum_p, params, state, action):
     # Embed the state; stop_gradient keeps the fraction loss from updating the feature network.
     feature = jax.lax.stop_gradient(self.net["feature"].apply(params["feature"], state))
     # Proposed fractions tau (with the 0 and 1 endpoints) and their midpoints tau_hat.
     cum_p, cum_p_prime = self.cum_p_net.apply(params_cum_p, feature)
     # Quantile values F^{-1} at the interior fractions and at the midpoints, for the taken action.
     quantile = get_quantile_at_action(self.net["quantile"].apply(params["quantile"], feature, cum_p[:, 1:-1]), action)
     quantile_prime = get_quantile_at_action(self.net["quantile"].apply(params["quantile"], feature, cum_p_prime), action)
     # NOTE: Proposition 1 in the paper requires that F^{-1} be non-decreasing. I relax this requirement and
     # calculate gradients of the taus even when F^{-1} is not non-decreasing.
     val1 = quantile - quantile_prime[:, :-1]
     sign1 = quantile > jnp.concatenate([quantile_prime[:, :1], quantile[:, :-1]], axis=1)
     val2 = quantile - quantile_prime[:, 1:]
     sign2 = quantile < jnp.concatenate([quantile[:, 1:], quantile_prime[:, -1:]], axis=1)
     # Analytic gradient of the 1-Wasserstein distance w.r.t. the interior fractions, with sign
     # flips for the non-monotone case; stop_gradient so that differentiating the surrogate
     # loss below reproduces exactly this gradient for the fraction-proposal network.
     grad = jnp.where(sign1, val1, -val1) + jnp.where(sign2, val2, -val2)
     grad = jax.lax.stop_gradient(grad.reshape(-1, self.num_quantiles - 1))
     return (cum_p[:, 1:-1] * grad).sum(axis=1).mean(), None
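Up to the sign flips handled by sign1 and sign2, the grad above is the derivative of the 1-Wasserstein distance with respect to each interior fraction from Proposition 1 of the FQF paper (Yang et al., 2019):

    ∂W_1/∂τ_i = 2 F^{-1}(τ_i) - F^{-1}(τ̂_i) - F^{-1}(τ̂_{i-1}),   i = 1, ..., N-1,

where τ̂_i = (τ_i + τ_{i+1}) / 2 are the midpoint fractions (cum_p_prime) and N = num_quantiles. Assuming I read the indexing correctly, val1 and val2 are the two differences F^{-1}(τ_i) - F^{-1}(τ̂_{i-1}) and F^{-1}(τ_i) - F^{-1}(τ̂_i), so val1 + val2 recovers exactly this expression.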
Example #2
File: fqf.py Project: winston-ds/rljax
 def _calculate_value(
     self,
     params: hk.Params,
     feature: np.ndarray,
     action: np.ndarray,
     cum_p: jnp.ndarray,
 ) -> jnp.ndarray:
     # Quantile values at the proposed fractions, gathered for the taken action.
     return get_quantile_at_action(self.net["quantile"].apply(params["quantile"], feature, cum_p), action)
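All three examples depend on the shared helper get_quantile_at_action, whose definition is not shown here. A minimal sketch of such a gather, assuming (hypothetically) that the quantile head returns an array of shape (batch_size, num_quantiles, num_actions) and that action holds integer indices of shape (batch_size, 1), could look like:

 import jax.numpy as jnp

 def get_quantile_at_action(quantile_s, action):
     # quantile_s: (batch_size, num_quantiles, num_actions)
     # action:     (batch_size, 1) integer action indices (assumed shapes)
     assert quantile_s.shape[0] == action.shape[0]
     # Broadcast the action index over the quantile axis and gather along
     # the action axis -> (batch_size, num_quantiles, 1).
     return jnp.take_along_axis(quantile_s, jnp.expand_dims(action, axis=1), axis=2)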
Example #3
 def _calculate_value(
     self,
     params: hk.Params,
     state: np.ndarray,
     action: np.ndarray,
     *args,
     **kwargs,
 ) -> jnp.ndarray:
     # Generic variant: the network maps the raw state (plus any extra arguments)
     # to quantile values, which are gathered for the taken action.
     return get_quantile_at_action(self.net.apply(params, state, *args, **kwargs), action)
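As a quick shape check against the helper sketch above (dummy names and sizes, purely illustrative):

 import jax.numpy as jnp

 batch_size, num_quantiles, num_actions = 4, 32, 6
 quantile_s = jnp.zeros((batch_size, num_quantiles, num_actions))
 action = jnp.zeros((batch_size, 1), dtype=jnp.int32)
 # Gathering the chosen action's quantiles yields shape (4, 32, 1).
 print(get_quantile_at_action(quantile_s, action).shape)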