Book: Python Reinforcement Learning
Authors: Sudharsan Ravichandiran, Sean Saito, Rajalingappaa Shanmugamani, Yang Wenzhuo
Updated: 2021-06-24 15:17:33
Solving the Bellman equation
We can find the optimal policy by solving the Bellman optimality equation. To solve it, we use dynamic programming, a technique that breaks a problem into smaller subproblems and reuses their solutions.
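As a rough illustration of what solving the Bellman optimality equation with dynamic programming can look like, here is a minimal sketch of value iteration, one common dynamic programming method. The function name `value_iteration`, the array layout `P[a, s, s']` and `R[a, s, s']`, the discount factor, and the two-state toy problem are all assumptions made for this example; they do not come from the book.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Hypothetical helper: approximate the optimal value function.

    P[a, s, s2] is the probability of moving from state s to s2 under action a.
    R[a, s, s2] is the reward received for that transition.
    """
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for _ in range(max_iters):
        # Bellman optimality backup:
        # Q[a, s] = sum over s2 of P[a, s, s2] * (R[a, s, s2] + gamma * V[s2])
        Q = np.sum(P * (R + gamma * V), axis=2)
        V_new = Q.max(axis=0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Extract a greedy policy with respect to the final value estimate.
    Q = np.sum(P * (R + gamma * V), axis=2)
    policy = Q.argmax(axis=0)
    return V, policy


# Toy problem (assumed for illustration): two states, two actions.
# Action 0 stays in the current state, action 1 moves to the other state.
# Any transition that lands in state 1 earns a reward of 1.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
])
R = np.zeros((2, 2, 2))
R[:, :, 1] = 1.0

V, policy = value_iteration(P, R, gamma=0.9)
print("Estimated optimal values:", V)
print("Greedy policy (action per state):", policy)
```

In this toy problem the best behaviour is to reach state 1 and stay there, collecting a reward of 1 on every step, so the value of each state should approach 1 / (1 - 0.9) = 10. The sketch stops once successive value estimates differ by less than `tol`, which is one simple stopping rule among several.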