feat(demo): Add step-by-step DQN GridWorld demo with training and visualization #139
DQN GridWorld Demo Notebook with Visualization
This PR adds a step-by-step Markdown notebook demonstrating how to set up and train a Deep Q-Network (DQN) agent in Fehu on a simple GridWorld environment, and how to visualize its training metrics.
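To make the flow concrete, here is a minimal, self-contained Python/PyTorch sketch of the same kind of loop the notebook walks through: a toy GridWorld, a small Q-network with a replay buffer and a periodically synced target network, and per-episode recording of the metrics plotted below. All names here (`GridWorld`, `q_net`, the metric lists) are hypothetical stand-ins for illustration, not Fehu's actual API.

```python
import random
from collections import deque

import torch
import torch.nn as nn
import torch.nn.functional as F

class GridWorld:
    """Toy 5x5 grid: start (0,0), goal (4,4); -0.01 per step, +1 at the goal."""
    def __init__(self, size=5):
        self.size = size

    def reset(self):
        self.pos = [0, 0]
        return self._obs()

    def _obs(self):
        return torch.tensor([p / (self.size - 1) for p in self.pos],
                            dtype=torch.float32)

    def step(self, action):  # 0=up, 1=down, 2=left, 3=right
        dr, dc = [(-1, 0), (1, 0), (0, -1), (0, 1)][action]
        self.pos = [min(max(self.pos[0] + dr, 0), self.size - 1),
                    min(max(self.pos[1] + dc, 0), self.size - 1)]
        done = self.pos == [self.size - 1, self.size - 1]
        return self._obs(), (1.0 if done else -0.01), done

env = GridWorld()
q_net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 4))
target_net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 4))
target_net.load_state_dict(q_net.state_dict())
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
buffer = deque(maxlen=10_000)
gamma, eps, eps_min, eps_decay = 0.99, 1.0, 0.05, 0.995

rewards, lengths, losses, epsilons, avg_qs = [], [], [], [], []
for episode in range(300):
    state, done = env.reset(), False
    total_r, steps, q_sum, ep_loss = 0.0, 0, 0.0, 0.0
    while not done and steps < 100:
        with torch.no_grad():
            q = q_net(state)
        q_sum += q.max().item()
        action = random.randrange(4) if random.random() < eps else int(q.argmax())
        next_state, r, done = env.step(action)
        buffer.append((state, action, r, next_state, done))
        state, total_r, steps = next_state, total_r + r, steps + 1
        if len(buffer) >= 64:  # one SGD step per environment step
            s, a, rew, s2, d = zip(*random.sample(buffer, 64))
            s, s2 = torch.stack(s), torch.stack(s2)
            a = torch.tensor(a)
            rew = torch.tensor(rew, dtype=torch.float32)
            d = torch.tensor(d, dtype=torch.float32)
            q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
            with torch.no_grad():  # bootstrapped TD target from the frozen net
                target = rew + gamma * (1 - d) * target_net(s2).max(dim=1).values
            loss = F.mse_loss(q_sa, target)
            opt.zero_grad(); loss.backward(); opt.step()
            ep_loss = loss.item()
    eps = max(eps_min, eps * eps_decay)  # epsilon schedule (see below)
    if episode % 10 == 0:                # periodic target-network sync
        target_net.load_state_dict(q_net.state_dict())
    rewards.append(total_r); lengths.append(steps)
    losses.append(ep_loss); epsilons.append(eps); avg_qs.append(q_sum / steps)
```

The replay buffer and the frozen target network are the two ingredients that distinguish DQN from plain online Q-learning with a function approximator; the plotting snippets below reuse the `rewards`, `lengths`, `losses`, `epsilons`, and `avg_qs` lists recorded at the end of each episode.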
Features: environment setup, a DQN training loop like the one sketched above, and plots of the training metrics described below.
Visualizations in the Demo
Episode Rewards
This plot shows the total reward per episode.
A curve that rises and then stabilizes at a high value indicates successful learning.
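As an illustration (assuming the `rewards` list recorded in the sketch above; matplotlib and numpy here are stand-ins for whatever plotting the notebook actually uses), overlaying a moving average on the raw returns makes the trend easier to read:

```python
# Illustrative only: smooth noisy per-episode returns before plotting.
import numpy as np
import matplotlib.pyplot as plt

window = 20
smoothed = np.convolve(rewards, np.ones(window) / window, mode="valid")
plt.plot(rewards, alpha=0.3, label="raw")
plt.plot(range(window - 1, len(rewards)), smoothed, label=f"{window}-episode mean")
plt.xlabel("episode"); plt.ylabel("total reward"); plt.legend()
plt.show()
```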
Episode Length
This plot shows the number of steps taken in each episode.
A decreasing episode length that eventually stabilizes at a low value indicates the agent is learning to reach the goal more efficiently.
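A quick illustrative plot, again using the `lengths` list from the sketch above; the dashed line marks the 100-step cap the sketch imposes per episode:

```python
# Illustrative only: steps per episode, with the episode cap for reference.
import matplotlib.pyplot as plt

plt.plot(lengths)
plt.axhline(100, linestyle="--", color="gray", label="step cap")
plt.xlabel("episode"); plt.ylabel("steps per episode"); plt.legend()
plt.show()
```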
Loss Curve
This plot shows the DQN loss over episodes.
A decreasing loss suggests the Q-network's action-value estimates are becoming more accurate, though the curve need not be monotone, since the bootstrapped targets themselves move during training.
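In the sketch above, this is the mean squared TD error, i.e. the squared gap between `Q(s, a)` and the bootstrapped target `r + gamma * max_a' Q_target(s', a')`. Since the loss often spans orders of magnitude, a log scale can be easier to read (illustrative, using the `losses` list from the sketch):

```python
# Illustrative only: per-episode DQN loss on a log scale.
import matplotlib.pyplot as plt

plt.plot(losses)
plt.yscale("log")  # loss typically spans orders of magnitude
plt.xlabel("episode"); plt.ylabel("TD-error MSE")
plt.show()
```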
Epsilon Schedule
This plot shows the epsilon value used for epsilon-greedy exploration.
Epsilon decays over time, meaning the agent explores less and exploits more as training progresses.
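The sketch above decays epsilon once per episode via `eps = max(eps_min, eps * eps_decay)`. A self-contained illustration of that schedule (the recorded `epsilons` list would trace the same curve):

```python
# Illustrative only: per-episode exponential epsilon decay with a floor.
import matplotlib.pyplot as plt

eps, eps_min, eps_decay = 1.0, 0.05, 0.995
schedule = []
for _ in range(300):
    eps = max(eps_min, eps * eps_decay)
    schedule.append(eps)

plt.plot(schedule)
plt.xlabel("episode"); plt.ylabel("epsilon")
plt.show()
```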
Average Q-value
This plot shows the average Q-value per episode.
Tracking average Q helps diagnose learning stability and value estimation quality.
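In the sketch above, `avg_qs` holds the per-episode mean of `max_a Q(s, a)` over the states the agent visited. Healthy training typically shows this rising toward the true discounted return; values that keep growing without bound can signal overestimation. A quick illustrative plot:

```python
# Illustrative only: average of the greedy Q-value over visited states.
import matplotlib.pyplot as plt

plt.plot(avg_qs)
plt.xlabel("episode"); plt.ylabel("average max-Q")
plt.show()
```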
Implementation Notes
References