Blog

Posts grouped by series.

Energy

Fundamentals of Thermodynamics

2025-12-16

Thermodynamics is the study of energy, heat, work, and how they interrelate within physical systems. It provides a macroscopic framework for understanding the behavior of matter based on a few fundamental principles.

Hello, World

2025-12-09

A short test post to verify the blog plumbing with Axum and Askama.

Temporal-Difference Learning

2025-12-19

Temporal-Difference (TD) learning is a reinforcement learning approach that improves estimates based in part on other learned estimates, without waiting for a final outcome as Monte Carlo methods must.

Monte Carlo Methods in Reinforcement Learning

2025-12-18

Monte Carlo methods are a class of algorithms that rely on repeated random sampling to obtain numerical results. In reinforcement learning, they are used to estimate the value of states or actions based on the average return observed from multiple episodes.

Dynamic Programming for Reinforcement Learning

2025-12-16

Dynamic programming (DP) is a collection of algorithms that can be used to solve reinforcement learning problems when a perfect model of the environment is known. DP methods break down problems into smaller subproblems and solve them recursively.

The Bellman Equation

2025-12-15

The Bellman equation is a fundamental recursive relationship in dynamic programming and reinforcement learning that expresses the value of a decision problem at a certain point in time in terms of the value at subsequent points in time.

Multi-Armed Bandits

2025-12-12

A bandit problem models sequential decision-making under uncertainty: at every step, an agent selects one action (an arm), receives a scalar reward, and updates its belief about that arm.