Have you ever been training a spot or hand with the PLO Trainer (more precisely, a preflop or multi-way spot) and thought, "Why do the hands here seem so strange? This doesn't make any sense. Why does the trainer/solver want me to take this action in this spot with this hand?"

It's likely due to a convergence issue. When a simulation is first set up, it assigns each player a uniform random strategy and computes the EV loss against every other player's current strategy for each hand (and each flop, turn, and river) throughout the game tree, which is quite complex. As each player's strategy changes, the solver recalculates the best strategy for a specific part of the game tree and repeats until it finds the optimal solution.
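To make the iterate-and-recalculate loop concrete, here is a minimal sketch in Python. It uses regret matching on a tiny two-action zero-sum game. That is an assumption chosen for illustration, since this article does not say which algorithm the PLO Trainer's solver actually uses. The payoff matrix and all function names are hypothetical.

```python
def regret_matching_strategy(regrets):
    """Turn accumulated regrets into a strategy (positive regrets, normalised)."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to the uniform starting strategy.
    return [1.0 / len(regrets)] * len(regrets)


def solve(payoff, iterations):
    """Self-play on a zero-sum matrix game; returns each player's average strategy."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        row = regret_matching_strategy(row_regret)
        col = regret_matching_strategy(col_regret)
        # EV of each pure action against the opponent's current strategy.
        row_ev = [sum(payoff[i][j] * col[j] for j in range(n_cols)) for i in range(n_rows)]
        col_ev = [-sum(payoff[i][j] * row[i] for i in range(n_rows)) for j in range(n_cols)]
        row_value = sum(row[i] * row_ev[i] for i in range(n_rows))
        col_value = sum(col[j] * col_ev[j] for j in range(n_cols))
        # Regret = how much better each action would have done than the current mix.
        for i in range(n_rows):
            row_regret[i] += row_ev[i] - row_value
            row_sum[i] += row[i]
        for j in range(n_cols):
            col_regret[j] += col_ev[j] - col_value
            col_sum[j] += col[j]
    return [s / iterations for s in row_sum], [s / iterations for s in col_sum]


# Hypothetical toy game (payoffs to the row player); its equilibrium mixes 40% / 60%.
PAYOFF = [[2.0, -1.0], [-1.0, 1.0]]

early_row, early_col = solve(PAYOFF, 10)      # few iterations: still noisy
late_row, late_col = solve(PAYOFF, 50_000)    # many iterations: close to 40% / 60%
```

Comparing `early_row` with `late_row` shows the same effect described in this article: with too few iterations the frequencies can look odd, and they settle as more iterations are run. A real PLO tree is vastly larger than this toy game, which is why full convergence is so expensive.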

Expected Value (EV) and Decision Accuracy

The PLO Trainer determines which play is correct based on the action taken most frequently in the simulated scenario. However, the EV displayed for a specific move might occasionally appear higher than that of the move labeled as correct. This can happen when a simulation has not fully converged, meaning not enough iterations have been run to fully align results across all hands in the range. Displayed results typically stabilize as the simulation progresses toward complete convergence.
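As a small illustration of how the most frequent action and the highest-EV action can disagree, consider the sketch below. The numbers are invented for the example and do not come from any real simulation, and the data layout is an assumption rather than the trainer's internal format.

```python
# Hypothetical output for one hand in a spot that has not fully converged.
actions = {
    "fold": {"frequency": 0.10, "ev": 0.00},
    "call": {"frequency": 0.55, "ev": 0.42},
    "pot":  {"frequency": 0.35, "ev": 0.47},
}

# The action played most often is the one labeled as correct.
most_frequent = max(actions, key=lambda name: actions[name]["frequency"])

# The action with the highest displayed EV may be a different one.
highest_ev = max(actions, key=lambda name: actions[name]["ev"])

print(f"labeled correct: {most_frequent}, highest displayed EV: {highest_ev}")
```

In a fully converged simulation, actions that a hand mixes between would show near-identical EVs, so a visible gap like the one above is a hint that the spot needs more iterations.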

These constant recalculations are called iterations. The number of iterations required depends on many factors, primarily the number of players in the simulation and each player's stack size relative to the pot. Solvers also take rake into account (along with dead money such as straddles and antes), which is how they arrive at the optimal strategy for a specific hand (or bucket of hands).

Red and Blue Fold Percentages

Fold percentages, represented by red and blue bars, reflect your entire range of possible hands. For instance, in some scenarios, the blue fold line might indicate that you’re folding 89% of your total range. While this percentage provides insight into your strategy at the range level, it doesn’t necessarily determine the correctness of a specific move, such as a pot-size bet. Decision validity depends on the broader factors influencing that scenario.

The primary reason players see an action that leaves them questioning what is going on is that, to a solver, this spot simply doesn't happen. Largely because of the ranges in play combined with rake, certain parts of the game tree just wouldn't be reached. For example, say you are playing a 2/5 8-Max live PLO game with everyone at 100bb. Once all the factors are accounted for, an early position opening range consists of only 12.2% of hands. In PLO there are 270,725 starting hand combinations, so roughly 33,028 starting hands are profitable (+EV) to open from early position in this game.

The next player to act (Early Position +1, also called UTG+1) has a calling frequency of only 6.3% when the simulation is run, meaning that roughly 17,056 hands are a profitable (+EV) call when facing the previous action. In reality, poker players, and especially live poker players, are going to be a lot looser in this spot. They call with hands that aren't profitable given the optimal ranges in play, stack sizes, rake, position, and more.
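The hand counts above can be checked with a few lines of Python. This sketch simply multiplies each quoted frequency by the total number of four-card combinations. It ignores card removal (the cards already held by other players), so treat the results as rough counts. The function name is hypothetical.

```python
from math import comb

# Number of distinct four-card starting hands from a 52-card deck.
TOTAL_COMBOS = comb(52, 4)


def hands_in_range(frequency):
    """Approximate number of starting hands covered by a range frequency."""
    return round(frequency * TOTAL_COMBOS)


open_hands = hands_in_range(0.122)  # early position opening range
call_hands = hands_in_range(0.063)  # next player's calling range

print(TOTAL_COMBOS, open_hands, call_hands)
```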

As you go deeper into the game tree, with players taking actions that happen less and less frequently, the spots you reach are less and less converged. Simply put, in a human environment this specific line could happen, but to a solver, after millions of calculations, no one should ever be in this spot. In theory, enough computing power and a long enough run would produce an accurate result, but this just isn't realistic.
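One way to see how quickly these lines become rare is to multiply the action frequencies along the path. The sketch below uses the 12.2% open and 6.3% call figures from the example and treats them as independent, which is a simplifying assumption. The function name is hypothetical.

```python
from math import prod


def reach_probability(frequencies):
    """Chance of arriving at a node, given the frequency of each action leading to it."""
    return prod(frequencies)


# Early position opens (12.2%), then the next player calls (6.3%).
open_then_call = reach_probability([0.122, 0.063])

print(f"{open_then_call:.2%} of deals reach this spot")
```

With only two actions the solver already reaches this node in well under 1% of deals, and every further action multiplies that figure down again, so deep multi-way spots receive very few updates.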

Importance of Convergence in Simulation Results

The quality and reliability of the PLO Trainer outputs depend significantly on the convergence of its underlying simulations. Convergence refers to the state where simulations have iterated enough to adequately reflect optimal play across all hands within a range. When convergence is incomplete, some EV or frequency discrepancies might arise, but these typically affect only a small percentage of hands. Users should approach results with these considerations in mind, especially when dealing with advanced scenarios.

Hands will appear in mixed frequencies (meaning that a player would take a specific action x% of the time), but given enough iterations these results would eventually converge to either 100% or 0%.

Hand Categorization and Range Decisions

In specific cases, the PLO Trainer analyzes how a hand fits into the broader strategy for a player’s range. For instance, a breakdown might show 51% of your total hand range opens while 49% folds, reflecting overall tendencies. Within specific hand groups, such as Unpaired Single Suited Disconnected hands, the folding percentages might differ. This highlights the importance of considering decisions within the broader range strategy context rather than isolating individual hands.

It's important to utilize critical thinking in these spots and understand which hands make sense to continue in a certain part of the game tree while understanding how a solver comes to a conclusion for each specific hand.

Decision-Making at the Range Level

Instead of focusing solely on individual hand results, players can use range-level data to better understand their overall strategy and tendencies. By interpreting red and blue bars and adjusting gameplay to align with a balanced range approach, users can achieve more consistent and informed decision-making.

The Challenges of "solving" PLO

What is Convergence?