Externally defined system

RSTT Tutorial 2 - Integration

In this notebook we use the openskill rating system with RSTT. The goal is to wrap an openskill model in a Ranking class so that it benefits from the Ranking functionalities and fits into simulations. We will also use the model's predictions to generate game outcomes.

1. RSTT Ranking Design

A Ranking follows a composition-over-inheritance design. It contains:

  • A Standing: a dict/list hybrid container that automatically sorts players by their ranking points.

  • A RatingSystem: a dict-like container that maps players to ratings.

  • An Inference: provides a .rate() method to compute ratings.

  • An Observer: provides a .handle_observations() method that processes the ranking.update inputs.
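To make the composition concrete, here is a toy sketch in plain Python. None of these classes come from RSTT: `ToyRanking`, `ToyRatings`, `ToyInference` and `ToyObserver` are hypothetical stand-ins that only mirror the four roles listed above, and the fixed +10/-10 rule is an arbitrary choice for illustration.

```python
class ToyRatings:
    """RatingSystem role: maps players to ratings, with a default for newcomers."""

    def __init__(self, default: float = 1000.0):
        self._default = default
        self._ratings: dict[str, float] = {}

    def get(self, player: str) -> float:
        return self._ratings.setdefault(player, self._default)

    def set(self, player: str, rating: float) -> None:
        self._ratings[player] = rating

    def ordinal(self, rating: float) -> float:
        # float interpretation of a rating; trivial here since ratings are floats
        return rating

    def players(self) -> list[str]:
        return list(self._ratings)


class ToyInference:
    """Inference role: a fixed-step rule, winner +10 and loser -10."""

    def rate(self, winner: float, loser: float) -> tuple[float, float]:
        return winner + 10, loser - 10


class ToyObserver:
    """Observer role: turns the update inputs into datamodel changes."""

    def handle_observations(self, datamodel, infer, games):
        for winner, loser in games:
            new_w, new_l = infer.rate(datamodel.get(winner), datamodel.get(loser))
            datamodel.set(winner, new_w)
            datamodel.set(loser, new_l)


class ToyRanking:
    """Holds the components and delegates to them (composition, no inheritance)."""

    def __init__(self, datamodel, backend, handler):
        self.datamodel, self.backend, self.handler = datamodel, backend, handler

    def update(self, games):
        self.handler.handle_observations(self.datamodel, self.backend, games)

    def standing(self) -> list[str]:
        # Standing role: players sorted by ranking points, best first
        points = lambda p: self.datamodel.ordinal(self.datamodel.get(p))
        return sorted(self.datamodel.players(), key=points, reverse=True)


ranking = ToyRanking(ToyRatings(), ToyInference(), ToyObserver())
ranking.update(games=[("ann", "bob"), ("ann", "cat"), ("bob", "cat")])
print(ranking.standing())  # ['ann', 'bob', 'cat']
```

Swapping `ToyInference` for another object with a compatible `rate` method changes the rating rule without touching the other three components, which is the point of the design.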

Before integrating an external system, let's start with a simple illustration. A ranking can be instantiated with its components specified. However, we recommend representing a ranking design in its own class. This makes it clearer which parameters are intrinsic to the ranking design and which are hyper-parameters.

1.1 Instantiate with Components

A ranking can be instantiated with its components specified. NOT RECOMMENDED

from rstt import Ranking
from rstt.ranking import KeyModel, Elo, GameByGame

# Ambiguity between core design elements and parameters. Is the handler a tunable parameter of the ranking?
Ranking(name='elo', datamodel=KeyModel(default=1000), backend=Elo(k=20), handler=GameByGame())
<rstt.ranking.ranking.Ranking at 0x11326b9e0>

1.2 Class Design

We recommend representing a ranking design in its own class with an explicit name. This makes it clearer which parameters are inherent to the ranking design and which are tunable hyper-parameters for comparative studies.

# Distinguish core design from parameters, handler is not a parameter.
class EloGBG(Ranking):
    def __init__(self, name: str, default_rating: float=1000, k: float=20):
        # The standing component is provided by the super() init.
        super().__init__(name=name,
                         datamodel=KeyModel(default=default_rating), # RatingSystem
                         backend=Elo(k=k), # Inference
                         handler=GameByGame()) # Observer
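For intuition about what an Elo backend with k=20 computes for a single game, here is a sketch of the textbook Elo update in plain Python. The functions `elo_expected` and `elo_update` are illustrative names, and the 400-point scale is the conventional choice; whether RSTT's Elo class uses exactly these constants is an assumption.

```python
def elo_expected(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 20.0):
    """Return the new ratings of A and B after one game.

    score_a is 1.0 for a win of A, 0.5 for a draw, 0.0 for a loss.
    """
    delta = k * (score_a - elo_expected(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# two players at the default rating: the winner gains k/2 points
new_a, new_b = elo_update(1000, 1000, score_a=1.0)
print(new_a, new_b)  # 1010.0 990.0
```

The default rating and k are exactly the two values exposed as hyper-parameters by the EloGBG constructor, while the game-by-game processing stays fixed by the class.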

1.3 Run illustration

As you can see, there is not much to do, and it works just fine in simulation. The code of the RSTT built-in BasicElo class is in fact very similar. All of a ranking’s functionalities are implemented at a higher level of abstraction and rely on minimal requirements from its components to work as intended.

from rstt import Player, RoundRobin, LogSolver

# our ranking design
elo = EloGBG('elo')

# players
population = Player.create(nb=32)

# play games - ranking used as seeding
tournament = RoundRobin('test', elo, LogSolver())
tournament.registration(population)
tournament.run()

# check if update works
elo.update(games=tournament.games())
elo.plot()
----------- elo -----------
   0.       Tiffany Vinson       1210
   1.           Gary Young       1196
   2.        Javier Henson       1192
   3.     Antionette Welsh       1145
   4.         Michael Mora       1130
   5.        Joseph Austin       1130
   6.       Shannon Monroe       1123
   7.      Timothy Hubbard       1120
   8.           Tim Cramer       1060
   9.        Theresa Doyle       1055
  10.         Linda Aberle       1047
  11.        Matthew Salas       1038
  12.        Beulah Mcgill       1031
  13.      Tamica Martinez       1031
  14.         Nancy Valdez       1027
  15.        Charles Tracy       1015
  16.        Donald Hauger        977
  17.         Anthony Tong        973
  18.        Stacie Parker        965
  19.         Billy Hughes        941
  20.          Susan Lesko        935
  21.        Betty Mehling        932
  22.       Richard Rosado        931
  23.         Debra Ferris        924
  24.     Howard Osterberg        921
  25.       Donald Nuttall        900
  26.         James Vasher        870
  27.        Tracy Cordova        861
  28.       Lorraine Walls        848
  29.          Peggy Smith        842
  30.         Megan Hinton        820
  31.        Joanne Patton        794

2. Use OpenSkill in RSTT

OpenSkill is an Inference system in RSTT terminology. Its GitHub page encourages users to drop TrueSkill and Elo. So … let's test it!

2.1 Ranking.datamodel: stypes.RatingSystem

It acts as a container of rating objects. It must provide get and set methods for a player’s rating. It also provides a float interpretation of a rating through an ordinal function. Let's first take a look at an openskill rating.

from openskill.models import PlackettLuce

model = PlackettLuce()
rating = model.rating()
print('Rating data - mu:', rating.mu, 'sigma:', rating.sigma, 'name:', rating.name, 'id:', rating.id)
Rating data - mu: 25.0 sigma: 8.333333333333334 name: None id: 3db7d7a810ec4ea48d778f70bdfe652b

2.2 KeyModel, a general purpose RatingSystem

The KeyModel class is a base class for the RatingSystem protocol (see the Elo example). It provides all the features needed and only requires you to provide a default rating (for players that do not have one yet).

There are 3 ways to specify a default rating:

  • by providing a value: default = model.rating()

  • by providing a constructor: template = model.rating

  • by providing a function which takes as input the player for which a rating is created: factory = lambda player: …

In the case of openskill, since each rating contains an id, it is better to avoid the default approach. The template is an option, but since ratings have names, why not make the name match player.name()? Let us use the factory approach.
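The difference between the three options can be sketched with a toy store in plain Python. `ToyRating` and `ToyStore` are hypothetical and do not reproduce KeyModel's implementation; in particular, this toy store hands out the default object as-is, which KeyModel may handle differently.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyRating:
    """Stand-in for a rating object that carries an id and a name."""
    mu: float = 25.0
    name: Optional[str] = None
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


class ToyStore:
    """Creates a rating on first access, using one of three strategies."""

    def __init__(self, default=None, template=None, factory=None):
        self._default, self._template, self._factory = default, template, factory
        self._ratings = {}

    def get(self, player):
        if player not in self._ratings:
            if self._factory is not None:
                self._ratings[player] = self._factory(player)  # sees the player
            elif self._template is not None:
                self._ratings[player] = self._template()       # fresh object
            else:
                self._ratings[player] = self._default          # same object
        return self._ratings[player]


shared = ToyStore(default=ToyRating())
fresh = ToyStore(template=ToyRating)
named = ToyStore(factory=lambda player: ToyRating(name=player))

print(shared.get("ann").id == shared.get("bob").id)  # True: one id for everybody
print(fresh.get("ann").id == fresh.get("bob").id)    # False: one id per player
print(named.get("ann").name)                         # ann
```

Only the factory receives the player, which is what allows the rating name to follow the player name.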

KeyModel has a basic ordinal implementation that will not work here. We need to override it.

from rstt.ranking import KeyModel

class OSRatings(KeyModel):
    def __init__(self, model, mu=None, sigma=None):
        # the first parameter of the factory is always the player getting a rating
        super().__init__(factory=lambda x, **kwargs: model.rating(name=x.name(), **kwargs), mu=mu, sigma=sigma)

    def ordinal(self, rating) -> float:
        # openskill ratings provide an ordinal method themselves - easy!
        return rating.ordinal()

osr = OSRatings(PlackettLuce(), mu=40, sigma=5)
rating = osr.get(Player('dummy'))
print(rating)
Plackett-Luce Player Data:

id: 83c17cdd2b90447a8ae5fc350375410c
name: dummy
mu: 40
sigma: 5
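As a rough guide to the number that ordinal returns, the sketch below uses the conservative estimate mu - z * sigma with z = 3, which is, to the best of our knowledge, the default documented by openskill. Treat the formula and the default z as assumptions and check the openskill documentation for the version you use; the standalone `ordinal` function is only illustrative.

```python
def ordinal(mu: float, sigma: float, z: float = 3.0) -> float:
    """Conservative skill estimate: the mean minus z standard deviations."""
    return mu - z * sigma


# the dummy player created above, with mu=40 and sigma=5
print(ordinal(40, 5))  # 25.0

# a rating with mu=25 and sigma=25/3 lands (up to rounding) on 0
print(abs(ordinal(25.0, 25.0 / 3)) < 1e-9)  # True
```

A new player therefore starts near the bottom of the scale and climbs as sigma shrinks, which explains why ordinals can be negative in a standing.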

2.2 Ranking.backend: stypes.Inference

Inference is defined as a Protocol and type-checked in the RSTT package. Anything that provides a .rate() method fits the bill. The models in openskill.models all have a .rate method, so they are RSTT.stypes.Inference and can be passed directly to a ranking class as backend. Nothing to do. Cool!

This is not always the case. You can, however, write a simple class with a rate method that wraps the rating process of the system to integrate.
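A minimal sketch of such a wrapper, in plain Python. `LegacySystem` and its `compute` method are invented for the example, and `HasRate` is a local stand-in for the idea of a protocol with a rate method, not RSTT's actual Inference definition.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class HasRate(Protocol):
    """Local stand-in protocol: anything exposing a rate method."""

    def rate(self, *args: Any, **kwargs: Any) -> Any: ...


class LegacySystem:
    """Hypothetical external system that exposes `compute` instead of `rate`."""

    def compute(self, winner: float, loser: float) -> tuple[float, float]:
        return winner + 1.0, loser - 1.0


class LegacyAdapter:
    """Wraps the external system behind the expected `rate` method."""

    def __init__(self, system: LegacySystem):
        self._system = system

    def rate(self, winner: float, loser: float) -> tuple[float, float]:
        return self._system.compute(winner, loser)


print(isinstance(LegacySystem(), HasRate))                 # False
print(isinstance(LegacyAdapter(LegacySystem()), HasRate))  # True
print(LegacyAdapter(LegacySystem()).rate(10.0, 10.0))      # (11.0, 9.0)
```

The adapter only fixes the method name. Matching the arguments that the observer passes to rate remains the job of the handler.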

2.3 Ranking.handler: stypes.Observer

The handler.handle_observations() method is called by ranking.forward() during the execution of ranking.update().

  • Ranking.update is a user-level functionality that should NEVER be overridden.

  • Ranking.forward is a developer functionality. It CAN be overridden, but this is usually not necessary.

  • Observer.handle_observations is a complete workflow from the update input to the new ranking state.

In the majority of cases, handle_observations performs the following steps:

  1) Format the update inputs. The inputs are referred to as ‘observations’. They justify a change of the ranking state.

  2) Extract the relevant information from the observations.

  3) Query the datamodel for the corresponding prior ratings.

  4) Call the backend.rate method with the correct arguments.

  5) Interpret the backend.rate return values.

  6) Push the posterior ratings to the datamodel.

We want to input a list of RSTT.stypes.SMatch. We have already worked on the ratings in the datamodel. We now need to extract the relevant data from the games, so we need to know what to pass to the rate method. Let's have a look at its signature.

import inspect
inspect.getfullargspec(model.rate).annotations
{'return': typing.List[typing.List[openskill.models.weng_lin.plackett_luce.PlackettLuceRating]],
 'teams': typing.List[typing.List[openskill.models.weng_lin.plackett_luce.PlackettLuceRating]],
 'ranks': typing.Optional[typing.List[float]],
 'scores': typing.Optional[typing.List[float]],
 'tau': typing.Optional[float],
 'limit_sigma': typing.Optional[bool]}

TODO: Your task is to read the Observer code and try to identify the 6 steps.

from rstt.stypes import RatingSystem, Inference, SMatch

class OSHandler:
    def handle_observations(self, datamodel: RatingSystem, infer: Inference, games: list[SMatch]):
        for game in games:
            # extract game info
            teams_of_players = game.teams()
            scores = game.scores() # alternative: ranks = game.ranks()

            # get corresponding rating from datamodel
            teams = [] # list[list[rating]]
            for team in teams_of_players:
                ratings = [] # list[rating]
                for player in team:
                    ratings.append(datamodel.get(player))
                teams.append(ratings)

            # call rate
            new_ratings = infer.rate(teams=teams, scores=scores) # or ..., ranks=ranks)

            # push new ratings
            for team, ratings in zip(teams_of_players, new_ratings):
                for player, rating in zip(team, ratings):
                    datamodel.set(player, rating)

ANSWER

  • step1: no formatting; if the user does not pass a list of games, the observer will not work

  • step2: game.teams() and game.scores()

  • step3: datamodel.get() calls

  • step4: infer.rate() call

  • step5: the for … in zip(…) loops pair each rating returned by the rate method with the corresponding player

  • step6: datamodel.set() calls
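Step 1 is the only step that OSHandler skips. One possible way to add it is a small normalization helper called at the top of handle_observations. The sketch below is an assumption about what such formatting could look like; `as_game_list` and `ToyGame` are hypothetical names, and the helper presumes that a single game object is not itself iterable.

```python
from collections.abc import Iterable


class ToyGame:
    """Stand-in for a game object; deliberately not iterable."""

    def __init__(self, label: str):
        self.label = label


def as_game_list(games) -> list:
    """Normalize an update input into a list of games.

    Accepts None, a single game, or any iterable of games.
    """
    if games is None:
        return []
    if isinstance(games, Iterable) and not isinstance(games, (str, bytes)):
        return list(games)
    return [games]


g1, g2 = ToyGame("final"), ToyGame("semi-final")
print(len(as_game_list(g1)))          # 1: a single game becomes a one-item list
print(len(as_game_list([g1, g2])))    # 2: a list is kept as it is
print(len(as_game_list(iter([g1]))))  # 1: iterators are consumed into a list
print(as_game_list(None))             # []
```

With this in place, the loop over games in the handler can start from as_game_list(games) and accept a single game as well as a list.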

2.4 Run illustration

The OpenSkill Ranking class will take one single parameter, an openskill.models object. And then it is ready to be used.

# Openskill class
class OpenSkill(Ranking):
    def __init__(self, name: str, model):
        super().__init__(name=name, datamodel=OSRatings(model), backend=model, handler=OSHandler())

# OS Instance
os = OpenSkill('OpenSkill', model)

# OS update on rstt simulated games
os.update(games=tournament.games())

Remark: RSTT provides an OpenSkill ranking wrapper, BasicOS, which is not implemented exactly as presented in this tutorial but works similarly. You still need to import openskill and pass a model yourself.

3. Ranking functionality

This is now openskill on steroids. You can access players by rank and get the rating of a player. You can use it to seed a competition such as a single elimination bracket. Let's start with a standard output plot of the standing.

os.plot()
----------- OpenSkill -----------
   0.       Tiffany Vinson         35
   1.           Gary Young         33
   2.        Javier Henson         31
   3.     Antionette Welsh         27
   4.         Michael Mora         26
   5.        Joseph Austin         24
   6.      Timothy Hubbard         24
   7.       Shannon Monroe         24
   8.           Tim Cramer         17
   9.         Linda Aberle         16
  10.        Theresa Doyle         16
  11.        Beulah Mcgill         16
  12.        Matthew Salas         15
  13.      Tamica Martinez         14
  14.        Charles Tracy         13
  15.         Nancy Valdez         12
  16.        Donald Hauger          9
  17.         Anthony Tong          8
  18.        Stacie Parker          7
  19.         Billy Hughes          5
  20.          Susan Lesko          4
  21.        Betty Mehling          3
  22.       Richard Rosado          3
  23.         Debra Ferris          3
  24.     Howard Osterberg          2
  25.       Donald Nuttall          0
  26.         James Vasher         -2
  27.        Tracy Cordova         -4
  28.       Lorraine Walls         -6
  29.          Peggy Smith         -8
  30.         Megan Hinton        -10
  31.        Joanne Patton        -15

3.1 Rank Correlation

The RSTT ranking interface simplifies the computation of some metrics, like rank correlation. The advantage of simulation is that you have a baseline to compute it against. Let's compare elo and openskill to the simulation model.

from scipy import stats
from rstt import BTRanking

# ranking where players' ratings are their respective level().
gt = BTRanking('consensus', population)

print('OpenSkill - GroundTRuth correlation: \n  ', stats.kendalltau(gt[population], os[population]))
print('Elo - GroundTRuth correlation: \n  ', stats.kendalltau(gt[population], elo[population]))
print('OpenSkill - Elo correlation: \n  ', stats.kendalltau(elo[population], os[population]))
OpenSkill - GroundTRuth correlation:
   SignificanceResult(statistic=np.float64(0.8508064516129034), pvalue=np.float64(8.187631748655122e-17))
Elo - GroundTRuth correlation:
   SignificanceResult(statistic=np.float64(0.866935483870968), pvalue=np.float64(7.496744126671432e-18))
OpenSkill - Elo correlation:
   SignificanceResult(statistic=np.float64(0.9838709677419356), pvalue=np.float64(3.9371288142144177e-31))
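To read these statistics, it helps to recall what Kendall's tau counts: pairs of items ordered the same way by both rankings (concordant) against pairs ordered in opposite ways (discordant). The pure-Python sketch below implements the tau-a variant, which coincides with scipy's default tau-b only when neither sequence contains ties. The function name `kendall_tau` is illustrative.

```python
from itertools import combinations


def kendall_tau(x: list, y: list) -> float:
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


truth = [0, 1, 2, 3]
print(kendall_tau(truth, [0, 1, 2, 3]))  # 1.0: identical order
print(kendall_tau(truth, [3, 2, 1, 0]))  # -1.0: reversed order
# one swapped pair out of six: (5 - 1) / 6
print(round(kendall_tau(truth, [1, 0, 2, 3]), 3))  # 0.667
```

A value close to 1 therefore means that two rankings disagree on very few pairs of players.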

3.2 Ranking state as simulation parameter

You can easily play around with the initial state of any RSTT ranking by providing an arbitrary ordering of the players involved.

import random

# random ordering
seeds = list(range(len(os)))
random.shuffle(seeds)

print(list(range(len(os))))
print(seeds)
print('Seeds - Truth correlation:', stats.kendalltau(seeds, list(range(len(os)))).statistic)

# reordering
elo.rerank(seeds)
os.rerank(seeds)

print('OpenSkill - GroundTRuth correlation:', stats.kendalltau(gt[population], os[population]).statistic)
print('Elo - GroundTRuth correlation:', stats.kendalltau(gt[population], elo[population]).statistic)
print('OpenSkill - Elo correlation:', stats.kendalltau(elo[population], os[population]).statistic)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
[14, 10, 28, 3, 27, 26, 16, 17, 23, 22, 4, 0, 2, 13, 1, 11, 18, 30, 12, 24, 29, 20, 15, 31, 7, 8, 6, 21, 5, 9, 19, 25]
Seeds - Truth correlation: -0.00806451612903226
OpenSkill - GroundTRuth correlation: -0.00403225806451613
Elo - GroundTRuth correlation: -0.00403225806451613
OpenSkill - Elo correlation: 0.7822580645161291

3.3 Control the Interplay between a Ranking and a Dataset

Now it is possible to select players and seed them in a competition based on their openskill ratings.

from rstt import SwissRound

# reordered openskill ranking as seeding
t2 = SwissRound(name='OpensKill seeded tournament', seeding=os, solver=LogSolver())

# top 16 players according to openskill
t2.registration(os[:16])
t2.run()

os.update(games=t2.games())
elo.update(games=t2.games())

3.4 Fancy Analysis

Let's see what changed. Keep in mind that we artificially altered the entire ranking state, but only half of the players were involved in the new dataset.

print('-- Kendalltau rank correlation on the entire population --')
print('OpenSkill - GroundTRuth correlation:', stats.kendalltau(gt[population], os[population]).statistic)
print('Elo - GroundTRuth correlation:', stats.kendalltau(gt[population], elo[population]).statistic)
print('OpenSkill - Elo correlation:', stats.kendalltau(elo[population], os[population]).statistic)

print('\n -- Kendalltau rank correlation on the real top16 --')
top16 = gt[:16]
print('OpenSkill - GroundTRuth correlation:', stats.kendalltau(gt[top16], os[top16]).statistic)
print('Elo - GroundTRuth correlation:', stats.kendalltau(gt[top16], elo[top16]).statistic)
print('OpenSkill - Elo correlation:', stats.kendalltau(elo[top16], os[top16]).statistic)

print('\n -- Kendalltau rank correlation on the \'openskill prio\' top16 --')
seed16 = t2.participants()
print('OpenSkill - GroundTRuth correlation:', stats.kendalltau(gt[seed16], os[seed16]).statistic)
print('Elo - GroundTRuth correlation:', stats.kendalltau(gt[seed16], elo[seed16]).statistic)
print('OpenSkill - Elo correlation:', stats.kendalltau(elo[seed16], os[seed16]).statistic)
-- Kendalltau rank correlation on the entire population --
OpenSkill - GroundTRuth correlation: 0.08064516129032259
Elo - GroundTRuth correlation: 0.08467741935483872
OpenSkill - Elo correlation: 0.7943548387096775

 -- Kendalltau rank correlation on the real top16 --
OpenSkill - GroundTRuth correlation: 0.0
Elo - GroundTRuth correlation: 0.0
OpenSkill - Elo correlation: 0.6333333333333333

 -- Kendalltau rank correlation on the 'openskill prio' top16 --
OpenSkill - GroundTRuth correlation: 0.26666666666666666
Elo - GroundTRuth correlation: 0.15
OpenSkill - Elo correlation: 0.5166666666666667

4. OpenSkill as Solver

A Solver is anything that provides a solve() method. It is used to assign a Score to an SMatch. Because OpenSkill has methods to predict game outcomes, it can be used as a solver. Below is an example for Duel confrontations. We extend the ScoreProb solver, which generates game outcomes based on score probabilities.

from rstt.solver.solvers import ScoreProb, WIN, LOSE
from rstt import Duel

import random

# OpenSkill Solver
class OSS(ScoreProb):
    def __init__(self, os: OpenSkill):
        self.model = os.backend
        self.ratings = os.datamodel

        # NOTE: WIN is an alias for 'player1 wins'; LOSE is an alias for 'player1 loses', i.e. player2 wins
        super().__init__(scores=[WIN, LOSE], func=self.predict_win)

    def predict_win(self, duel: Duel) -> list[float]:
        # NOTE: when player1 wins, player2 loses, and vice versa
        return self.model.predict_win([[self.ratings.get(duel.player1())], [self.ratings.get(duel.player2())]])
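The idea behind a probability-driven solver can be sketched without RSTT: given the probabilities returned by a prediction function, draw one of the possible scores with matching weights. In the sketch below, `WIN` and `LOSE` are hypothetical stand-ins defined locally (their actual values in rstt may differ) and `sample_score` is an illustrative helper, not the ScoreProb implementation.

```python
import random

# hypothetical stand-ins for the two possible outcomes of a duel
WIN, LOSE = (1.0, 0.0), (0.0, 1.0)


def sample_score(probabilities: list[float], rng: random.Random):
    """Draw WIN or LOSE according to [P(player1 wins), P(player2 wins)]."""
    return rng.choices([WIN, LOSE], weights=probabilities, k=1)[0]


rng = random.Random(0)  # seeded generator, to make the draws repeatable
draws = [sample_score([0.8, 0.2], rng) for _ in range(10_000)]
share = draws.count(WIN) / len(draws)
print(round(share, 2))  # close to the requested 0.8
```

Over many games, the share of player1 wins approaches the predicted probability, which is what makes the rating model act as the ground truth of the simulation.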

4.1 Level Coherence

The OSS class does not care about the level of the players involved. It relies on OpenSkill ratings, which are completely independent of it. A player with a high level having less than 0.5 win probability against a player with a lower level can be confusing. One way to keep the Player base class coherent with the solver is to train the ratings on an ideal dataset, one where every player faces every other player at least once and the best player wins each encounter. We can use RoundRobin and BetterWin for this purpose.

from openskill.models import BradleyTerryFull
from rstt import BetterWin

# Perfect Data Set
training_set = RoundRobin('Training Set', seeding=gt, solver=BetterWin())
training_set.registration(population)
training_set.run()

# Train OpenSkill -> make meaningful ratings
os_trained = OpenSkill('OpenSkill as Solver', model=BradleyTerryFull())
os_trained.update(games=training_set.games())

# assert ranking quality
print('OpenSkill - GroundTRuth correlation:', stats.kendalltau(gt[population], os_trained[population]).statistic)
OpenSkill - GroundTRuth correlation: 1.0
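Why an ideal dataset makes the true order recoverable can be seen on a toy example in plain Python: in a round-robin where the higher level always wins, the number of wins alone already orders the players by level. The player names and levels below are made up for the illustration.

```python
from itertools import combinations

# hypothetical players with their true levels
levels = {"ann": 1800, "bob": 1500, "cat": 1200, "dan": 900}

# round-robin: every pair meets once, the better player always wins
wins = {name: 0 for name in levels}
for a, b in combinations(levels, 2):
    winner = a if levels[a] > levels[b] else b
    wins[winner] += 1

by_level = sorted(levels, key=levels.get, reverse=True)
by_wins = sorted(wins, key=wins.get, reverse=True)

print(wins)                 # {'ann': 3, 'bob': 2, 'cat': 1, 'dan': 0}
print(by_level == by_wins)  # True
```

Every player ends with a distinct number of wins, so a rating system trained on such data has enough information to reproduce the order of the levels.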

4.2 Simulation

And now we can instantiate and run competition subclasses by providing OSS as a solver. The game results are generated according to the OpenSkill model's predictions.

from rstt import SingleEliminationBracket, SwissBracket
from rstt import BasicGlicko

# OpenSkill as Solver
oss = OSS(os_trained)

# test ranking
gl = BasicGlicko('Glicko')
btf = OpenSkill('BTF tested', model=BradleyTerryFull())

# play games using openskill prediction to generate scores
seb = SingleEliminationBracket('Example SEB', seeding=gt, solver=oss)
seb.registration(population)
seb.run()


print('OSS solver defines the truth level - After Single-Elimination-Bracket')
gl.update(games=seb.games())
btf.update(games=seb.games())
print('GroundTRuth - Glicko correlation:', stats.kendalltau(os_trained[population], gl[population]).statistic)
print('GroundTRuth - BTS correlation:', stats.kendalltau(os_trained[population], btf[population]).statistic)

# play games using openskill prediction to generate scores
swb = SwissBracket('Example SwissBracket', seeding=gt, solver=oss)
swb.registration(population[:16])
swb.run()

print('OSS solver defines the truth level - After Swiss-Bracket')
gl.update(games=swb.games())
btf.update(games=swb.games())
print('GroundTRuth - Glicko correlation:', stats.kendalltau(os_trained[population], gl[population]).statistic)
print('GroundTRuth - BTS correlation:', stats.kendalltau(os_trained[population], btf[population]).statistic)
OSS solver defines the truth level - After Single-Elimination-Bracket
GroundTRuth - Glicko correlation: 0.5967741935483871
GroundTRuth - BTS correlation: 0.5967741935483871
OSS solver defines the truth level - After Swiss-Bracket
GroundTRuth - Glicko correlation: 0.4435483870967743
GroundTRuth - BTS correlation: 0.7500000000000001

5. Your Turn - TrueSkill

TrueSkill also fits the RSTT.stypes.Inference interface with a rate method. You know how to use it now!

6. Your Turn - Real Data

Running an rstt ranking on a real dataset is not hard. Do you have an idea of how to make it work?

That is right. You write an observer! The component that deals with the update input.
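As a starting point, here is a sketch of such an observer for tabular results. Everything in it is hypothetical: the row layout (home, away, home_goals, away_goals), the `DictRatings` store and the `GoalDifference` rule are toy stand-ins, and players are identified by plain strings, whereas an RSTT datamodel works with player objects.

```python
class RecordHandler:
    """Hypothetical observer for rows shaped (home, away, home_goals, away_goals)."""

    def handle_observations(self, datamodel, infer, rows):
        for home, away, home_goals, away_goals in rows:
            # one-player teams, following the teams/scores layout used by OSHandler
            teams = [[datamodel.get(home)], [datamodel.get(away)]]
            new_ratings = infer.rate(teams=teams, scores=[home_goals, away_goals])
            datamodel.set(home, new_ratings[0][0])
            datamodel.set(away, new_ratings[1][0])


class DictRatings:
    """Toy rating store with a default value for unknown players."""

    def __init__(self, default: float = 0.0):
        self._default, self._data = default, {}

    def get(self, key):
        return self._data.get(key, self._default)

    def set(self, key, rating):
        self._data[key] = rating


class GoalDifference:
    """Toy inference: shift both ratings by the goal difference."""

    def rate(self, teams, scores):
        diff = scores[0] - scores[1]
        return [[teams[0][0] + diff], [teams[1][0] - diff]]


ratings = DictRatings()
rows = [("lyon", "paris", 2, 1), ("paris", "nice", 3, 0)]
RecordHandler().handle_observations(ratings, GoalDifference(), rows)
print(ratings.get("lyon"), ratings.get("paris"), ratings.get("nice"))  # 1.0 2.0 -3.0
```

The observer is the only component that has to know the layout of the real data; the datamodel and the backend can stay unchanged.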