Title: | Sequential Pairwise Online Rating Techniques |
---|---|
Description: | Calculates ratings for two-player or multi-player challenges. The methods included in the package can estimate ratings (players' strengths) and their evolution over time, and can also predict the outcome of a challenge. The algorithms are based on the Bayesian Approximation Method and involve neither matrix inversion nor likelihood estimation. Parameters are updated sequentially, and computation requires no additional RAM to make estimation feasible. Additionally, the core of the package is written in C++, which makes computation even faster. The methods used in the package refer to Mark E. Glickman (1999) <http://www.glicko.net/research/glicko.pdf>; Mark E. Glickman (2001) <doi:10.1080/02664760120059219>; Ruby C. Weng, Chih-Jen Lin (2011) <http://jmlr.csail.mit.edu/papers/volume12/weng11a/weng11a.pdf>; W. Penny, Stephen J. Roberts (1999) <doi:10.1109/IJCNN.1999.832603>. |
Authors: | Dawid Kałędkowski [aut, cre] |
Maintainer: | Dawid Kałędkowski |
License: | GPL-2 |
Version: | 0.2.0 |
Built: | 2024-11-09 04:23:07 UTC |
Source: | https://github.com/gogonzo/sport |
Bayesian Bradley-Terry
bbt_run( formula, data, r = numeric(0), rd = numeric(0), init_r = 25, init_rd = 25/3, lambda = NULL, share = NULL, weight = NULL, kappa = 0.5 )
formula |
formula which specifies the model. The RHS allows only the player rating parameter, and it should be specified in the following manner:
Users can also specify the formula in a different way:
|
data |
data.frame which contains columns specified in formula, and
optional columns defined by |
r |
named vector of initial players ratings estimates. If not specified
then |
rd |
named vector of initial rating deviation estimates. If not specified
then |
init_r |
initial values for |
init_rd |
initial values for |
lambda |
name of the column in 'data' containing lambda values or one
constant value (eg. |
share |
name of the column in 'data' containing player share in team efforts. It's used to first calculate combined rating of the team and then redistribute ratings update back to players level. Warning - it should be used only if formula is specified with players nested within teams ('player(player|team)'). |
weight |
name of the column in 'data' containing weights values or
one constant (eg. |
kappa |
controls |
A "rating" object is returned:
final_r
named vector containing players ratings.
final_rd
named vector containing players ratings deviations.
r
data.frame with evolution of the ratings and ratings deviations
estimated at each event.
pairs
pairwise combinations of players in analysed events with
prior probability and result of a challenge.
class
of the object.
method
type of algorithm used.
settings
arguments specified in function call.
# the simplest example
data <- data.frame(
  id = c(1, 1, 1, 1),
  team = c("A", "A", "B", "B"),
  player = c("a", "b", "c", "d"),
  rank_team = c(1, 1, 2, 2),
  rank_player = c(3, 4, 1, 2)
)

bbt <- bbt_run(
  data = data,
  formula = rank_player | id ~ player(player),
  r = setNames(c(25, 23.3, 25.83, 28.33), c("a", "b", "c", "d")),
  rd = setNames(c(4.76, 0.71, 2.38, 7.14), c("a", "b", "c", "d"))
)

# nested matchup
bbt <- bbt_run(
  data = data,
  formula = rank_team | id ~ player(player | team)
)
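The `pairs` element of the returned object is described as pairwise combinations of players within each event. As a rough illustration of that idea (not the package's internal code), the sketch below expands one four-player event into all ordered pairs using base R and derives an outcome of 1, 0.5 or 0 from the ranks. The helper name `expand_pairs` and its output columns are hypothetical, and the assumption that a lower rank means a better finish is taken from the example data.

```r
# Hypothetical helper: expand a single event into ordered pairwise comparisons.
# Assumption: a lower rank means a better finishing position.
expand_pairs <- function(players, ranks) {
  idx <- expand.grid(i = seq_along(players), j = seq_along(players))
  idx <- idx[idx$i != idx$j, ]  # drop self-comparisons
  data.frame(
    player = players[idx$i],
    opponent = players[idx$j],
    # 1 = player finished ahead, 0.5 = tie, 0 = player finished behind
    y = ifelse(ranks[idx$i] < ranks[idx$j], 1,
               ifelse(ranks[idx$i] == ranks[idx$j], 0.5, 0)),
    stringsAsFactors = FALSE
  )
}

event_pairs <- expand_pairs(c("a", "b", "c", "d"), c(3, 4, 1, 2))
print(event_pairs)
```

Each unordered pair appears twice (once from each side), so four players yield twelve rows.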
Dynamic Bayesian Logit
dbl_run( formula, data, r = NULL, rd = NULL, lambda = NULL, weight = NULL, kappa = 0.95, init_r = 0, init_rd = 1 )
formula |
formula which specifies the model. Unlike the other algorithms in the package (glicko_run, glicko2_run, bbt_run), this method doesn't allow players nested in teams with 'player(player | team)'; the matchup should be specified in the formula using 'player(player)'. DBL allows the user to specify multiple parameters, including interactions between them. |
data |
data.frame which contains columns specified in formula, and
optional columns defined by |
r |
named vector of initial players ratings estimates. If not specified
then |
rd |
named vector of initial rating deviation estimates. If not specified
then |
lambda |
name of the column in 'data' containing lambda values or one
constant value (eg. |
weight |
name of the column in 'data' containing weights values or
one constant (eg. |
kappa |
controls |
init_r |
initial values for |
init_rd |
initial values for |
A "rating" object is returned:
final_r
named vector containing players ratings.
final_rd
named vector containing players ratings deviations.
r
data.frame with evolution of the ratings and ratings deviations
estimated at each event.
pairs
pairwise combinations of players in analysed events with
prior probability and result of a challenge.
class
of the object.
method
type of algorithm used.
settings
arguments specified in function call.
# the simplest example
data <- data.frame(
  id = c(1, 1, 1, 1),
  name = c("A", "B", "C", "D"),
  rank = c(3, 4, 1, 2),
  gate = c(1, 2, 3, 4),
  factor1 = c("a", "a", "b", "b"),
  factor2 = c("a", "b", "a", "b")
)

dbl <- dbl_run(
  data = data,
  formula = rank | id ~ player(name)
)

dbl <- dbl_run(
  data = data,
  formula = rank | id ~ player(name) + gate * factor1
)
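To see which coefficients a term such as `gate * factor1` stands for, it can help to expand it with base R's `model.matrix()`. The sketch below only shows standard R formula expansion on the same toy data. Whether `dbl_run` builds its parameters in exactly this way (for example, with or without an intercept) is an assumption, not something stated in this manual.

```r
# Same toy data as in the example above.
data <- data.frame(
  id = c(1, 1, 1, 1),
  name = c("A", "B", "C", "D"),
  rank = c(3, 4, 1, 2),
  gate = c(1, 2, 3, 4),
  factor1 = c("a", "a", "b", "b"),
  factor2 = c("a", "b", "a", "b")
)

# Standard R expansion of the non-player part of the formula:
# main effects for gate and factor1 plus their interaction.
X <- model.matrix(~ gate * factor1, data = data)
print(colnames(X))
```

With default treatment contrasts, the character column `factor1` contributes a single dummy column for level "b" and one interaction column with `gate`.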
Glicko rating algorithm
glicko_run( data, formula, r = numeric(0), rd = numeric(0), init_r = 1500, init_rd = 350, lambda = numeric(0), share = numeric(0), weight = numeric(0), kappa = 0.5 )
data |
data.frame which contains columns specified in formula, and
optional columns defined by |
formula |
formula which specifies the model. The RHS allows only the player rating parameter, and it should be specified in the following manner:
Users can also specify the formula in a different way:
|
r |
named vector of initial players ratings estimates. If not specified
then |
rd |
named vector of initial rating deviation estimates. If not specified
then |
init_r |
initial values for |
init_rd |
initial values for |
lambda |
name of the column in 'data' containing lambda values or one
constant value (eg. |
share |
name of the column in 'data' containing player share in team efforts. It's used to first calculate combined rating of the team and then redistribute ratings update back to players level. Warning - it should be used only if formula is specified with players nested within teams ('player(player|team)'). |
weight |
name of the column in 'data' containing weights values or
one constant (eg. |
kappa |
controls |
A "rating" object is returned:
final_r
named vector containing players ratings.
final_rd
named vector containing players ratings deviations.
r
data.frame with evolution of the ratings and ratings deviations
estimated at each event.
pairs
pairwise combinations of players in analysed events with
prior probability and result of a challenge.
class
of the object.
method
type of algorithm used.
settings
arguments specified in function call.
# the simplest example
data <- data.frame(
  id = c(1, 1, 1, 1),
  team = c("A", "A", "B", "B"),
  player = c("a", "b", "c", "d"),
  rank_team = c(1, 1, 2, 2),
  rank_player = c(3, 4, 1, 2)
)

# Example from Glickman
glicko <- glicko_run(
  data = data,
  formula = rank_player | id ~ player(player),
  r = setNames(c(1500.0, 1400.0, 1550.0, 1700.0), c("a", "b", "c", "d")),
  rd = setNames(c(200.0, 30.0, 100.0, 300.0), c("a", "b", "c", "d"))
)

# nested matchup
glicko <- glicko_run(
  data = data,
  formula = rank_team | id ~ player(player | team)
)
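The initial ratings and deviations in the example above are those of the worked example in Glickman (1999). For orientation, here is a minimal base R sketch of the textbook Glicko update from that paper, applied to player "a" (who beats "b" and loses to "c" and "d"). The function names `g`, `expected` and `glicko_update` are illustrative only. This is not the package's C++ implementation, and `glicko_run` may return different values because of its additional arguments such as `kappa`, `lambda` and `weight`.

```r
# Textbook Glicko update (Glickman, 1999); illustration only.
q <- log(10) / 400

# Attenuation factor for an opponent's rating deviation.
g <- function(rd) 1 / sqrt(1 + 3 * q^2 * rd^2 / pi^2)

# Expected score against opponents with ratings r_opp and deviations rd_opp.
expected <- function(r, r_opp, rd_opp) {
  1 / (1 + 10^(-g(rd_opp) * (r - r_opp) / 400))
}

glicko_update <- function(r, rd, r_opp, rd_opp, score) {
  e <- expected(r, r_opp, rd_opp)
  d2 <- 1 / (q^2 * sum(g(rd_opp)^2 * e * (1 - e)))
  denom <- 1 / rd^2 + 1 / d2
  list(
    r = r + (q / denom) * sum(g(rd_opp) * (score - e)),
    rd = sqrt(1 / denom)
  )
}

# Player "a": wins against "b", loses to "c" and "d".
upd <- glicko_update(
  r = 1500, rd = 200,
  r_opp = c(1400, 1550, 1700),
  rd_opp = c(30, 100, 300),
  score = c(1, 0, 0)
)
print(upd)
```

Glickman's paper reports an updated rating of about 1464 and a deviation of about 151.4 for this case, which gives a quick check on the sketch.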
Glicko2 rating algorithm
glicko2_run( formula, data, r = numeric(0), rd = numeric(0), sigma = numeric(0), lambda = NULL, share = NULL, weight = NULL, init_r = 1500, init_rd = 350, init_sigma = 0.05, kappa = 0.5, tau = 0.5 )
formula |
formula which specifies the model. The RHS allows only the player rating parameter, and it should be specified in the following manner:
Users can also specify the formula in a different way:
|
data |
data.frame which contains columns specified in formula, and
optional columns defined by |
r |
named vector of initial players ratings estimates. If not specified
then |
rd |
named vector of initial rating deviation estimates. If not specified
then |
sigma |
(only for glicko2) named vector of initial players' rating
volatility estimates. If not specified then |
lambda |
name of the column in 'data' containing lambda values or one
constant value (eg. |
share |
name of the column in 'data' containing player share in team efforts. It's used to first calculate combined rating of the team and then redistribute ratings update back to players level. Warning - it should be used only if formula is specified with players nested within teams ('player(player|team)'). |
weight |
name of the column in 'data' containing weights values or
one constant (eg. |
init_r |
initial values for |
init_rd |
initial values for |
init_sigma |
initial values for |
kappa |
controls |
tau |
The system constant, which constrains the change in volatility over
time. Reasonable choices are between 0.3 and 1.2 ( |
A "rating" object is returned:
final_r
named vector containing players ratings.
final_rd
named vector containing players ratings deviations.
final_sigma
named vector containing players' rating volatilities.
r
data.frame with evolution of the ratings and ratings deviations
estimated at each event.
pairs
pairwise combinations of players in analysed events with
prior probability and result of a challenge.
class
of the object.
method
type of algorithm used.
settings
arguments specified in function call.
# the simplest example
data <- data.frame(
  id = c(1, 1, 1, 1),
  team = c("A", "A", "B", "B"),
  player = c("a", "b", "c", "d"),
  rank_team = c(1, 1, 2, 2),
  rank_player = c(3, 4, 1, 2)
)

# Example from Glickman
glicko2 <- glicko2_run(
  data = data,
  formula = rank_player | id ~ player(player),
  r = setNames(c(1500.0, 1400.0, 1550.0, 1700.0), c("a", "b", "c", "d")),
  rd = setNames(c(200.0, 30.0, 100.0, 300.0), c("a", "b", "c", "d"))
)

# nested matchup
glicko2 <- glicko2_run(
  data = data,
  formula = rank_team | id ~ player(player | team)
)
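Glicko-2 works internally on a rescaled rating scale. As a hedged illustration based on Glickman's published description of Glicko-2 (not on the package source), the sketch below converts the example's ratings and deviations to that scale and computes the estimated variance `v` for player "a". All function and variable names here are illustrative. The volatility update, which is where `tau` and `sigma` enter, is deliberately left out.

```r
# Glicko-2 scale conversion and estimated variance v; illustration only.
glicko2_scale <- 173.7178

to_mu <- function(r) (r - 1500) / glicko2_scale
to_phi <- function(rd) rd / glicko2_scale

g2 <- function(phi) 1 / sqrt(1 + 3 * phi^2 / pi^2)

expected2 <- function(mu, mu_opp, phi_opp) {
  1 / (1 + exp(-g2(phi_opp) * (mu - mu_opp)))
}

# Player "a" and the three opponents from the example above.
mu <- to_mu(1500)
phi <- to_phi(200)
mu_opp <- to_mu(c(1400, 1550, 1700))
phi_opp <- to_phi(c(30, 100, 300))

e <- expected2(mu, mu_opp, phi_opp)
v <- 1 / sum(g2(phi_opp)^2 * e * (1 - e))
print(c(mu = mu, phi = phi, v = v))
```

The expected scores `e` agree with those of the original Glicko formula, because the scale constant equals 400 / log(10).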
Actual dataset containing heat results of all Speedway Grand-Prix tournaments
gpheats
A data frame with >19000 rows and 11 variables:
event identifier
year of Grand-Prix, 1995-now
date of tournament
round in season
Tournament name
heat number, 1-23
number of gate, 1-4
rider name, string
points gained, integer
position at finish line, string
rank at finish line, integer
Actual dataset containing tournament results of all Speedway Grand-Prix events
gpsquads
A data frame with >4000 rows and 9 variables:
event identifier
year of Grand-Prix, 1995-now
date of tournament
stadium of event
round in season
Tournament name
rider names, 1-6
points gained, integer
classification after an event
Plot rating object
## S3 method for class 'rating'
plot(x, n = 10, players, ...)
x |
of class rating |
n |
number of teams to be plotted |
players |
optional vector with names of the contestants (coefficients) to plot their evolution in time. |
... |
optional arguments |
Predict rating model
## S3 method for class 'rating'
predict(object, newdata, ...)
object |
of class rating |
newdata |
data.frame with data to predict |
... |
optional arguments |
Probabilities of a player winning the challenge against each opponent, for all provided events.
glicko <- glicko_run(data = gpheats[1:16, ], formula = rank | id ~ player(rider))
predict(glicko, gpheats[17:20, ])
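For a Glicko model, a pairwise winning probability can be computed from two ratings and their deviations with the expected-outcome formula given in Glickman (1999), which pools both deviations. The sketch below is a base R illustration of that formula only. The name `win_prob` is hypothetical, and it is an assumption that `predict()` reports probabilities on this basis.

```r
# Expected outcome of player i against player j (Glickman, 1999),
# using the combined rating deviation of both players. Illustration only.
q <- log(10) / 400
g <- function(rd) 1 / sqrt(1 + 3 * q^2 * rd^2 / pi^2)

win_prob <- function(r_i, rd_i, r_j, rd_j) {
  1 / (1 + 10^(-g(sqrt(rd_i^2 + rd_j^2)) * (r_i - r_j) / 400))
}

p_ab <- win_prob(1500, 200, 1400, 30)  # stronger player's perspective
p_ba <- win_prob(1400, 30, 1500, 200)  # same pair, opposite perspective
print(c(p_ab = p_ab, p_ba = p_ba))
```

The two probabilities for a pair sum to one, and larger deviations pull the prediction towards 0.5.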
Apply rating algorithm
rating_run( method, data, formula, r = numeric(0), rd = numeric(0), sigma = numeric(0), init_r = numeric(0), init_rd = numeric(0), init_sigma = numeric(0), lambda = numeric(0), share = numeric(0), weight = numeric(0), kappa = numeric(0), tau = numeric(0) )
method |
one of |
data |
data.frame which contains columns specified in formula, and
optional columns defined by |
formula |
formula which specifies the model. The RHS allows only the player rating parameter, and it should be specified in the following manner:
Users can also specify the formula in a different way:
|
r |
named vector of initial players ratings estimates. If not specified
then |
rd |
named vector of initial rating deviation estimates. If not specified
then |
sigma |
(only for glicko2) named vector of initial players' rating
volatility estimates. If not specified then |
init_r |
initial values for |
init_rd |
initial values for |
init_sigma |
initial values for |
lambda |
name of the column in 'data' containing lambda values or one
constant value (eg. |
share |
name of the column in 'data' containing player share in team efforts. It's used to first calculate combined rating of the team and then redistribute ratings update back to players level. Warning - it should be used only if formula is specified with players nested within teams ('player(player|team)'). |
weight |
name of the column in 'data' containing weights values or
one constant (eg. |
kappa |
controls |
tau |
The system constant, which constrains the change in volatility over
time. Reasonable choices are between 0.3 and 1.2 ( |
Summarizing rating objects
Summary for object of class 'rating'
## S3 method for class 'rating'
summary(object, ...)
object |
of class rating |
... |
optional arguments |
List with following elements
formula
modeled formula.
method
type of algorithm used.
Overall Accuracy
overall prediction accuracy of the model.
r
data.frame summarizing players' ratings and model winning probabilities. Probabilities are returned only for models with one variable (ratings).
name
of a player
r
players ratings
rd
players ratings deviation
`Model probability`
mean predicted probability of winning the challenge by the player.
`True probability`
mean observed probability of winning the challenge by the player.
`Accuracy`
Accuracy of prediction.
`pairings`
number of pairwise occurrences.
model <- glicko_run(formula = rank | id ~ player(rider), data = gpheats[1:102, ])
summary(model)
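The per-player columns of the summary (mean model probability, mean observed probability, accuracy) can be pictured with a small base R aggregation over pairwise predictions. The data frame `toy`, its column names, and the rule "a prediction is correct when the side with probability above 0.5 actually won" are all assumptions made for this sketch. The manual does not state how `summary()` defines accuracy.

```r
# Hypothetical pairwise predictions: p = model probability that the player
# beats the opponent, y = observed outcome (1 = win, 0 = loss).
toy <- data.frame(
  name = c("a", "a", "b", "b", "c", "c"),
  p = c(0.7, 0.4, 0.3, 0.6, 0.6, 0.4),
  y = c(1, 0, 0, 0, 1, 1)
)

# Assumed rule: a prediction is correct when p > 0.5 coincides with a win.
toy$correct <- (toy$p > 0.5) == (toy$y == 1)

# Per-player means of predicted probability, observed outcome and correctness.
per_player <- aggregate(cbind(p, y, correct) ~ name, data = toy, FUN = mean)
overall <- mean(toy$correct)

print(per_player)
print(overall)
```

Here the `p`, `y` and `correct` columns of `per_player` play the roles of the model probability, true probability and accuracy columns, and `overall` plays the role of the overall accuracy.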