
Regret analysis of stochastic and nonstochastic multi-armed bandit problems

by Sébastien Bubeck

Members: 4 · Reviews: None · Popularity: 3,444,617 · Average rating: None · Conversations: None
A multi-armed bandit problem - or, simply, a bandit problem - is a sequential allocation problem defined by a set of actions. At each time step, a unit resource is allocated to an action and some observable payoff is obtained. The goal is to maximize the total payoff obtained in a sequence of allocations. The name bandit refers to the colloquial term for a slot machine (a "one-armed bandit" in American slang). In a casino, a sequential allocation problem is obtained when the player is facing many slot machines at once (a "multi-armed bandit"), and must repeatedly choose where to insert the next coin. Multi-armed bandit problems are the most basic examples of sequential decision problems with an exploration-exploitation trade-off. This is the balance between staying with the option that gave highest payoffs in the past and exploring new options that might give higher payoffs in the future. Although the study of bandit problems dates back to the 1930s, exploration-exploitation trade-offs arise in several modern applications, such as ad placement, website optimization, and packet routing. Mathematically, a multi-armed bandit is defined by the payoff process associated with each option. In this book, the focus is on two extreme cases in which the analysis of regret is particularly simple and elegant: independent and identically distributed payoffs and adversarial payoffs. Besides the basic setting of finitely many actions, it also analyzes some of the most important variants and extensions, such as the contextual bandit model. This monograph is an ideal reference for students and researchers with an interest in bandit problems.
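The central quantity named in the title can be stated concretely. Writing $I_t$ for the arm played at round $t$ and $X_{i,t}$ for the payoff of arm $i$ at round $t$, the regret after $n$ rounds over $K$ arms compares the player's cumulative payoff to that of the best single arm in hindsight:

    R_n = \max_{i=1,\dots,K} \sum_{t=1}^{n} X_{i,t} - \sum_{t=1}^{n} X_{I_t,t}

In the i.i.d. case, where arm $i$ has mean payoff $\mu_i$ and $\mu^* = \max_i \mu_i$, the analysis typically targets the pseudo-regret $\bar{R}_n = n\mu^* - \mathbb{E}\left[\sum_{t=1}^{n} \mu_{I_t}\right]$.

To make the exploration-exploitation trade-off concrete, here is a minimal sketch of a UCB-style index policy, the kind of strategy analyzed for stochastic (i.i.d.) bandits. The Bernoulli rewards, the arm means, and the horizon below are illustrative assumptions, not details taken from the book.

    import math
    import random

    def ucb_sketch(arm_means, horizon, seed=0):
        """UCB1-style index policy on Bernoulli arms (illustrative setup).

        arm_means are the true success probabilities, hidden from the
        player; returns the cumulative pseudo-regret after `horizon` rounds.
        """
        rng = random.Random(seed)
        k = len(arm_means)
        counts = [0] * k    # how often each arm has been played
        sums = [0.0] * k    # cumulative observed payoff per arm
        best_mean = max(arm_means)
        regret = 0.0
        for t in range(1, horizon + 1):
            if t <= k:
                arm = t - 1  # play each arm once to initialize the indices
            else:
                # index = empirical mean + exploration bonus that shrinks
                # as an arm accumulates plays
                arm = max(range(k), key=lambda i: sums[i] / counts[i]
                          + math.sqrt(2.0 * math.log(t) / counts[i]))
            payoff = 1.0 if rng.random() < arm_means[arm] else 0.0
            counts[arm] += 1
            sums[arm] += payoff
            regret += best_mean - arm_means[arm]
        return regret

    # Pseudo-regret should grow roughly logarithmically with the horizon.
    print(ucb_sketch([0.3, 0.5, 0.7], horizon=10_000))

The exploration bonus is large for rarely played arms and decays as evidence accumulates, which is exactly the balance between exploiting past payoffs and exploring alternatives described above.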
Recently added by DawnDrain

There are currently no Talk conversations about this book.

No reviews

References to this work on external resources.

Wikipedia in English: None


No library descriptions found.



