The 68th Statistical Machine Learning Seminar (Hybrid)

【Date & Time】: Thursday, July 24, 2025, 16:00 - 17:30
Admission Free
【Place】: Seminar Room 5 (3rd floor), The Institute of Statistical Mathematics
Hybrid:
To join via Zoom, please register at the following link to receive the Zoom link:
https://forms.gle/FzMwmCqgUzf8nPB18

【Speaker】

Wanteng Ma (joint work with T. Tony Cai)
(University of Pennsylvania)

【Title】

Nonparametric Contextual Bandits with Single-Indexed Rewards

【Abstract】

This work studies nonparametric contextual bandits with single-index rewards, where the expected reward of each arm is an unknown nonparametric function of a one-dimensional projection of the covariates. We first estimate this projection direction through a general approach and then apply plug-in nonparametric regression, which yields sharp estimators of the single-index reward functions and thereby alleviates the curse of dimensionality. We derive a lower bound that characterizes the fundamental regret limits of single-index bandits and propose a novel algorithm that achieves the minimax-optimal regret rate. Furthermore, we establish a general impossibility result: without additional structure, no policy can adapt to unknown smoothness levels. Nevertheless, under a standard self-similarity condition, we design a policy that remains minimax-optimal while automatically adapting to the unknown smoothness of the reward functions.
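
As a reading aid only, and not taken from the talk itself, the single-index reward structure described in the abstract might be written as follows. All notation here is assumed for illustration: $K$ arms, covariates $X_t \in \mathbb{R}^d$, a unit-norm direction $\beta$ (taken to be shared across arms, which is itself an assumption), unknown link functions $f_k$, and a policy $\pi$. The regret shown is the usual cumulative regret against the best arm for each context and may differ in detail from the definition used by the speaker.

```latex
% Assumed notation (illustrative, not from the abstract):
%   Y_t^{(k)} : reward of arm k at round t
%   X_t       : d-dimensional covariate observed at round t
%   \beta     : unknown projection direction, \|\beta\|_2 = 1
%   f_k       : unknown nonparametric reward function of arm k
\[
  \mathbb{E}\bigl[\, Y_t^{(k)} \mid X_t = x \,\bigr]
    = f_k\bigl(\beta^{\top} x\bigr),
  \qquad k = 1, \dots, K .
\]
% Cumulative regret of a policy \pi over T rounds, where \pi_t(X_t)
% denotes the arm that \pi pulls at round t:
\[
  R_T(\pi)
    = \sum_{t=1}^{T} \mathbb{E}\Bigl[\,
        \max_{1 \le k \le K} f_k\bigl(\beta^{\top} X_t\bigr)
        - f_{\pi_t(X_t)}\bigl(\beta^{\top} X_t\bigr)
      \Bigr] .
\]
```

Under this reading, each $f_k$ takes a one-dimensional argument $\beta^{\top} x$ rather than the full $d$-dimensional covariate, which is the sense in which the curse of dimensionality is alleviated.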