Shie Mannor - Publications

The following lists are often out of date or incomplete; please e-mail me for any paper that has no link. Note that, due to copyright, some links point to near-final versions or technical report versions.

Published/Accepted Journal Papers

  1. On the Existence of Weak Learners and Application to Boosting (Machine Learning Journal 48:233-255, 2002), with Ron Meir.

    2003

  2. The Empirical Bayes Envelope and Regret Minimization in Stochastic Games (Mathematics of Operations Research 28(2) 327-345, 2003), with Nahum Shimkin.
  3. Greedy Algorithms for Classification - Consistency, Convergence Rates, and Adaptivity (JMLR, 4(Oct):713-741, 2003), with Ron Meir and Tong Zhang. Online appendix.

    2004

  4. A Geometric Approach to Multi-Criterion Reinforcement Learning (JMLR, 5(Apr):325-360, 2004), with Nahum Shimkin.
  5. The Kernel Recursive Least Squares Algorithm (IEEE Trans. on Signal Processing, 52(8):2275-2285, 2004) with Yaakov Engel and Ron Meir. Online code and documentation.
  6. The Sample Complexity of Exploration in the Multi-Armed Bandit Problem (JMLR, 5(Jun):623-648, 2004) with John N. Tsitsiklis.

    2005

  7. Basis function adaptation in temporal difference reinforcement learning (Annals of Operations Research, 134:1 215-238, 2005) with Ishai Menache and Nahum Shimkin.
  8. A Tutorial on the Cross-Entropy Method (Annals of Operations Research, 134:1 19-67, 2005) with P.T. de Boer, D.P. Kroese, and R.Y. Rubinstein.
  9. On the Empirical State-Action Frequencies in Markov Decision Processes Under General Policies (Mathematics of Operations Research, 30(3):545-561, 2005), with John Tsitsiklis.
  10. Efficiency Loss in a Network Resource Allocation Game: The Case of Elastic Supply (IEEE Transactions on Automatic Control 50 (11): 1712-1724, 2005) with Ramesh Johari and John N. Tsitsiklis.

    2006

  11. Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems (JMLR 7(Jun):1079-1105, 2006) with Eyal Even-Dar and Yishay Mansour.
  12. Design of l1-Optimal Controllers with Robustness versus Performance Tradeoff (IEEE Transactions on Automatic Control 51(5):868-873, 2006) with Patrick Cadotte, Hannah Michalska, and Benoit Boulet.
  13. A Contract-Based Model for Directed Network Formation (Games and Economic Behavior 56, 201-224, 2006), with Ramesh Johari and John N. Tsitsiklis.
  14. Stochastic Decoding of LDPC Codes (IEEE Comm. Letters 10(8):716-718, 2006), with Saeed S. Tehrani and Warren J. Gross.
  15. Machine Learning for Adaptive Power Management (Intel Technology Journal, 10:4, November 2006), with Georgios Theocharous, Nilesh Shah, Prashant Gandhi, Branislav Kveton, Sajid Siddiqi and Chih-han Yu.

    2007

  16. Biases and Variance in Value Function Estimates (Management Science 53(2) 308-322, February 2007), with D. Simester, P. Sun, and J. N. Tsitsiklis.
  17. Online Calibrated Forecasts: Efficiency versus Universality for Learning in Games (Machine Learning 67(2) 77-115, 2007), with Jeff Shamma and Gurdal Arslan.
  18. An Inequality for Nearly Log-concave Distributions with Applications to Learning (IEEE Transactions on Inf. Th., 53(3) 1043-1057, 2007), with Constantine Caramanis.
  19. Efficiency of market-based resource allocation among many participants (IEEE JSAC, 25:6 1244-1259, 2007), with Jia-Yuan Yu.
  20. Multi-agent Learning for Engineers (Artificial Intelligence, 171(7): 417-422, 2007), with Jeff Shamma.

    2008

  21. Regret Minimization in Repeated Matrix Games with Variable Stage Duration (Games and Economic Behavior, in press), with Nahum Shimkin.
  22. Strategies for Prediction under Imperfect Monitoring (Mathematics of Operations Research, in press) with Gabor Lugosi and Gilles Stoltz.
  23. Approachability in Repeated Games: Computational Aspects and a Stackelberg Variant (Games and Economic Behavior, in press), with John Tsitsiklis.
  24. A Kalman Filter Design Based on the Performance/Robustness Tradeoff, (IEEE TAC, accepted), with Huan Xu.
  25. Two-Stage Myopic Dynamics in Network Formation Games, (IEEE TAC, accepted), with Esteban Arcaute and Ramesh Johari.
  26. Percentile Optimization for Markov Decision Processes with Parameter Uncertainty, (Operations Research, accepted) with Erick Delage.



Preprints

Papers that appear without a link are currently under revision. Please email me if you want a copy.

  1. On Robustness/Performance Tradeoffs in Linear Programming and Markov Decision Processes, with Huan Xu.
  2. Network Formation: Bilateral Contracting and Myopic Dynamics, with Esteban Arcaute and Ramesh Johari.
  3. Gaussian Processes for Online Least Squares Regression and Temporal Difference Learning, with Y. Engel and R. Meir.
  4. Online learning in Markov Decision Processes, with J. Yu and N. Shimkin.
  5. Fully-Parallel Stochastic LDPC Decoders, with S. S. Tehrani and W. J. Gross.
  6. Robustness, Risk, and Regularization in Support Vector Machines, with Huan Xu and Constantine Caramanis.



Peer reviewed conference papers

  1. Weak Learners and Improved Convergence Rate in Boosting (NIPS 2000, pages 280-286), with Ron Meir.

    2001

  2. Learning Embedded Maps of Markov Processes (ICML 2001, pages 138-145), with Yaakov Engel.
  3. Adaptive Strategies and Regret Minimization in arbitrarily varying Markov Environments (COLT 2001, pages 128-142), with Nahum Shimkin.
  4. Geometric Bounds for Generalization in Boosting (COLT 2001, pages 461-472), with Ron Meir.
  5. The Steering Approach for Multi-Criteria Reinforcement Learning (NIPS 2001, pages 1563-1570), with Nahum Shimkin.

    2002

  6. The consistency of Greedy Algorithms for Classification (COLT 2002, pages 319-333), with Ron Meir and Tong Zhang.
  7. PAC bounds for Multi-armed bandit and Markov Decision Processes (COLT 2002, pages 255-270), with Eyal Even-Dar and Yishay Mansour.
  8. Q-Cut - Dynamic Discovery of Sub-Goals in Reinforcement Learning (ECML 2002, pages 295-306), with Ishai Menache and Nahum Shimkin.
  9. Sparse Online Greedy Support Vector Regression (ECML 2002, pages 84-96), with Yaakov Engel and Ron Meir.

    2003

  10. On-line Learning with Imperfect Monitoring (COLT 2003, pages 552-566), with Nahum Shimkin.
  11. Lower Bounds on the Sample Complexity of Exploration in the Multi-Armed Bandit Problem (COLT 2003, pages 418-432), with John Tsitsiklis.
  12. Action Elimination and Stopping Conditions for Reinforcement Learning (ICML 2003, pages 162-169), with Eyal Even-Dar and Yishay Mansour.
  13. The Cross Entropy method for Fast Policy Search (ICML 2003, pages 512-519), with Reuven Rubinstein and Yohai Gat.
  14. Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning (ICML 2003, pages 154-161), with Yaakov Engel and Ron Meir. Winner of best student paper award.

    2004

  15. An Inequality for Nearly Log-concave Distributions with Applications to Learning (COLT 2004, pages 534-548), with Constantine Caramanis.
  16. Reinforcement Learning for Average Reward Zero-Sum Games (COLT 2004, pages 49-63).
  17. Dynamic Abstraction in Reinforcement Learning via Clustering (ICML 2004), with Ishai Menache, Amit Hoze, and Uri Klein.
  18. Bias and Variance in Value Function Estimation (ICML 2004), with Duncan Simester, Peng Sun, and John N. Tsitsiklis.

    2005

  19. Probabilistic Optimization for Energy-Efficient Broadcast in All-Wireless Networks (CISS 2005), with Fulu Li and Andrew Lippman.
  20. The cross entropy method for classification (ICML 2005), with Dori Peleg and Reuven Rubinstein.
  21. Reinforcement learning with Gaussian Processes (ICML 2005), with Yaki Engel and Ron Meir.

    2006

  22. Asymptotics of Efficiency Loss in Competitive Market Mechanisms (INFOCOM 2006), with Jia Yuan Yu.
  23. Online Learning with Constraints (COLT 2006), with John N. Tsitsiklis.
  24. Online Learning with Variable Stage Duration (COLT 2006), with Nahum Shimkin.
  25. Automatic Basis Function Construction for Approximate Dynamic Programming and Reinforcement Learning (ICML 2006), with Philipp W. Keller and Doina Precup.
  26. Trade-off of Performance and Robustness in Markov Decision Process (NIPS 2006), with Huan Xu.

    2007

  27. Strategies for prediction under imperfect monitoring (COLT 2007), with Gabor Lugosi and Gilles Stoltz.
  28. Learning-based Load Shared Sequential Routing (IFIP/TC6 Networking 2007), with Fariba Heidari and Lorne G. Mason.
  29. Adaptive Timeout Policies for Fast Fine-Grained Power Management (AAAI 2007), with B. Kveton, P. Gandhi, G. Theocharous, B. Rosario, and N. Shah.
  30. Percentile Optimization in Uncertain Markov Decision Processes with Application to Efficient Exploration (ICML 2007), with Erick Delage.
  31. Survey of Stochastic Computation on Factor Graphs (IEEE International Symposia on Multiple-Valued Logic), with S. Sharifi Tehrani and W. Gross.
  32. Dynamics and Stability in Network Formation Games with Bilateral Contracts (CDC 2007), with Esteban Arcaute, Eric Dallal and Ramesh Johari.
  33. A Kalman Filter Design Based on the Performance/Robustness Tradeoff (Allerton 2007) with Huan Xu.
  34. Non-Cooperative Design of Translucent Networks (GLOBECOM 2007) with Benoit Chatelain, Francois Gagnon and David V. Plant.
  35. Network Formation: Bilateral Contracting and Myopic Dynamics (WINE 2007) with E. Arcaute and R. Johari.

    2008

  36. A Lazy Approach to Online Learning with Constraints (10th International Symposium on Artificial Intelligence and Mathematics, 2008), with B. Kveton, J. Y. Yu, G. Theocharous.
  37. Identification in Market-Based Multi-Robot Coordination (IEEE DHMS), with A. Danak.
  38. Learning in the Limit with Adversarial Disturbances (COLT 2008), with C. Caramanis.
  39. Reinforcement learning in the presence of rare events (ICML 2008), with J. Frank and D. Precup.
  40. Online Learning with Expert Advice and Finite-Horizon Constraints (AAAI 2008), with B. Kveton, J. Y. Yu, G. Theocharous.


Ph.D. Thesis

Reinforcement Learning and Adaptation in Competitive Environments, May 2002 (Technion).


Miscellaneous

  1. On Universal Compression of Multidimensional Data Arrays Using Self Similar Curves (38th Allerton Conference on Communication, Control, and Computing, 2000), with Yitzhak Weissman.
  2. Generalized Approachability Results for Stochastic Games with a Single Reachable State (appeared in ORP3, also listed as Technical Report EE-1263, The Technion, November 2000), with Nahum Shimkin.
  3. Regret Minimization in Signal Space for Repeated Matrix Games with Partial Observation (Technical Report EE-1242, The Technion, April 2000), with Nahum Shimkin.
  4. On Finding Good State Aggregation Functions (Workshop of the International Conference on Machine Learning, 2001), with Yaakov Engel.
  5. Reinforcement learning with kernels and Gaussian processes (Workshop of the International Conference on Machine Learning, 2005), with Yaakov Engel and Ron Meir.