Thomas Coste
Noah's Ark Lab & University of Cambridge
Verified email at cam.ac.uk
Title
Cited by
Year
Reward Model Ensembles Help Mitigate Overoptimization
T Coste, U Anwar, R Kirk, D Krueger
Twelfth International Conference on Learning Representations, 2023
47
2023
Pangu-agent: A fine-tunable generalist agent with structured reasoning
F Christianos, G Papoudakis, M Zimmer, T Coste, Z Wu, J Chen, ...
arXiv preprint arXiv:2312.14878, 2023
9
2023
Bayesian Reward Models for LLM Alignment
AX Yang, M Robeyns, T Coste, J Wang, H Bou-Ammar, L Aitchison
ICLR 2024 Workshop on Secure and Trustworthy Large Language Models, 2024
7
2024