From January to February, the Alignment Research Center offered prizes for proposed algorithms for eliciting latent knowledge. In total we received 197 proposals and are awarding 32 prizes of $5k–$20k. We are also giving 24 proposals honorable mentions of $1k, for a total of $274,000.
…
Thank you to everyone who submitted proposals to the ELK proposal competition. We evaluated 30 distinct proposals from 25 people and awarded a total of $70,000 for proposals from 8 people.
…
Roughly speaking, the goal of ELK is to incentivize ML models to honestly answer “straightforward” questions where the right answer is unambiguous and known by the model. We are offering prizes of $5,000 to $50,000 for proposed strategies for ELK.
…
In this post I’ll describe some possible approaches to eliciting latent knowledge (ELK) not discussed in our report. These are basically restatements of proposals by Davidad, Rohin, Ramana, and John Maxwell. For each approach, I’ll present one or two counterexamples that I think would break it.
…
This is an archived version of the early 2022 hiring round page.
…
ARC has published a report on Eliciting Latent Knowledge, an open problem which we believe is central to alignment. We think reading this report is the clearest way to understand what problems we
…