Ideas to add on

  • What are LLMs?
  • What’s the history behind them?
  • How are they evolving now?
  • What are the implications of using LLMs?
  • How can LLMs be exploited or hijacked, and what can we (developers, creators, users, etc.) do to safeguard them?

Interesting terms

  • Catastrophic interference/forgetting — the tendency for knowledge of previously learned task(s) to be abruptly lost as information relevant to the current task is incorporated [1].
  • Knowledge distillation — a technique that transfers the learning of a large pre-trained model (“teacher model”) to a smaller one (“student model”) [2], typically at the cost of some quality relative to the original teacher model.
  • Model sycophancy — the tendency of models to match users’ stated beliefs rather than give truthful responses [3], making them appear “agreeable” sometimes without basis. This may stem from the models’ objective of producing maximally relevant, helpful-seeming responses.
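
The core idea behind knowledge distillation can be sketched in a few lines. This is a minimal illustration, not any particular library’s implementation: it assumes the common formulation where the student is trained to match the teacher’s temperature-softened output distribution via a KL-divergence loss. All function names here are illustrative.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature scaling: higher T gives softer targets."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's and student's softened outputs.

    Zero when the student reproduces the teacher's distribution exactly;
    grows as the two distributions diverge.
    """
    p = softmax(teacher_logits, temperature)  # teacher's "soft targets"
    q = softmax(student_logits, temperature)  # student's predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [4.0, 1.0, -2.0]
# A student that mimics the teacher incurs (near-)zero loss...
print(distillation_loss(teacher, [4.0, 1.0, -2.0]))
# ...while one that disagrees is penalised.
print(distillation_loss(teacher, [-2.0, 1.0, 4.0]))
```

The temperature softens both distributions so the student also learns from the teacher’s relative confidence across wrong answers (the “dark knowledge”), which is part of why some quality loss versus the teacher is accepted rather than eliminated.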

Footnotes

  1. As defined in this research paper

  2. As defined in this IBM blog post

  3. As defined in this research paper