Interpretability

Blog posts tagged Interpretability

A new study has shown that transformers can be expressed in a simple logic formalism. This finding challenges the perception that transformers are inscrutable black boxes and suggests avenues for interpreting how they work.
Ines Almeida
13.08.23 07:50 PM
DisentQA: Catching Knowledge Gaps and Avoiding Misleading Users
Building QA systems that catch knowledge gaps and avoid misleading users.
Ines Almeida
12.08.23 09:22 AM
Peeking Inside the Black Box: Uncovering What AI Models Know About Books
New research from the University of California, Berkeley sheds light on one slice of AI language models' knowledge: which books they have "read" and memorized. The study uncovers systematic biases in which texts AI systems know best.
Ines Almeida
10.08.23 08:03 AM
The Future of AI Language Models: Making Them More Interpretable and Controllable
Backpack models have an internal structure that is more interpretable and controllable than that of existing models such as BERT and GPT-3.
Ines Almeida
10.08.23 07:59 AM