Manara - Qatar Research Repository

Neuron-level Interpretation of Deep NLP Models: A Survey

Journal contribution
Submitted on 2024-04-24, 05:12 and posted on 2024-04-24, 07:44. Authored by Hassan Sajjad, Nadir Durrani, and Fahim Dalvi.

The proliferation of Deep Neural Networks in various domains has seen an increased need for interpretability of these models. Preliminary work along this line, and the papers surveying it, have focused on high-level representation analysis. However, a recent branch of work has concentrated on interpretability at a more granular level: analyzing individual neurons within these models. In this paper, we survey the work done on neuron analysis, including: i) methods to discover and understand neurons in a network; ii) evaluation methods; iii) major findings, including cross-architectural comparisons, that neuron analysis has unraveled; iv) applications of neuron probing, such as controlling the model and domain adaptation; and v) a discussion on open issues and future research directions.

Other Information

Published in: Transactions of the Association for Computational Linguistics
License: https://creativecommons.org/licenses/by/4.0/
See article on publisher's website: https://dx.doi.org/10.1162/tacl_a_00519

Language

  • English

Publisher

  • MIT Press

Publication Year

  • 2022

License statement

This item is licensed under the Creative Commons Attribution 4.0 International License.

Institution affiliated with

  • Hamad Bin Khalifa University
  • Qatar Computing Research Institute - HBKU