Scaling Transformer to 1M tokens and beyond with RMT
abs: https://arxiv.org/abs/2304.11062
github: https://github.com/booydar/t5-experiments/tree/scaling-report
Fundamental Limitations of Alignment in Large Language Models
CLaMP: Contrastive Language-Music Pre-training for Cross-Modal Symbolic Music Information Retrieval
Factored Neural Representation for Scene Understanding
Emergent and Predictable Memorization in Large Language Models