AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
paper: https://arxiv.org/abs/2304.12995
DeepFloyd IF: a novel state-of-the-art open-source text-to-image model with a high degree of photorealism and language understanding
github: https://github.com/deep-floyd/IF
Track Anything: Segment Anything Meets Videos
github: https://github.com/gaomingqi/Track-Anything
demo: https://huggingface.co/spaces/watchtowerss/Track-Anything
SEEM: Segment Everything Everywhere All at Once
github: https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once
LLaMA-Adapter: Efficient Fine-tuning of LLaMA
On fire. I’m loving these daily updates