##############################################################################
Multimodal
##############################################################################

******************************************************************************************
Diffusion
******************************************************************************************

.. important::

    * [calvinyluo.com] `Understanding Diffusion Models: A Unified Perspective `_
    * [arxiv] `Tutorial on Diffusion Models for Imaging and Vision `_
    * `Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning `_

******************************************************************************************
Tech
******************************************************************************************

* [anthropic.com]

    * `Anthropic Research `_
    * `Towards Monosemanticity: Decomposing Language Models With Dictionary Learning `_
    * `Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet `_
    * `Towards Understanding Sycophancy in Language Models `_
    * `Specific versus General Principles for Constitutional AI `_
    * `Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training `_
    * `Simple probes can catch sleeper agents `_
    * `Challenges in evaluating AI systems `_
    * `AI Governance and Accountability: An Analysis of Anthropic's Claude `_
    * `Claude `_ (`Claude on Bedrock `_)

* [mistral.ai] `Mixtral `_ (`Mixtral on Bedrock `_)
* [ai.meta.com] `Llama `_ (`Llama on Bedrock `_)
* [stability.ai] `Stable Diffusion `_ (`Stable Diffusion on Bedrock `_)
* [blog.google] `Gemini `_
* [openai.com]

    * `GPT-4o `_
    * [DALL·E]: `Creating images from text `_
    * [Sora]: `Video generation models as world simulators `_
    * `DALL·E 2 `_

******************************************************************************************
Resources
******************************************************************************************

Industry
==========================================================================================

* Amazon Science: `ML `_, `GenAI `_

    * `A quick guide to Amazon’s papers at ICML 2024 `_
    * `List of publications `_
    * `Conversational AI `_
    * `Large language models (LLMs) `_
    * `Computer vision `_
    * `Code and datasets `_

* `AWS News Blog `_:

    * `Category: Amazon Machine Learning `_
    * `Category: Generative AI `_

* Pinterest

    * `PinnerSage: Multi-Modal User Embedding Framework for Recommendations at Pinterest `_

* [research.google] `Transformers in music recommendation `_

Misc
==========================================================================================

* Yann LeCun

    * `Google Scholar Page `_
    * `Arxiv Page `_

* [neptune.ai] `6 GAN Architectures You Really Should Know `_

Paper List
==========================================================================================

.. csv-table::
    :header: "Year","Paper"
    :align: center

    2023,Align your latents: High-resolution video synthesis with latent diffusion models
    2023,Photorealistic video generation with diffusion models
    2023,"Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"
    2023,Scalable diffusion models with transformers
    2023,Improving image generation with better captions
    2022,Generating long videos of dynamic scenes
    2022,NÜWA: Visual synthesis pre-training for neural visual world creation
    2022,Imagen Video: High definition video generation with diffusion models
    2022,Masked autoencoders are scalable vision learners
    2022,High-resolution image synthesis with latent diffusion models
    2022,Elucidating the design space of diffusion-based generative models
    2022,Scaling autoregressive models for content-rich text-to-image generation
    2022,Hierarchical text-conditional image generation with CLIP latents
    2021,VideoGPT: Video generation using VQ-VAE and transformers
    2021,ViViT: A video vision transformer
    2021,Improved denoising diffusion probabilistic models
    2021,Diffusion Models Beat GANs on Image Synthesis
    2021,Zero-shot text-to-image generation
    2021,SDEdit: Guided image synthesis and editing with stochastic differential equations
    2020,Language models are few-shot learners
    2020,An image is worth 16x16 words: Transformers for image recognition at scale
    2020,Denoising diffusion probabilistic models
    2020,Generative pretraining from pixels
    2019,Adversarial video generation on complex datasets
    2018,World models
    2018,MoCoGAN: Decomposing motion and content for video generation
    2017,Recurrent environment simulators
    2017,Attention is all you need
    2016,Generating videos with scene dynamics
    2015,Unsupervised learning of video representations using LSTMs
    2015,Deep unsupervised learning using nonequilibrium thermodynamics
    2013,Auto-encoding variational Bayes