# CLIP

> CLIP is a neural network trained on diverse (image, text) pairs to learn visual concepts from natural language supervision. It can be applied to any visual classification benchmark by simply providing the names of the visual categories to be recognized.

## Pages

- [CLIP (Contrastive Language-Image Pre-Training) Overview](overview.md): CLIP is a neural network trained on diverse (image, text) pairs to learn visual concepts from natural language superv...
- [CLIP](readme.md): CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pairs. It ca...
- [Model Card: CLIP](model-card.md): Inspired by [Model Cards for Model Reporting (Mitchell et al.)](https://arxiv.org/abs/1810.03993) and [Lessons from A...
- [Interacting with CLIP](notebook-interacting-with-clip.md): This is a self-contained notebook that shows how to download and run CLIP models, calculate the similarity between ar...
- [Preparation for Colab](notebook-prompt-engineering-for-imagenet.md): Make sure you're running a GPU runtime; if not, select "GPU" as the hardware accelerator in Runtime > Change Runtime ...
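The zero-shot classification described above boils down to comparing one image embedding against one text embedding per class name and taking a softmax over the scaled cosine similarities. The sketch below illustrates that scoring step with NumPy; the embeddings are hypothetical placeholders standing in for the outputs of CLIP's image and text encoders, and the `100.0` logit scale is the temperature used by the released models.

```python
import numpy as np

def l2_normalize(v):
    # CLIP compares embeddings by cosine similarity, i.e. dot products
    # of unit-length vectors.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Hypothetical pre-computed embeddings; in real use these come from
# model.encode_image(...) and model.encode_text(...).
image_embedding = l2_normalize(np.array([0.9, 0.1, 0.2]))
class_names = ["a photo of a dog", "a photo of a cat", "a photo of a car"]
text_embeddings = l2_normalize(np.array([
    [0.8, 0.2, 0.1],  # "dog" prompt
    [0.1, 0.9, 0.3],  # "cat" prompt
    [0.2, 0.1, 0.9],  # "car" prompt
]))

# Scaled cosine similarities, then a numerically stable softmax.
logits = 100.0 * text_embeddings @ image_embedding
probs = np.exp(logits - logits.max())
probs /= probs.sum()

predicted = class_names[int(np.argmax(probs))]
```

The key point is that the "classifier" is built entirely from the class names themselves: swapping in a different list of prompts retargets the model to a new benchmark without any retraining.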