Apple releases MGIE, an AI model that understands text to edit images
Apple has unveiled an exciting new AI model called MGIE that allows users to edit images simply by providing natural language instructions. MGIE, short for MLLM-Guided Image Editing, leverages large language models to interpret text prompts and make pixel-level changes to photos. This new open-source tool represents a major advance in multimodal AI and could significantly enhance creative workflows.
MGIE is the result of a collaboration between Apple and researchers at UC Santa Barbara. The model was presented in a paper at the International Conference on Learning Representations (ICLR) 2024, a premier venue for AI research. Experiments described in the paper show that MGIE improves on automatic image editing metrics and in human evaluations, while maintaining competitive computational efficiency.
MGIE’s design supports a wide range of image editing use cases. It can handle common Photoshop-style adjustments like cropping, rotating, and filtering. The model also performs more advanced edits such as object manipulation, background replacement, and photo blending. MGIE can optimize an image globally by adjusting properties like brightness and contrast, and it can make localized edits to specific regions and objects. The system can modify visual attributes including shape, size, color, texture, and style.
MGIE isn’t accessible through an app or website the way ChatGPT is. But if you are a developer, getting started with MGIE is pretty straightforward. The code, data, and pretrained models are available in an open-source GitHub repo. The project includes a demo notebook to illustrate how MGIE enables various edits.
A live web demo on Hugging Face Spaces also lets users experiment with the model. MGIE accepts natural language instructions and outputs edited images along with the derived editing steps. Users can provide feedback to iteratively improve results. The flexible API makes MGIE easy to integrate into other applications that need image manipulation functionality.
MGIE represents an exciting leap forward for instruction-based image editing. It demonstrates the potential of using MLLMs to enhance image editing and opens up new possibilities for cross-modal interaction and communication.
