Apple Introduces AI Image Tool: Edit Photos with Just a Description


Researchers Showcase AI Model’s Proficiency in Generating Meaningful Image Enhancements.

Apple researchers have introduced an AI model called MGIE (MLLM-Guided Image Editing) that lets users describe desired changes to a photograph in plain language. By turning typed instructions directly into edits, the tool could make photo editing accessible to a much wider audience, without the menus and sliders of traditional editing software.

Developed through a collaboration between Apple and the University of California, Santa Barbara, MGIE uses machine learning to interpret a user's description and apply the corresponding edits to an image.

MGIE performs a wide range of editing tasks entirely through text prompts. It handles simple global adjustments such as cropping, resizing, flipping, and applying filters, as well as more intricate requests such as altering specific objects within a photo or changing brightness levels.

Under the hood, MGIE uses multimodal large language models (MLLMs) to interpret a user's prompt and derive the corresponding image edit. A terse request like "bluer sky," for example, leads the model to adjust the brightness of the sky region in the image, helping it interpret and execute editing instructions accurately.

For instance, asking the model to "make it more healthy" while editing an image of a pepperoni pizza prompts it to add vegetable toppings. Similarly, instructing it to "add more contrast to simulate more light" brightens a dark image, such as a photo of tigers in the Sahara.
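To make the idea concrete, here is a minimal Python sketch of the two-stage pipeline shape described above: an MLLM stage expands a terse prompt into an explicit editing instruction, and an editing stage applies it. Every function below is a hypothetical stand-in for illustration; this is not Apple's released code or API.

```python
# Minimal sketch of an MGIE-style two-stage pipeline. The functions below are
# hypothetical stand-ins for illustration only, not Apple's released code.

def derive_expressive_instruction(image_path: str, prompt: str) -> str:
    """Stage 1 (stand-in for the MLLM): look at the image and expand a terse
    prompt into an explicit editing instruction. MGIE learns this end to end;
    the lookup table here only illustrates the input/output shape."""
    examples = {
        "bluer sky": "increase the brightness and blue saturation of the sky region",
        "make it more healthy": "add vegetable toppings to the pizza",
        "add more contrast to simulate more light": "raise contrast and brightness across the frame",
    }
    return examples.get(prompt, f"apply this edit to the image: {prompt}")

def apply_edit(image_path: str, instruction: str) -> str:
    """Stage 2 (stand-in for the guided image editor): apply the expressive
    instruction to the image. Here we only report what would happen."""
    print(f"Editing {image_path!r}: {instruction}")
    return image_path.replace(".jpg", "_edited.jpg")

if __name__ == "__main__":
    expressive = derive_expressive_instruction("pizza.jpg", "make it more healthy")
    output = apply_edit("pizza.jpg", expressive)
    print(f"Result written to {output}")
```

The point of the split is that the MLLM grounds a vague instruction in what it actually sees in the image before any pixels change, which is what lets a prompt like "make it more healthy" resolve to a concrete, image-specific edit.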

In their paper, the researchers underscored MGIE's ability to discern explicit visual intentions and produce meaningful image edits. Studies across diverse editing scenarios showed improved performance without compromising efficiency, and the team anticipates the MLLM-guided framework will drive future vision-and-language research.

Apple released MGIE for download on GitHub and provided a web demo on Hugging Face Spaces. However, the company hasn’t disclosed future plans for the model beyond research purposes.
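For readers who want to try the demo programmatically, Hugging Face Spaces can generally be queried from Python with the gradio_client library. The Space id and input signature below are assumptions made for illustration; check the actual demo page for its real name and parameters.

```python
# Hypothetical: calling a Hugging Face Space demo from Python via gradio_client.
# The Space id and the (image, instruction) signature are assumptions; consult
# the real MGIE demo page for its actual id and inputs.
from gradio_client import Client

client = Client("apple/mgie")  # assumed Space id; verify on Hugging Face
result = client.predict(
    "photo.jpg",   # assumed input: path to the source image
    "bluer sky",   # assumed input: the editing instruction
)
print(result)      # typically a path or URL for the edited image
```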

While OpenAI's DALL-E 3 and Adobe's Firefly model offer comparable text-driven image capabilities, Apple's entry into generative AI underscores its push to build advanced AI features into its products. CEO Tim Cook has said the company is committed to expanding AI functionality across its devices, and recent moves include the December release of MLX, an open-source machine learning framework designed to make it easier to train AI models on Apple Silicon chips.

