Running Ollama, Llava-Phi3 on a small AWS EC2 Instance for Image Analysis

Dhaval Nagar / CEO

Ollama is a great way to run models locally, with better privacy, performance, and cost utilization. In this post we use a small EC2 instance to do simple image analysis with the custom Llava-Phi3 model.

Eventually, small language models will become ubiquitous and easy to integrate into regular applications - efficient in terms of privacy, performance, and cost.

We processed a batch of images captured at the recent AWS Summit 2024 Bengaluru with a custom fine-tuned llava-phi3 model running on an AWS EC2 t4g.large instance with just 8 GB of RAM. Yes, this is not an ideal setup for production-grade applications, but the model could still run on this low configuration, without any GPU attached, and still gave sufficiently useful output.

Llava-Phi3 Model

We used the recent Ollama llava-phi3 model for the analysis. Llava models are great for simple image analysis. We are planning to use Llava models for pre-analysis in one of our pattern recognition use cases.
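Ollama exposes a local HTTP API, so sending an image to the model is a small request. The sketch below is a minimal example, assuming the Ollama server is running on its default port (11434) and the llava-phi3 model has already been pulled; the prompt text is illustrative, not the prompt we used.

```python
import base64
import json
import urllib.request

# Default local Ollama endpoint for single-shot (non-chat) generation.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(image_bytes: bytes, prompt: str, model: str = "llava-phi3") -> dict:
    """Build a non-streaming generate request carrying one base64-encoded image."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }

def analyze_image(image_bytes: bytes, prompt: str = "Describe this image.") -> str:
    """Send the image to the local Ollama server and return the response text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(image_bytes, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With `stream` set to false, the server returns a single JSON object whose `response` field holds the full model output, which keeps the client code simple for batch jobs like this one.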

Amazon EC2 Configuration

We decided to try the smallest possible instance for experimentation. Using Amazon Linux on a t4g.large (2 vCPUs and 8 GB RAM), we were able to run Ollama, load the model, and process the images. Images were stored in S3.
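Pulling the image keys from S3 might look like the following sketch. The bucket and prefix names are placeholders, and it assumes the instance has an IAM role allowing `s3:ListBucket` and `s3:GetObject`.

```python
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")

def filter_image_keys(keys):
    """Keep only object keys that look like image files."""
    return [k for k in keys if k.lower().endswith(IMAGE_EXTENSIONS)]

def list_summit_images(bucket: str, prefix: str = ""):
    """List image keys under a prefix, paginating through all results."""
    import boto3  # AWS SDK for Python; available on the instance

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return filter_image_keys(keys)
```

The paginator matters even for a small batch: `list_objects_v2` returns at most 1,000 keys per call, so iterating pages keeps the code correct as the bucket grows.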

imgEC2Config
This costs around $50 per month

Due to this minimal configuration, model performance is very poor. It takes a couple of minutes to process each image, given that all the images are larger than 1024x1024 pixels. At the time of writing this post, Ollama still does not support parallel request processing, but for this use case we are fine with sequential processing.
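Since requests are handled one at a time anyway, the driver can be a plain sequential loop. Below is a small sketch; the `fetch` and `analyze` callables are injected (for example, an S3 download and an Ollama call), so the loop itself stays independent of any service.

```python
import time

def process_sequentially(image_keys, fetch, analyze):
    """Process images one by one, logging per-image latency.

    fetch(key) -> bytes and analyze(bytes) -> result are supplied
    by the caller, keeping this loop testable in isolation.
    """
    results = {}
    for key in image_keys:
        start = time.monotonic()
        results[key] = analyze(fetch(key))
        print(f"{key}: {time.monotonic() - start:.1f}s")
    return results
```

On a couple of minutes per image, logging per-image latency is cheap insurance: it makes it obvious when a particular image (for example, an unusually large one) is dominating the batch time.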

Output

These images are from one of the recent presentations that I gave at the AWS Summit event.

imgDemo1

...

imgDemo2

Summary

It's amazing how quickly newer models become available and accessible on a wide variety of infrastructure. This experiment was simply to understand whether we can utilize smaller models on constrained infrastructure for simple use cases.

A lot has changed in the past year, and maybe a year from now a lot more will have changed, with models running alongside our regular applications.
