Exploring Snowflake DocumentAI for Supermarket Price Tag Recognition

Introduction

As a detective of data and automation, I embarked on an exciting journey to train a model using Snowflake DocumentAI to identify product names and prices on supermarket price tags. This endeavor was driven by the need for precise extraction of relevant information from price tags, which could be useful for price tracking, competitor analysis, or automated checkout systems.

Data Collection

My journey began with a visit to a supermarket, where I captured unique pictures of price tags. To diversify the dataset, I supplemented my images with additional samples downloaded from Google. The goal was to provide a robust dataset that could help train the model to accurately identify both product names and prices.

Schema Creation

Before diving into model training, I created a schema to define the data I wanted to extract—product name and product price. This step ensured a structured approach to processing and storing recognized information.

Storage Setup

I created a new stage, which serves as storage for the documents. A stage can point to external storage, such as an Amazon S3 bucket or the equivalent service from another cloud provider, or it can be an internal stage that Snowflake manages itself. I ran into some user-experience problems with the internal option: it was less intuitive than other cloud storage tools, and the interface occasionally displayed an “internal error,” which made the process less smooth than expected.

Initial Model Performance

I uploaded an image and asked the model to extract the product name and price. The first results were far from perfect—the model often identified text such as “Promotion” or other unrelated information as the product name. Additionally, it struggled to pinpoint the correct price, sometimes confusing it with promotional offers or other numbers on the tag.

Refining the Input Data

To improve accuracy, I decided to simplify the input. Instead of using full images containing various elements, I extracted only the price tags. By leveraging libraries that detect sharp borders and colors, I was able to isolate price tags from the rest of the image. This preprocessing step significantly reduced noise in the input data.
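The cropping idea can be illustrated with a minimal sketch. The post does not name the libraries used, so this is not the original pipeline: it is a pure-Python stand-in that finds a price tag by colour thresholding on a toy grid of RGB pixels. The rule in `is_tag_color` (bright yellow means "tag"), the helper names, and the toy image are all assumptions made for illustration; a real pipeline would load photos with an image library and would likely combine colour with edge detection.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]       # (red, green, blue), each 0-255
Image = List[List[Pixel]]          # image stored as rows of pixels
Box = Tuple[int, int, int, int]    # (top, left, bottom, right); bottom/right exclusive


def is_tag_color(pixel: Pixel) -> bool:
    """Hypothetical rule: bright yellow pixels belong to a price tag."""
    red, green, blue = pixel
    return red >= 200 and green >= 200 and blue <= 120


def find_tag_box(image: Image) -> Optional[Box]:
    """Bounding box around all tag-coloured pixels, or None if there are none."""
    hits = [(y, x)
            for y, row in enumerate(image)
            for x, pixel in enumerate(row)
            if is_tag_color(pixel)]
    if not hits:
        return None
    ys = [y for y, _ in hits]
    xs = [x for _, x in hits]
    return min(ys), min(xs), max(ys) + 1, max(xs) + 1


def crop(image: Image, box: Box) -> Image:
    """Keep only the pixels inside the bounding box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


# Toy 6x8 "photo": grey shelf with a yellow tag at rows 2-3, columns 3-5.
GREY, YELLOW = (90, 90, 90), (255, 220, 40)
photo = [[YELLOW if 2 <= y <= 3 and 3 <= x <= 5 else GREY for x in range(8)]
         for y in range(6)]

box = find_tag_box(photo)
tag = crop(photo, box)
print(box, len(tag), len(tag[0]))
```

Only the cropped region would then be uploaded to the stage, so the model sees the tag without the surrounding shelf.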

Model Improvement

After training the model with 20 refined images, it reached an 80% accuracy rate in identifying product names. However, it still faced challenges with prices, often confusing discounted prices with regular ones or incorrectly identifying the lowest price in the past 30 days.

After I added another 15 images to the training set (35 in total), the model improved further. At this stage, it correctly identified product names and prices in 8 out of 10 images, a promising result.
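The post does not say exactly how these accuracy figures were counted, so here is one possible way to measure them, sketched under assumptions. It separates per-field accuracy (how often the name alone, or the price alone, is right) from full-match accuracy (how often both are right on the same image). The field names `product_name` and `product_price` and the sample records are made up for illustration and are not the project's data.

```python
from typing import Dict, List, Sequence

Record = Dict[str, str]  # one image's extracted (or expected) fields


def field_accuracy(predicted: List[Record], expected: List[Record], field: str) -> float:
    """Share of images where a single field matches the expected value."""
    if not expected:
        raise ValueError("expected must not be empty")
    matches = sum(1 for p, e in zip(predicted, expected)
                  if p.get(field) == e.get(field))
    return matches / len(expected)


def full_match_accuracy(predicted: List[Record], expected: List[Record],
                        fields: Sequence[str]) -> float:
    """Share of images where every listed field matches at once."""
    if not expected:
        raise ValueError("expected must not be empty")
    matches = sum(1 for p, e in zip(predicted, expected)
                  if all(p.get(f) == e.get(f) for f in fields))
    return matches / len(expected)


# Made-up example: four images, one wrong name and one wrong price.
expected = [
    {"product_name": "Item A", "product_price": "1.00"},
    {"product_name": "Item B", "product_price": "2.00"},
    {"product_name": "Item C", "product_price": "3.00"},
    {"product_name": "Item D", "product_price": "4.00"},
]
predicted = [
    {"product_name": "Item A", "product_price": "1.00"},
    {"product_name": "Promotion", "product_price": "2.00"},  # wrong name
    {"product_name": "Item C", "product_price": "2.49"},     # wrong price
    {"product_name": "Item D", "product_price": "4.00"},
]

name_acc = field_accuracy(predicted, expected, "product_name")
price_acc = field_accuracy(predicted, expected, "product_price")
both_acc = full_match_accuracy(predicted, expected, ["product_name", "product_price"])
print(name_acc, price_acc, both_acc)
```

Tracking the two numbers separately makes it easier to see whether new training images help with names, prices, or both.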

Target and Desired Flow

Looking ahead, an ideal solution would involve full automation:

1. When new data/images appear in storage, a request is sent to a queue.
2. Snowflake monitors the queue and triggers a query (this can also be triggered from a custom Python script).
3. Once the data is processed, the results are stored in a table along with timestamps.
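The steps above can be sketched locally in plain Python. This is a stand-in for the shape of the flow, not a Snowflake implementation: `queue.Queue` plays the role of the queue, an in-memory SQLite table plays the role of the results table, and `fake_predict` is a hypothetical placeholder for the model call. The table name `price_results` and its columns are assumptions as well.

```python
import queue
import sqlite3
from datetime import datetime, timezone
from typing import Callable, Dict


def fake_predict(file_path: str) -> Dict[str, str]:
    """Hypothetical stand-in for the model call; returns placeholder fields."""
    return {"product_name": "PLACEHOLDER", "product_price": "0.00"}


def process_queue(pending: "queue.Queue[str]", conn: sqlite3.Connection,
                  predict: Callable[[str], Dict[str, str]] = fake_predict) -> int:
    """Drain the queue, run the model on each file, store results with a timestamp."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS price_results ("
        "file_path TEXT, product_name TEXT, product_price TEXT, processed_at TEXT)"
    )
    processed = 0
    while True:
        try:
            file_path = pending.get_nowait()   # step 2: pick up the next request
        except queue.Empty:
            break
        fields = predict(file_path)
        conn.execute(                          # step 3: store result plus timestamp
            "INSERT INTO price_results VALUES (?, ?, ?, ?)",
            (file_path, fields["product_name"], fields["product_price"],
             datetime.now(timezone.utc).isoformat()),
        )
        processed += 1
    conn.commit()
    return processed


# Step 1: two new images "arrive" and are queued.
pending: "queue.Queue[str]" = queue.Queue()
pending.put("tags/a.jpg")
pending.put("tags/b.jpg")

conn = sqlite3.connect(":memory:")
print(process_queue(pending, conn))
```

In a real deployment the queue and trigger would be handled inside Snowflake or by the custom script mentioned above; the sketch only shows how each file maps to one timestamped row.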

This approach could fully automate the transformation of documents into structured table data. However, I believe it is better suited to documents with a consistent layout, such as invoices. Price tags, on the other hand, vary widely in layout, and the default model did not get beyond about 80% accuracy in my tests. In contrast, general-purpose AI models such as ChatGPT or Google Gemini had no trouble accurately identifying product names and prices in any of the test images.

Example Queries:

-- Process a single image. The second argument (4) is the model build version.

SELECT PRICES.PUBLIC.TRACK_PRICES!PREDICT(
  GET_PRESIGNED_URL(@<stage_name>, '<relative_file_path>'), 4);

-- Process every image in the stage via its directory table.

SELECT PRICES.PUBLIC.TRACK_PRICES!PREDICT(
  GET_PRESIGNED_URL(@<stage_name>, RELATIVE_PATH), 4)
FROM DIRECTORY(@<stage_name>);

These queries show how to run the trained model against staged images to extract the product name and price.
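The prediction comes back as JSON, so a small helper can pull out the fields of interest on the client side. The sketch below assumes the output shape described in Snowflake's Document AI documentation, where each field maps to a list of candidate answers with a `score` and, when an answer was found, a `value`. The field names `product_name` and `product_price`, the 0.5 threshold, and the sample payload are assumptions for illustration, not output from this project.

```python
import json
from typing import Optional


def best_value(prediction_json: str, field: str, min_score: float = 0.5) -> Optional[str]:
    """Return the highest-scoring value for a field, or None if it is missing or weak."""
    result = json.loads(prediction_json)
    # Keep only candidates that actually carry an answer.
    candidates = [c for c in result.get(field, []) if "value" in c]
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c.get("score", 0.0))
    if best.get("score", 0.0) < min_score:
        return None  # treat low-confidence answers as "not found"
    return best["value"]


# Illustrative payload only; values and scores are made up.
sample = json.dumps({
    "__documentMetadata": {"ocrScore": 0.9},
    "product_name": [{"score": 0.93, "value": "Whole Milk 1L"}],
    "product_price": [{"score": 0.31, "value": "1.99"}],
})

print(best_value(sample, "product_name"))
print(best_value(sample, "product_price"))
```

Filtering on the score is one way to flag tags where the model may have picked up a promotional number instead of the actual price, so those images can be reviewed by hand.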

While this project didn’t reach production, it was a fascinating exploration of Snowflake DocumentAI’s capabilities and limitations in text recognition within images.

Future Exploration

I plan to explore how we can bring our own model and integrate it with DocumentAI. This could offer greater flexibility and improve accuracy when dealing with less-structured documents such as price tags.