The core challenge was turning unstructured, often physical product information into structured digital data. Items arrived in the warehouse with printed specs, with labels, or with no documentation at all. Entering specs from these physical documents into the ERP by hand was time-consuming and error-prone.
- OCR Accuracy: Label formats varied widely. Most labels were engraved, sometimes not very sharp and sometimes carrying very little information, so extracting text from them required a robust OCR pipeline.
- Contextual Understanding: Specs had to be interpreted, not just read, which required a language model that understands terms such as material type and measurements. The system also had to generate all the information required for the product so that it is ready to be listed in the online store.
- ERP Integration: The final output had to be mapped accurately and sent to the ERP system with little or no manual intervention.
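
To make the three challenges above concrete, here is a minimal sketch of the data flow from raw label text to an ERP-ready record. It is an illustration under stated assumptions, not the project's actual implementation: the `ProductSpec` fields, the material list, and the ERP field names in `ERP_FIELD_MAP` are all hypothetical, the input is assumed to be text already produced by an OCR step, and simple regex rules stand in for the language model that did the interpretation in the real system.

```python
import re
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ProductSpec:
    """Hypothetical structured record for one product."""
    material: Optional[str] = None
    length_mm: Optional[float] = None
    width_mm: Optional[float] = None


# Hypothetical material vocabulary; longer names come first so that
# "stainless steel" is matched before plain "steel".
MATERIALS = ("stainless steel", "aluminium", "brass", "steel")

# Matches dimensions such as "120 x 45 mm" or "2.5x4 cm".
DIMENSION_RE = re.compile(r"(\d+(?:\.\d+)?)\s*[x×]\s*(\d+(?:\.\d+)?)\s*(mm|cm)")

# Hypothetical mapping from internal field names to ERP field names.
ERP_FIELD_MAP = {
    "material": "MaterialCode",
    "length_mm": "LengthMM",
    "width_mm": "WidthMM",
}


def parse_label(ocr_text: str) -> ProductSpec:
    """Interpret raw OCR text. Rule-based stand-in for the language model."""
    text = ocr_text.lower()
    spec = ProductSpec()
    spec.material = next((m for m in MATERIALS if m in text), None)
    match = DIMENSION_RE.search(text)
    if match:
        # Normalize everything to millimetres.
        scale = 10.0 if match.group(3) == "cm" else 1.0
        spec.length_mm = float(match.group(1)) * scale
        spec.width_mm = float(match.group(2)) * scale
    return spec


def missing_fields(spec: ProductSpec) -> list:
    """Fields the label did not provide; these would need manual review."""
    return [name for name, value in asdict(spec).items() if value is None]


def to_erp_payload(spec: ProductSpec) -> dict:
    """Map the populated fields to the (hypothetical) ERP field names."""
    return {
        ERP_FIELD_MAP[name]: value
        for name, value in asdict(spec).items()
        if value is not None
    }


spec = parse_label("STAINLESS STEEL  120 x 45 mm")
payload = to_erp_payload(spec)
```

Tracking `missing_fields` separately reflects the sparse-label problem: a record with gaps can be routed to a person instead of being pushed to the ERP incomplete.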