Top 100 Amazon Reseller AI Case Study

The client is a top 100 Amazon reseller specializing in athletic apparel and footwear, with annual revenue of approximately $25 million. Their catalog spans over 4,200 product listings across 16 brands, with Nike representing roughly 88% of their sales volume.

On a typical day, they move around 1,450 units and maintain approximately 98,000 units in stock. Their core product categories include athletic apparel (jackets, joggers, tees, hoodies, shorts), footwear (training shoes, cleats), and sports accessories, with smaller segments in outdoor gear, luggage, and home goods.

The business is highly seasonal, with Q4 holiday demand driving nearly 4x the volume of an average month. December alone accounts for over 160,000 units shipped.

Results

4,100+ ASINs tracked daily with automated data collection

1.3 million new product opportunities discovered

~81% reduction in forecast error on top products

36 months of historical marketplace data integrated

70 engineered ML features powering forecasts

0 manual intervention required, fully automated

The story

The company (redacted for privacy) is a high-volume Amazon seller managing thousands of ASINs across competitive product categories. They came to Islands with a clear goal: stop making buying decisions based on gut feel and spreadsheets. They wanted a system, backed by data and machine learning, that tells them what to buy, how much, and when.

The problem? None of that was possible with what they had. There was no structured historical database. Data was scattered across disconnected sources with gaps, no unified schema, and no automated collection. Purchasing decisions were based on manual checks and basic equations.

They needed the entire stack built from scratch: the data layer, the intelligence layer, and the decision layer. That’s what we built.

The challenge

No Data Infrastructure
Data was scattered across disconnected sources. Sales history had gaps spanning months. Inventory records were incomplete. 99.4% of products had zero recorded sales, so the system couldn’t distinguish a dead product from a data gap.

No Forecasting Capability
Without clean time-series data, there was no way to train, evaluate, or deploy a forecasting model. Purchasing was entirely manual.

No Market Visibility
They also could not tell if a new opportunity was real. Portfolio expansion was based on intuition, not intelligence.

Phase 1

Data Foundation

We designed a three-layer PostgreSQL data warehouse that pulls from two complementary sources: the Amazon SP-API, which provides seller data including orders, inventory, returns, traffic, and fees; and Keepa, parsed into 18 structured tables that feed 28 marketplace features into our models.

To keep data flowing reliably, we built custom retry logic with exponential backoff, required cooldowns between endpoint types, and resumable backfill loops that survive crashes. We also built an Active ASIN Guard that filters 31,460 ASINs down to roughly 4,174 active ones, ensuring we only spend enrichment tokens where it counts.
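As a rough illustration of the retry pattern described above, here is a minimal sketch of jittered exponential backoff. It is not the production code: `TransientError`, `backoff_delays`, `call_with_retry`, and all default values are hypothetical names and numbers chosen for the example, and the real SP-API and Keepa clients will raise their own error types.

```python
import random
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (e.g. throttling)."""


def backoff_delays(base: float, factor: float, cap: float, attempts: int) -> List[float]:
    """Upper bound of the wait before each retry: base * factor**n, capped."""
    return [min(cap, base * factor**n) for n in range(attempts)]


def call_with_retry(
    fn: Callable[[], T],
    max_retries: int = 5,
    base: float = 1.0,
    factor: float = 2.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on TransientError with jittered exponential backoff."""
    delays = backoff_delays(base, factor, cap, max_retries)
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # Full jitter: wait a random time up to the current bound so
            # parallel workers do not retry in lockstep.
            sleep(random.uniform(0, delays[attempt]))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the function testable without real waiting. A resumable backfill would typically wrap each page or date range in a call like this and persist a checkpoint after every success.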

| Metric | Before | After |
| --- | --- | --- |
| Connected data sources | | 2 (SP-API + Keepa) |
| Unified database tables | | 34 tables, 3 layers |
| SP-API history depth | ~12 months (gapped) | 18 months (complete) |
| Keepa history depth | | 36 months |
| ASINs tracked daily | | 4,174 |
| Products with usable sales data | 0.6% | 100% of active catalog |
| Manual intervention required | Constant | Zero (fully automated) |
| Active ASIN filtering | None (31,460 ASINs) | Smart guard (~4,174) |

Phase 2

ASIN Discovery & Enrichment

We built a discovery pipeline using Keepa's Finder API, combining a custom sliding window with binary search to scan full brand catalogs on Amazon and bypass the platform's 10,000-result cap. This allowed us to discover over 1.39 million ASINs across multiple brands.
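The windowing idea can be sketched as follows. This is an illustration under assumptions, not the actual pipeline: it assumes the catalog can be partitioned along some sortable integer attribute, and that a count query for a range is available. `plan_windows`, `count_in_range`, and `make_counter` are hypothetical names, and `make_counter` is an offline stand-in rather than a Keepa call.

```python
from bisect import bisect_left, bisect_right
from typing import Callable, List, Tuple

RESULT_CAP = 10_000  # per-query result cap mentioned in the text


def plan_windows(
    lo: int,
    hi: int,
    count_in_range: Callable[[int, int], int],
    cap: int = RESULT_CAP,
) -> List[Tuple[int, int]]:
    """Split [lo, hi] into inclusive windows that each hold <= cap results.

    Relies on count_in_range(start, end) being non-decreasing in end.
    If a single value alone exceeds the cap, its one-value window is
    still emitted and would need a second partitioning attribute.
    """
    windows = []
    start = lo
    while start <= hi:
        # Binary search for the largest end that keeps the window under the cap.
        left, right = start, hi
        best = start
        while left <= right:
            mid = (left + right) // 2
            if count_in_range(start, mid) <= cap:
                best = mid
                left = mid + 1
            else:
                right = mid - 1
        windows.append((start, best))
        start = best + 1  # slide the window forward
    return windows


def make_counter(keys: List[int]) -> Callable[[int, int], int]:
    """Offline stand-in for a real count query: counts keys in [a, b]."""
    ordered = sorted(keys)
    return lambda a, b: bisect_right(ordered, b) - bisect_left(ordered, a)
```

Each window costs roughly log2(range) count queries to size, which matters when every query consumes tokens. The resulting windows can then be fetched one at a time and checkpointed, so a crash only repeats the current window.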

From there, the enrichment pipeline pulls the full product payload for each ASIN and parses it into 25 structured relational tables. To date, 620,000 ASINs have been fully enriched with complete time-series history.

| Metric | After |
| --- | --- |
| New ASINs discovered | 1.39 million discovered, 620,000 enriched |
| Data points per ASIN | 25 structured tables per ASIN + 200+ variables |
| Data categories covered | 12 (pricing, demand, competition, reviews, etc.) + 2B+ rows |
| History type | Full time-series (not just snapshots) |
| Pipeline resilience | Crash-safe, resumable, token-aware |

Phase 3

Forecasting & Purchase Order Intelligence

With 91.9% of daily sales values at zero, accurate forecasting required a careful approach. We experimented with multiple model families before settling on a per-ASIN ensemble that selects the single best model for each product.

The top performers are a LightGBM two-stage model and Amazon Chronos-2. The LightGBM two-stage model pairs a binary classifier (AUC 0.807) with a Tweedie regressor across 111 features and achieves ~30% WAPE. Chronos-2, a 120M-parameter model fine-tuned with LoRA, achieves ~29% WAPE. For sparse ASINs, a dedicated binary classifier predicts 7-, 14-, and 30-day sale probability with AUC scores ranging from 0.878 to 0.889. Trained on 70 engineered features drawn from 18 months of data, the ensemble achieves 18.8% WAPE on top ASINs and 29.9% overall.
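For readers unfamiliar with the metric, the sketch below shows WAPE (weighted absolute percentage error, the sum of absolute errors divided by the sum of absolute actuals) and one plausible way to select a model per ASIN by holdout WAPE. The function names and sample numbers are illustrative assumptions; the source does not describe the selection code.

```python
from typing import Dict, List


def wape(actual: List[float], forecast: List[float]) -> float:
    """Weighted absolute percentage error: sum|a - f| / sum|a|."""
    total = sum(abs(a) for a in actual)
    if total == 0:
        # No actual sales: perfect only if the forecast is also all zero.
        return 0.0 if all(f == 0 for f in forecast) else float("inf")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / total


def pick_best_model(actual: List[float], forecasts: Dict[str, List[float]]) -> str:
    """Return the name of the model with the lowest WAPE on this ASIN."""
    return min(forecasts, key=lambda name: wape(actual, forecasts[name]))


# Made-up daily sales for one sparse ASIN (mostly zeros).
actual = [0, 0, 3, 0, 5, 0, 0, 2]
forecasts = {
    "always_zero": [0, 0, 0, 0, 0, 0, 0, 0],
    "model_a": [0, 0, 2, 0, 4, 0, 0, 2],
    "model_b": [1, 1, 3, 0, 5, 1, 0, 2],
}
best = pick_best_model(actual, forecasts)  # "model_a"
```

Note that a forecast of all zeros scores a WAPE of exactly 100%, which matches the "~100% (no forecast)" baseline in the results table. Against that baseline, 18.8% WAPE is a relative reduction of (100 - 18.8) / 100, or about 81%. That this is how the ~81% figure was computed is an inference from the published numbers, not something the source states.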

| Metric | Before | After |
| --- | --- | --- |
| Forecast accuracy (WAPE), dense ASINs | ~100% (no forecast) | 18.8% (top) / 29.9% (all) |
| Forecast horizon | None | 45 days |
| Engineered ML features | | 70 |
| Hyperparameter optimization | None | Automated (Optuna) |
| Forecast error reduction | N/A | ~81% |

Business Logic Layers

On top of the forecasting layer, we built a set of business logic systems to drive real purchasing decisions. Dump signal guardrails flag candidates for liquidation based on four converging signals: demand decline, seasonal timing, competition growth, and inventory depth.
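A guardrail of this kind can be sketched as a check that several independent signals agree before a product is flagged. Everything specific here is a hypothetical placeholder: the field names, the thresholds, and the rule that all four signals must fire are assumptions for illustration, since the source does not give them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AsinSnapshot:
    demand_trend_pct: float     # % change in forecast demand vs. prior period
    days_past_season_peak: int  # > 0 once the seasonal peak has passed
    new_competing_offers: int   # net change in competing offers
    days_of_cover: float        # on-hand units / forecast daily demand


# Hypothetical thresholds, for illustration only.
DEMAND_DROP_PCT = -20.0
NEW_OFFERS_MIN = 3
DEEP_COVER_DAYS = 90.0


def fired_signals(s: AsinSnapshot) -> List[str]:
    """List which of the four dump signals are active for this ASIN."""
    checks = {
        "demand decline": s.demand_trend_pct <= DEMAND_DROP_PCT,
        "seasonal timing": s.days_past_season_peak > 0,
        "competition growth": s.new_competing_offers >= NEW_OFFERS_MIN,
        "inventory depth": s.days_of_cover >= DEEP_COVER_DAYS,
    }
    return [name for name, active in checks.items() if active]


def is_dump_candidate(s: AsinSnapshot, min_signals: int = 4) -> bool:
    """Flag for liquidation review only when enough signals converge."""
    return len(fired_signals(s)) >= min_signals
```

Requiring agreement across signals keeps a single noisy metric, such as a one-week demand dip, from triggering a liquidation on its own.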

The automated PO engine uses Newsvendor optimization with a 45-day sell-through cap across five checkpoints. Kill switches halt orders in the event of negative margin, a lost buy box, or stale data. Final quantities are rounded to meet minimum order quantity (MOQ) rules, and budgets are allocated across the portfolio by urgency, margin, and days to stockout. Alongside this sits a market intelligence layer that scores all 1.39 million ASINs across demand, competition, and margin potential, producing a ranked list of products worth adding to the catalog.
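To make the PO logic concrete, here is a minimal single-ASIN sketch combining the textbook newsvendor quantile, kill switches, a 45-day sell-through cap, and an MOQ floor. It rests on several assumptions that are not from the source: demand is treated as normally distributed and independent across days, the 30-day review period and 24-hour staleness limit are made-up defaults, and the five checkpoints and portfolio budget allocation are not modeled. All names are hypothetical.

```python
import math
from dataclasses import dataclass
from statistics import NormalDist
from typing import Optional


@dataclass
class AsinState:
    mean_daily_demand: float  # forecast mean, units/day
    std_daily_demand: float   # forecast std dev, units/day
    unit_price: float
    unit_cost: float
    salvage_value: float      # recovery per unsold unit
    on_hand: int
    has_buy_box: bool
    data_age_hours: float
    moq: int                  # minimum order quantity


def kill_switch_reason(s: AsinState, max_age_hours: float = 24.0) -> Optional[str]:
    """Return why ordering must halt, or None if it may proceed."""
    if s.unit_price - s.unit_cost <= 0:
        return "negative margin"
    if not s.has_buy_box:
        return "lost buy box"
    if s.data_age_hours > max_age_hours:
        return "stale data"
    return None


def order_quantity(s: AsinState, review_days: int = 30, cap_days: int = 45) -> int:
    """Newsvendor target, capped by sell-through, then floored at MOQ."""
    if kill_switch_reason(s) is not None:
        return 0
    underage = s.unit_price - s.unit_cost                # profit lost per missed sale
    overage = max(s.unit_cost - s.salvage_value, 1e-9)   # loss per unsold unit
    ratio = underage / (underage + overage)              # critical ratio
    ratio = min(max(ratio, 0.01), 0.99)                  # keep inv_cdf in range
    mu = s.mean_daily_demand * review_days
    sigma = s.std_daily_demand * math.sqrt(review_days)  # independent days assumed
    target = NormalDist(mu, sigma).inv_cdf(ratio) if sigma > 0 else mu
    need = max(0, math.ceil(target) - s.on_hand)
    # Never hold more than cap_days of expected sales after the order lands.
    cap = max(0, math.floor(s.mean_daily_demand * cap_days) - s.on_hand)
    qty = min(need, cap)
    if qty <= 0:
        return 0
    if qty < s.moq:
        return s.moq if s.moq <= cap else 0  # skip if MOQ would break the cap
    return qty
```

With a forecast of 10 units/day (std dev 4), a $50 price, $30 cost, $20 salvage value, 100 units on hand, and an MOQ of 24, this sketch targets about 310 units for the review period and orders 210. The critical ratio is where margin enters the decision: higher-margin products justify ordering further above the mean forecast.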

Conclusion

When everything is wired together, the company gets a daily system that answers: what’s selling, what’s dying, what should I restock, what should I dump, how much should I buy, when does it need to arrive, and what new products should I be looking at, all backed by ML forecasts, real-time market data, and hard business constraints.

Islands built this entire system from scratch: the data warehouse, the discovery engine, the ML pipeline, and the business logic layer. We didn’t hand off requirements to a vendor or plug into a SaaS tool. We built each layer to fit this business’s constraints: strict API rate limits, token-metered enrichment that costs money, very sparse sales data, and complex buying rules.

Database: PostgreSQL
Data Sources: Amazon SP-API, Keepa API
ML Models: LightGBM, Amazon Chronos-2 with LoRA, Binary Classifier
Optimization: Optuna for hyperparameter search
Ensemble: Per-ASIN model selection
Architecture: 3-layer warehouse (Dimensions → Raw Facts → Daily Aggregates)

From fragmented data to AI-driven purchasing for a top 100 seller
