Agentic AI email costs between $22 and $130 per month in raw model inference, depending on which state-of-the-art model you run. Apply a standard 75% gross margin and add hosting, and a software company would list that product at roughly $500 per year. Google Workspace Enterprise runs $11 to $18 per month. At about $42 per month, a fully agentic email product lists at more than twice the top of that range.

Smaller models cut inference cost by 10x to 20x. Local inference on a user's own GPU drops the vendor's inference cost to zero. The real optimization play for the next 12 to 24 months is matching model to workload: deterministic rules handle filters, smaller models handle classification, and frontier models handle only what requires them. That layered approach can reduce total cost by 100x.
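
To make the layering concrete, here is a minimal sketch of a tiered router. Everything in it is an assumption for illustration: the task labels, the `pick_tier` and `total_cost` helpers, the relative cost units (the frontier tier is set to 20x the small tier, the upper end of the gap cited above), and the workload mix. None of it comes from the original piece.

```python
# Relative cost units per task. Hypothetical: the frontier tier is set to 20x
# the small tier, the upper end of the 10x-20x gap cited above.
COST_UNITS = {"rules": 0, "small": 1, "frontier": 20}

# Hypothetical task labels; a real system would need its own way to assign them.
RULE_TASKS = {"filter"}
SMALL_MODEL_TASKS = {"classify"}


def pick_tier(task: str) -> str:
    """Return the cheapest tier assumed capable of handling the task."""
    if task in RULE_TASKS:
        return "rules"
    if task in SMALL_MODEL_TASKS:
        return "small"
    # Anything not covered by rules or the small model falls through.
    return "frontier"


def total_cost(tasks: list[str], layered: bool = True) -> int:
    """Sum cost units, either routed by tier or sent entirely to the frontier tier."""
    if layered:
        return sum(COST_UNITS[pick_tier(t)] for t in tasks)
    return COST_UNITS["frontier"] * len(tasks)


# Made-up workload mix: 60 filter tasks, 35 classifications, 5 drafts.
inbox = ["filter"] * 60 + ["classify"] * 35 + ["draft"] * 5
print(total_cost(inbox, layered=False), total_cost(inbox))
```

With this invented mix, the layered total is 135 units against 2,000 for sending everything to the frontier tier, a reduction of about 15x. The size of the saving depends entirely on the workload mix and the cost ratios, so different assumptions will produce very different multiples.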

The full piece by Tomasz Tunguz is worth reading for the cost breakdown chart across specific models and for the argument that GPU scarcity makes segmentation of the inference market inevitable rather than optional. The numbers are rough, but the framing is the point.
