Amazon Bedrock randomly throttles you. QuinnyPig has talked about this extensively, and about how many times they've rug-pulled even customers with enterprise support running critical production systems.
It's a bad inference provider. Consider moving to Google or to Claude directly.
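If you're stuck on a provider that throttles, the usual mitigation is exponential backoff with jitter around every call. A minimal, provider-agnostic sketch; `flaky_call` and the `RuntimeError` are stand-ins for your actual client and its throttling exception (e.g. Bedrock's `ThrottlingException`), not real API names:

```python
import random
import time

def with_backoff(fn, max_attempts=5, base_delay=0.5):
    """Retry fn() with exponential backoff plus jitter on a throttling error."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RuntimeError:  # stand-in for your provider's throttling exception
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.05))

# Usage: a simulated call that is throttled twice, then succeeds.
calls = {"n": 0}

def flaky_call():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("throttled")
    return "ok"

result = with_backoff(flaky_call, base_delay=0.01)  # retries twice, then returns "ok"
```

Backoff helps with transient rate limits; it obviously can't help when the quota itself is set to zero.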
I'm okay paying for GPUs and am currently trying vast.ai; I just need the firepower, and I'll use it for my org the way I want, with open-source models.
I don't want to be policed, locked down, or throttled based on my usage or volume.
One flat fee, predictable pricing.
The reason is that I currently don't have the necessary infra (though I'm working toward acquiring GPUs), and I want development to continue without bottlenecks.
I was noticing token counts on Opus 4.7 that seemed at least 2x what they should have been. I wonder if they're working on a fix?
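One way to sanity-check billed counts locally is a crude characters-per-token heuristic. The ~4 chars/token figure is an assumption that holds only roughly for English prose (not the provider's real tokenizer), but it's enough to flag a 2x discrepancy:

```python
def estimate_tokens(text: str) -> int:
    """Very rough estimate: English text averages ~4 characters per token.
    This is a heuristic, not the provider's actual tokenizer."""
    return max(1, len(text) // 4)

def looks_inflated(billed: int, text: str, factor: float = 2.0) -> bool:
    """Flag a billed count that is at least `factor` times the rough estimate."""
    return billed >= factor * estimate_tokens(text)

sample = "hello world " * 100          # 1200 chars -> ~300 estimated tokens
inflated = looks_inflated(600, sample)  # 2x the estimate -> flagged
normal = looks_inflated(310, sample)    # close to the estimate -> not flagged
```

Keep in mind that system prompts, tool definitions, and cached context also count toward input tokens, so some overhead above the raw message length is expected.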
Insane for a company to pull this crap on paying customers with production workflows.
Any alternative to bedrock?
AWS's GPU offerings are distinct from Bedrock. This thread is specifically about AWS Bedrock and about using an inference provider that gives access to Claude.
If you just want GPUs, talk capacity with an account manager at your favorite hyperscaler.
They quietly changed the Opus 4.7 quota to 0. I can no longer make even a single request.
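You can confirm a zeroed quota yourself via AWS Service Quotas (the `service-quotas` boto3 client and its `list_service_quotas(ServiceCode="bedrock")` call are real APIs; the exact quota names per model vary, so the sample entries below are illustrative, not real values). A sketch that scans the response shape for suspicious quotas:

```python
def find_quotas(quotas, name_substring):
    """Map QuotaName -> Value for entries whose name mentions name_substring.

    `quotas` is the "Quotas" list returned by the Service Quotas API, e.g.
    boto3.client("service-quotas").list_service_quotas(ServiceCode="bedrock")["Quotas"]
    (paginate for the full list)."""
    needle = name_substring.lower()
    return {q["QuotaName"]: q["Value"]
            for q in quotas if needle in q["QuotaName"].lower()}

# Sample response fragment (illustrative names and values, not real quotas):
sample = [
    {"QuotaName": "Requests per minute for Model X", "Value": 0.0},
    {"QuotaName": "Tokens per minute for Model Y", "Value": 100000.0},
]
zeroed = find_quotas(sample, "model x")  # {"Requests per minute for Model X": 0.0}
```

If the applied value really is 0 while the default is nonzero, that's your evidence for a support ticket.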