I wish someone would elaborate on what they were doing and observed since Jan on opus 4.6. I’ve been using it with 1m context on max thinking since it was released - as a software engineer to write most of my code, code reviews + research and explain unfamiliar code - and haven’t notice a degradation. I’ve seen this mentioned a lot though.
I have seen that codex -latest highest effort - will find some important edge cases that opus 4.6 overlooked when I ask both of them to review my PRs.
I don't use it for coding, but I do use it for real world tasks like general assistant.
I did notice multiple times context rot even in pretty short convos, it trying to overachie and do everything before even asking for my input and forgetting basic instructions (For example I have to "always default to military slang" in my prompt, and it's been forgetting it often, even though it worked fine before)
But degrading a model right before a new release is not the way to go.