With LLMs, it's impossible to tell whether a single statement in isolation is a hallucination or not.
It's better for the model to aggregate the information and then provide the sources, so you can check whether any given deduction is actually well supported.
I guess with a few more iterations you could have a second agent verify whether each deduction is justified by those sources, but that verification step will carry its own non-trivial error rate too.
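Something like this rough sketch is what I have in mind for the verifier pass; `call_llm` is just a placeholder for whatever model client you'd actually use, not a real library function:

```python
# Hypothetical sketch of the "second agent as verifier" idea: instead of
# trusting a generated deduction outright, hand it back to a model together
# with the source passages it claims to rest on, and ask only whether those
# sources support it.

from dataclasses import dataclass


@dataclass
class Deduction:
    claim: str          # the statement the first agent produced
    sources: list[str]  # the passages it cites as support


def call_llm(prompt: str) -> str:
    """Placeholder: swap in your actual model call. Assumed to return
    'SUPPORTED' or 'NOT SUPPORTED' followed by a short reason."""
    raise NotImplementedError


def verify(deduction: Deduction) -> tuple[bool, str]:
    """Ask a second model whether the cited sources support the claim.
    The verifier can itself be wrong, so treat the result as a flag for
    human review rather than ground truth."""
    context = "\n\n".join(deduction.sources)
    prompt = (
        "Given ONLY the source passages below, say whether the claim is "
        "supported. Answer 'SUPPORTED' or 'NOT SUPPORTED', then one sentence why.\n\n"
        f"Sources:\n{context}\n\nClaim: {deduction.claim}"
    )
    answer = call_llm(prompt)
    return answer.strip().upper().startswith("SUPPORTED"), answer


# Usage: run each deduction through the verifier and surface the sources
# alongside the verdict, so a human can do the final check.
# ok, explanation = verify(Deduction(claim="...", sources=["..."]))
```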