Don’t chicken/egg this. All of the training data was man-made at some point, until the first LLMs started producing output based on it.
Secondly, the amount of human-produced content in the training data vastly outweighs the amount of LLM-produced content, and it will continue to. Otherwise the models break.
You probably live in a different world than I do.