

ok first dumb question, about the block of output that you had below this line:
Given a query of “write a json schema to represent a comment thread on a social media site like reddit”, it’ll do this bit of reasoning:
Was this an actual output from an LLM or a hypothetical example that you wrote up? It’s not quite clear to me. It’s a lot of output, but I don’t want to insult you if you wrote all that yourself.
First, each comment has an ID, author info, content, timestamps, votes, and replies. The replies are nested comments, forming a tree structure. So the schema should allow for nested objects within each comment’s replies.
I ask because I really want to nitpick the hell out of this design decision:
First, each comment has an ID, author info, content, timestamps, votes, and replies. The replies are nested comments, forming a tree structure. So the schema should allow for nested objects within each comment’s replies.
Adding the replies as full items is going to absolutely murder performance. A better scheme would be for replies to be a list/array of IDs or URLs, or a single URL to an API call that enumerates all the replies, instead of embedding every reply directly. Depending on the implementation, you could easily end up doing the classic N+1 query that a lot of web applications fall for.
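To make that concrete, here's a rough sketch of what I mean by replies-as-references. Everything in it is made up for illustration: the `Comment` class, the `reply_ids` field, the `/comments/{id}/replies` URL pattern, and the in-memory `store` dict standing in for a database. It's one possible shape, not a claim about how any real site does it.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    # Hypothetical fields; a real schema would also carry timestamps, votes, etc.
    id: str
    author: str
    content: str
    # Replies are stored as references (IDs), not as embedded Comment objects.
    reply_ids: list[str] = field(default_factory=list)


def to_payload(comment: Comment) -> dict:
    """Serialize one comment without embedding its replies."""
    return {
        "id": comment.id,
        "author": comment.author,
        "content": comment.content,
        "reply_ids": list(comment.reply_ids),
        # Assumed URL pattern for an endpoint that enumerates the replies.
        "replies_url": f"/comments/{comment.id}/replies",
    }


def fetch_replies(store: dict[str, Comment], parent: Comment) -> list[Comment]:
    """Resolve one level of replies in a single pass over the parent's IDs.

    With a real database this would be one batched lookup
    (e.g. a single WHERE id IN (...) query) instead of one query per reply.
    """
    return [store[rid] for rid in parent.reply_ids if rid in store]
```

The payload for a comment stays the same size no matter how deep the thread goes, and the client only pays for the replies it actually expands.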
But then again at this point I’m arguing with an LLM which is generating absolutely dogshit code.
very true