# LLM Development Guide

## Code Style Guidelines

- **Package Management**: ALWAYS use `uv` for all Python package management (install, run, sync, etc.). Never use pip, poetry, or other package managers.
- **Imports**: Group standard library, third-party, and local imports with a blank line between groups
- **Formatting**: Use Ruff with 88 character line length
- **Types**: Use type annotations everywhere; import types from typing module
- **Naming**: Use snake_case for variables/functions, PascalCase for classes, UPPER_CASE for constants
- **Error Handling**: Use specific exceptions with meaningful error messages
- **Documentation**: Use docstrings for all public functions, classes, and methods
- **Logging**: Use the structured logging module; avoid print statements
- **Async**: Use async/await for non-blocking operations, especially in FastAPI endpoints
- **Configuration**: Use environment variables with YAML for configuration
- **Requirements**: Use the most up-to-date versions of dependencies unless specifically instructed not to

## Process Guide

Always start by taking a look at the project. Any .md file in the top level directory is important context.

Generally, we will use `PLAN.md` as our "current state" scratchpad. We'll keep notes for ourselves there that include any context we'll need for future steps, including:

* Any design principles or core ideas we landed on that might be relevant later on
* A summary of where we are in our overall development cycle
* What we imagine our next steps will be

In total, PLAN.md should never exceed ~100 lines or so. A human being should be able to read & understand it in under 5 minutes.

### Updating PLAN.md

* If we need to update a PLAN, it often makes sense to ask the user clarifying questions before starting. Explain how you intend to update the PLAN document before actually doing so. Ask the user for feedback!
* PLANs should be as test-driven as possible.
  We will focus on Component/Acceptance tests, at the interface boundary between components. No unit tests! And we'll generally use the Red-Green-Refactor method of Test-Driven Development. More on this in the next section.
* Sometimes a goal within a PLAN cannot be encapsulated within an automated test. In those cases, we'll still want to have some kind of validation step after a unit of work. This may be running a command and seeing that the result is as expected, or asking the user to confirm that a frontend looks the way it should.
* A given goal within a plan should be a logical unit of work that would take a senior developer 4-8 hours to implement.
* The final output of a goal in our PLAN should typically be under a thousand lines of code and make sense as a single PR.
* Always stop and ask clarifying questions after updating `PLAN.md`. You should not move on to implementing the plan until you have gotten an explicit go-ahead instruction from the user.

### Testing

Coming up with a good test is **always the first step** of any plan. Tests should validate that the behaviour of a module is correct, by interacting with its interface. If the module's internals change, the tests should still pass!

Tests should generally follow this pattern (pseudocode):

    self.set_up()
    example_input = {'example': 'input'}
    self.assert_property(self.get('/v1/api/endpoint', example_input))

Not every interface will be a REST API, and not every test will check a directly measurable property of the output like this. But in all cases, the test should include a setup, a call to an interface, and a test of some assumption we will make about calling that interface.

We should NEVER USE MOCKS. Interfaces should be as pure-functional as possible. Use dependency injection. If a function has a side-effect, then *the side-effect should be in the name of the function, and its main purpose* if at all possible.

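
The no-mocks, dependency-injection, and side-effect-naming rules above can be sketched roughly as follows. This is only an illustration under stated assumptions: `GreetingStore`, `InMemoryGreetingStore`, `build_greeting`, and `save_greeting` are hypothetical names invented for the example, not part of any real project. The test injects a small working in-memory implementation instead of a mock, and follows the setup / call / assert pattern.

```python
import unittest
from typing import List, Protocol


# All names below are hypothetical and exist only to illustrate the pattern.
class GreetingStore(Protocol):
    """Interface that any injected store must satisfy."""

    def append(self, greeting: str) -> None: ...


class InMemoryGreetingStore:
    """Working in-memory store: a real implementation, not a mock."""

    def __init__(self) -> None:
        self.greetings: List[str] = []

    def append(self, greeting: str) -> None:
        self.greetings.append(greeting)


def build_greeting(name: str) -> str:
    """Pure function: the same input always gives the same output."""
    return f"Hello, {name}!"


def save_greeting(store: GreetingStore, name: str) -> None:
    """The side effect (saving) is the main purpose and is in the name."""
    store.append(build_greeting(name))


class SaveGreetingTest(unittest.TestCase):
    def setUp(self) -> None:
        # Setup: inject a real in-memory dependency.
        self.store = InMemoryGreetingStore()

    def test_saved_greeting_addresses_the_user(self) -> None:
        # Call the interface, then check a property of the result.
        save_greeting(self.store, "Ada")
        self.assertEqual(self.store.greetings, ["Hello, Ada!"])
```

Because the store is passed in as an argument, the same `save_greeting` function could plausibly be given a database-backed store in production without the test changing.
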
**At the beginning of any new goal, ask the user questions until you understand what tests you should write.** Then write the tests and ask the user if they match their expectations.

### Code style

**CRITICAL RULES - NEVER VIOLATE THESE:**

* **KISS (Keep It Simple, Stupid)**: Always choose the simplest solution that solves the problem. Avoid clever tricks, complex abstractions, and over-engineering.
* **Readable & Production-Ready**: All code must be immediately readable by any developer and production-ready. No shortcuts, no "TODO" comments, no placeholder implementations.
* **NO BOILERPLATE**: Do not write any boilerplate code. Do not create classes, functions, or structures "for future use." Write only what is needed right now for the current goal.
* **NO COMMENTS**: Code must be self-documenting through clear naming and simple structure. Comments are forbidden (except docstrings for public APIs). If you think you need a comment, refactor the code to be clearer instead.

**Additional Guidelines:**

Write the minimum code needed to accomplish a task. Do not build for untested contingencies. If you believe there is an edge case or contingency that needs to be accommodated in your code, that case should be explicitly tested.

Functions should clearly identify their purpose through their names and signatures. Simple is better than complex, explicit is better than implicit, and boring code is better than clever code. No magic; don't abuse metaprogramming. When I read code, its behaviour should be obvious to me.

Tests should effectively serve as documentation. By reading the tests in a pull request, I should be able to infer the PR's primary purpose.
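
As a rough sketch of the edge-case rule above, the example below pairs each handled edge case with an explicit test, so the tests read as documentation of the behaviour. Everything here is an assumption made for illustration: the `SERVICE_PORT` variable, `InvalidPortError`, and `read_port` are hypothetical names, and the port range is just a convenient example.

```python
import unittest
from typing import Mapping

# Hypothetical names for illustration: SERVICE_PORT, InvalidPortError, read_port.
PORT_VARIABLE = "SERVICE_PORT"
LOWEST_PORT = 1
HIGHEST_PORT = 65535


class InvalidPortError(ValueError):
    """Raised when the configured port is not an integer in the valid range."""


def read_port(environment: Mapping[str, str]) -> int:
    """Return the port configured in the given environment mapping."""
    raw_port = environment[PORT_VARIABLE]
    try:
        port = int(raw_port)
    except ValueError as error:
        raise InvalidPortError(f"Port must be an integer, got {raw_port!r}") from error
    if not LOWEST_PORT <= port <= HIGHEST_PORT:
        raise InvalidPortError(
            f"Port must be in {LOWEST_PORT}..{HIGHEST_PORT}, got {port}"
        )
    return port


class ReadPortTest(unittest.TestCase):
    def test_configured_port_is_returned_as_integer(self) -> None:
        self.assertEqual(read_port({PORT_VARIABLE: "8080"}), 8080)

    def test_non_numeric_port_is_rejected(self) -> None:
        with self.assertRaises(InvalidPortError):
            read_port({PORT_VARIABLE: "http"})

    def test_port_above_valid_range_is_rejected(self) -> None:
        with self.assertRaises(InvalidPortError):
            read_port({PORT_VARIABLE: "70000"})
```

The environment is injected as a plain mapping, so production code could pass `os.environ` while tests pass a dictionary. A missing variable is deliberately left unhandled in this sketch, since no test asks for that behaviour.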