# LLM Development Guide

## Code Style Guidelines
- **Package Management**: ALWAYS use `uv` for all Python package management (install, run, sync, etc.). Never use pip, poetry, or other package managers.
- **Imports**: Group standard library, third-party, and local imports, with a blank line between groups
- **Formatting**: Use Ruff with an 88-character line length
- **Types**: Use type annotations everywhere; import types from the `typing` module
- **Naming**: Use snake_case for variables and functions, PascalCase for classes, UPPER_CASE for constants
- **Error Handling**: Use specific exceptions with meaningful error messages
- **Documentation**: Use docstrings for all public functions, classes, and methods
- **Logging**: Use the structured logging module; avoid print statements
- **Async**: Use async/await for non-blocking operations, especially in FastAPI endpoints
- **Configuration**: Use environment variables with YAML for configuration
- **Requirements**: Use the most up-to-date versions of dependencies unless specifically instructed not to
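
As a rough illustration of several of the rules above (type annotations, naming, specific exceptions, docstrings, logging instead of print, environment-based configuration), here is a minimal sketch. The names `ServiceSettings`, `load_service_settings`, `MissingSettingError`, and the `SERVICE_NAME` and `RETRY_COUNT` variables are hypothetical, and the standard-library `logging` module only stands in for whatever structured logger the project actually uses.

```python
"""Illustrative sketch of the style rules; every name here is hypothetical."""

import logging
from dataclasses import dataclass
from typing import Mapping

DEFAULT_SERVICE_NAME = "example-service"

logger = logging.getLogger(__name__)


class MissingSettingError(LookupError):
    """Raised when a required environment variable is not set."""


@dataclass(frozen=True)
class ServiceSettings:
    """Settings read from environment variables."""

    service_name: str
    retry_count: int


def load_service_settings(environment: Mapping[str, str]) -> ServiceSettings:
    """Build settings from a mapping such as ``os.environ``."""
    if "RETRY_COUNT" not in environment:
        raise MissingSettingError("RETRY_COUNT must be set in the environment")
    settings = ServiceSettings(
        service_name=environment.get("SERVICE_NAME", DEFAULT_SERVICE_NAME),
        retry_count=int(environment["RETRY_COUNT"]),
    )
    logger.info("Loaded settings for %s", settings.service_name)
    return settings
```

Passing the environment in as an argument, for example `load_service_settings(os.environ)`, keeps the function free of hidden inputs and easy to test.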
## Process Guide

Always start by reviewing the project. Any `.md` file in the top-level directory is important context.

Generally, we will use `PLAN.md` as our "current state" scratchpad. We'll keep notes for ourselves there that include any context we'll need for future steps, including:

* Any design principles or core ideas we landed on that might be relevant later on
* A summary of where we are in our overall development cycle
* What we imagine our next steps will be

`PLAN.md` should never exceed roughly 100 lines. A human being should be able to read and understand it in under 5 minutes.
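
If it helps to make the size limit checkable, a small pure function along these lines could be used. This is only a sketch: the name `plan_fits_line_limit` is hypothetical, and treating the limit as exactly 100 lines is an assumption, since the guideline above is deliberately approximate.

```python
"""Illustrative sketch; the function name and the exact limit are assumptions."""

PLAN_LINE_LIMIT = 100


def plan_fits_line_limit(plan_text: str, line_limit: int = PLAN_LINE_LIMIT) -> bool:
    """Return True when the plan text has at most ``line_limit`` lines."""
    return len(plan_text.splitlines()) <= line_limit
```

It could be called with the file contents, for example `plan_fits_line_limit(Path("PLAN.md").read_text())`.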
### Updating PLAN.md

* If we need to update a PLAN, it often makes sense to ask the user clarifying questions before starting. Explain how you intend to update the PLAN document before actually doing so. Ask the user for feedback!

* PLANs should be as test-driven as possible. We will focus on component and acceptance tests at the interface boundary between components. No unit tests! We'll generally use the Red-Green-Refactor method of test-driven development. More on this in the next section.

* Sometimes a goal within a PLAN cannot be encapsulated within an automated test. In those cases, we'll still want some kind of validation step after a unit of work. This may mean running a command and checking that the result is as expected, or asking the user to confirm that a frontend looks the way it should.

* A given goal within a plan should be a logical unit of work that would take a senior developer 4-8 hours to implement.

* The final output of a goal in our PLAN should typically be under a thousand lines of code and make sense as a single PR.
* Always stop and ask clarifying questions after updating `PLAN.md`. Do not move on to implementing the plan until you have an explicit go-ahead instruction from the user.
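
For goals that can only be validated by running a command, the check might look something like the sketch below. The helper name `run_command_and_capture_output` and the `VALIDATION_COMMAND` shown are hypothetical; a real plan would substitute the project's own command and expected result.

```python
"""Illustrative sketch; the helper name and example command are hypothetical."""

import subprocess
import sys
from typing import Sequence


def run_command_and_capture_output(command: Sequence[str]) -> str:
    """Run a command, raising CalledProcessError on a non-zero exit code."""
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout.strip()


VALIDATION_COMMAND = [sys.executable, "-c", "print('migration applied')"]
```

The validation step is then a comparison of `run_command_and_capture_output(VALIDATION_COMMAND)` against the expected text. The side effect (running a process) is named in the function, in line with the interface guidance in this guide.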
### Testing

Coming up with a good test is **always the first step** of any plan. Tests should validate that the behaviour of a module is correct by interacting with its interface. If the module's internals change, the tests should still pass!

Tests should generally follow this pattern (pseudocode):

    self.set_up()
    example_input = {'example': 'input'}
    self.assert_property(self.get('/v1/api/endpoint', example_input))

Not every interface will be a REST API, and not every test will check a directly measurable property of the output like this.

But in all cases, the test should include a setup, a call to an interface, and a check of some assumption we make about calling that interface.

We should NEVER USE MOCKS.

Interfaces should be as pure-functional as possible. Use dependency injection. If a function has a side effect, then *the side effect should be in the name of the function, and be its main purpose*, if at all possible.

**At the beginning of any new goal, ask the user questions until you understand what tests you should write.** Then write the tests and ask the user if they match their expectations.
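
The setup / call / check pattern and the dependency-injection guidance above could be sketched roughly as follows, using only the standard library. Everything named here (`build_greeting`, `write_greeting_to_file`, `nine_in_the_morning`, `GreetingTest`) is hypothetical. The clock is injected as a plain function rather than a mock object, and the side effect is exercised against a real temporary directory.

```python
"""Illustrative sketch; the greeting functions and all names are hypothetical."""

import tempfile
import unittest
from datetime import datetime
from pathlib import Path
from typing import Callable


def build_greeting(name: str, current_time: Callable[[], datetime]) -> str:
    """Return a greeting for the time of day reported by the injected clock."""
    period = "morning" if current_time().hour < 12 else "afternoon"
    return f"Good {period}, {name}"


def write_greeting_to_file(greeting: str, destination: Path) -> None:
    """Write the greeting to ``destination``, replacing any existing content."""
    destination.write_text(greeting)


def nine_in_the_morning() -> datetime:
    """Fixed clock injected by the tests."""
    return datetime(2024, 1, 1, 9)


class GreetingTest(unittest.TestCase):
    """Component test that only touches the public interface."""

    def setUp(self) -> None:
        temporary_directory = tempfile.TemporaryDirectory()
        self.addCleanup(temporary_directory.cleanup)
        self.destination = Path(temporary_directory.name) / "greeting.txt"

    def test_morning_greeting_is_written_to_file(self) -> None:
        greeting = build_greeting("Ada", nine_in_the_morning)
        write_greeting_to_file(greeting, self.destination)
        self.assertEqual(self.destination.read_text(), "Good morning, Ada")
```

Because the test only depends on inputs and observable results, the internals of `build_greeting` can change without the test needing to change.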
### Code style

**CRITICAL RULES - NEVER VIOLATE THESE:**

* **KISS (Keep It Simple, Stupid)**: Always choose the simplest solution that solves the problem. Avoid clever tricks, complex abstractions, and over-engineering.

* **Readable & Production-Ready**: All code must be immediately readable by any developer and production-ready. No shortcuts, no "TODO" comments, no placeholder implementations.

* **NO BOILERPLATE**: Do not write any boilerplate code. Do not create classes, functions, or structures "for future use." Write only what is needed right now for the current goal.

* **NO COMMENTS**: Code must be self-documenting through clear naming and simple structure. Comments are forbidden (except docstrings for public APIs). If you think you need a comment, refactor the code to be clearer instead.

**Additional Guidelines:**

Write the minimum code needed to accomplish a task.

Do not build for untested contingencies. If you believe there is an edge case or contingency that needs to be accommodated in your code, that case should be explicitly tested.

Functions should clearly identify their purpose through their names and signatures.

Simple is better than complex, explicit is better than implicit, and boring code is better than clever code.

No magic; don't abuse metaprogramming. When I read code, its behaviour should be obvious to me.

Tests should effectively serve as documentation. By reading the tests in a pull request, I should be able to infer the PR's primary purpose.
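
As one possible reading of these guidelines, the sketch below lets a named constant and a descriptive function name carry the meaning a comment would otherwise carry, and covers the boundary case with its own explicit test. The free-shipping rule, the threshold value, and all names are hypothetical.

```python
"""Illustrative sketch; the shipping rule and every name in it are hypothetical."""

import unittest

FREE_SHIPPING_THRESHOLD_CENTS = 5000


def qualifies_for_free_shipping(order_total_cents: int) -> bool:
    """Return True when the order total reaches the free-shipping threshold."""
    return order_total_cents >= FREE_SHIPPING_THRESHOLD_CENTS


class FreeShippingTest(unittest.TestCase):
    """Documents the rule, including the boundary case the code handles."""

    def test_order_below_threshold_pays_shipping(self) -> None:
        self.assertFalse(qualifies_for_free_shipping(4999))

    def test_order_exactly_at_threshold_ships_free(self) -> None:
        self.assertTrue(qualifies_for_free_shipping(5000))
```

Reading the two test names alone is enough to infer what the change is for, which is the sense in which tests act as documentation.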