Developer Guide
Table of Contents
Overview
The GABM developer guide provides guidance for developers, some of whom are also maintainers.
Please follow the Developer Quick Start Guide to get set up to contribute as a developer.
In the rest of the document “you” means you as a GABM developer.
Contributing and Communicating
Please follow the CODE_OF_CONDUCT.md.
For the time being, please communicate by commenting on or raising new GABM Repository Issues.
You should have forked the GABM Repository to your own GitHub account.
The general workflow for contributing is to:
Create and check out new local branch:
git branch my_feature
git checkout my_feature
Make changes:
For each Python file edited, please adjust the metadata at the top of the file by:
Adding your details to the __author__ list
Incrementing the __version__
Update the Change Log with a summary of your changes.
Commit changes with a clear message. For example:
git add .
git commit -m "Clear message that explains changes."
Ensure tests pass and documentation builds before submitting a PR:
make test
make docs
Push to your fork. For example:
git push origin my_feature
Open a PR on GitHub to merge your my_feature branch into the main branch of the upstream GABM Repository. Please refer to any related issues in the PR comments.
The PR will be reviewed and once the review is complete, changes will be merged.
Project Directories
The root project directory contains documentation and files needed for building and deploying. The sub-directories include:
data/: For data, including log files
dist/: Output directory for built distributions
docs/: Documentation
scripts/: Utility scripts
src/: Python source code
tests/: Test suite for src
venv-build-test/: For temporary virtual environments created for testing
Testing
GABM uses pytest for unit and integration tests. The unit test suite is located in the tests/ directory, mirroring the module structure.
Running Tests
Run most tests with:
make test
or:
pytest
By default, tests marked as @pytest.mark.slow are excluded (see pytest.ini). Run slow tests with:
pytest -m slow
Adding Tests
Please add or update tests when modifying or adding features. Aim for high test coverage to catch regressions and ensure code reliability.
Mark slow or resource-intensive tests with @pytest.mark.slow and set a timeout if needed (e.g., @pytest.mark.timeout()).
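A marked slow test might look like the following sketch. The test body and the timeout value are illustrative; @pytest.mark.timeout requires the pytest-timeout plugin.

```python
import time

import pytest


@pytest.mark.slow
@pytest.mark.timeout(60)  # fail the test if it runs longer than 60 seconds
def test_long_running_inference():
    # Placeholder for a resource-intensive check; excluded from the
    # default test run because of the slow marker (see pytest.ini).
    time.sleep(0.1)
    assert True
```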
Local LLM Tests
Tests for local LLMs (e.g., Apertus) are marked as slow and excluded by default as these tests require significant hardware resources (GPU recommended). On CPU-only machines, inference may be extremely slow or impractical.
To run local LLM tests, ensure your environment is suitable and use:
pytest -m slow
Test Workflow
PRs should pass the test workflow before being merged.
See .github/workflows/test.yml for Continuous Integration (CI) details.
Useful pytest markers
@pytest.mark.slow: Marks tests as slow; skipped by default.
@pytest.mark.timeout(seconds): Fails a test if it exceeds the given duration.
Note on DeprecationWarnings
You may see DeprecationWarnings related to protobuf (e.g., Type google._upb._message.MessageMapContainer uses PyType_Spec with a metaclass that has custom tp_new).
These warnings originate from upstream libraries (e.g., Google APIs) and do not affect GABM functionality. They are expected to be resolved in future library updates. Please ignore them unless they cause test failures or runtime errors.
Python Package Entry Point
The main entry point for the GABM package is src/gabm/__main__.py. This allows the application to be run using:
python3 -m gabm
This approach is preferred over directly running a script (like run.py) as:
It enables Python to treat gabm as a package, ensuring imports work correctly.
It is the standard way to provide a command-line entry point for Python packages.
It makes the project ready for distribution and installation.
The __main__.py file is executed when you run python3 -m gabm from the project root (with src on the Python path). If you need to add or change the main application logic, edit src/gabm/__main__.py.
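A minimal __main__.py follows the usual pattern sketched below; the main() body here is a placeholder, not the actual GABM logic.

```python
"""Hypothetical sketch of src/gabm/__main__.py (placeholder logic only)."""


def main() -> int:
    # Real application logic lives here; this placeholder just reports success.
    print("gabm: running")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```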
Please see the Python Packaging documentation for information about Python packaging.
Makefile Targets
The root directory contains a Makefile. This is set up to automate tasks using GNU Make. This section explains the rules or targets in the Makefile. Please ensure any changes to the Makefile are platform agnostic.
Target Chaining, DRY, and Consistency
For maintainability, all Makefile targets that depend on other build steps should use Make’s built-in dependency chaining (e.g., gh-pages-deploy: docs) rather than manually invoking $(MAKE) or shelling out to make within a target. This ensures:
Each target is only responsible for its own logic.
Consistency: all targets use the same build steps, and changes to one target (like docs) automatically propagate to dependents (like gh-pages-deploy).
The Makefile rules/targets are summarized below:
| Target | Usage/Description |
|---|---|
| | Show available Makefile commands |
| test | Run all tests (pytest) |
| docs | Build documentation (Sphinx) and clean auto-copied docs assets |
| | Build documentation (Sphinx) |
| docs-clean | Remove auto-copied documentation files from docs/ |
| gh-pages-deploy | Build and deploy documentation to GitHub Pages (runs scripts/gh-pages-deploy.py) |
| | Remove build/test artifacts and Python caches |
| | Delete all LLM caches and model lists (for a clean slate) |
| | Clean up merged local branches and prune deleted remotes |
| | Sync main branch with upstream |
| | Sync and rebase a feature/release branch onto main |
| | Run onboarding/setup for all LLMs (API key check, model lists, cache init) |
| | Tag and push a release (platform-agnostic) |
| delete-release | Delete a release tag locally and on remotes (origin, upstream) |
| build | Build a distribution package for PyPI (python -m build) |
| build-test | Build and test install the package in a fresh venv |
| pypi-release | Upload the built package to PyPI (twine upload dist/*) |
| testpypi-release | Upload the built package to TestPyPI (twine upload --repository testpypi dist/*) |
| bump-version | Bump the project version everywhere (patch by default; use part=minor or part=major) |
| | Run gabm using local source (PYTHONPATH=src) |
| | Run gabm using installed package |
All Python scripts used by Makefile targets are in the scripts/ directory and are named consistently with their Makefile targets (e.g., make docs-clean runs scripts/docs-clean.py).
The delete-release target automates deleting a release tag locally and on both remotes (origin, upstream).
Please refer to the Makefile for full details.
Developing Documentation
The root directory contains several Markdown files, each serving a specific purpose:
README.md: A general README that links to much of the documentation outlined below including the developer and user guides. It is included in the Sphinx documentation and forms the basis for the main Sphinx documentation page.
CODE_OF_CONDUCT.md: Provides details on expected behavior of developers and reporting procedures.
USER_GUIDE.md: Provides step-by-step instructions for users to get set up, and should provide guidance for using GABM.
DEV_QUICKSTART.md: Provides step-by-step instructions for developers to get set up for contributing to GABM development.
DEV_GUIDE.md: Explains how to contribute to GABM as a developer. It provides details about project structure, and workflow guidance.
API_KEYS.md: Explains how to create a file containing API keys for communicating with LLMs.
ROADMAP.md: Document to outline planned features and future development goals.
CHANGE_LOG.md: Document to describe changes and updates.
DEVELOPMENT_HISTORY.md: Document about development, milestones, and reflections.
If you add a new Markdown file in the root directory, please update the DOC_FILES entry in doc_assets.py and the Project Documents section in docs/index.md to include it in the Sphinx documentation.
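As a sketch, the DOC_FILES entry in doc_assets.py might be a simple list of root-level Markdown files, in which case registering a new document would look like this (the exact structure of doc_assets.py may differ, and MY_NEW_DOC.md is a hypothetical file name):

```python
# Hypothetical shape of DOC_FILES in doc_assets.py.
DOC_FILES = [
    "README.md",
    "USER_GUIDE.md",
    "DEV_GUIDE.md",
    "MY_NEW_DOC.md",  # add your new root-level Markdown file here
]
```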
Sphinx Documentation
To build run:
make docs
This effectively runs scripts/docs.py to copy key documentation files from the project root to the docs/ directory, pre-process them for building the Sphinx documentation, and then delete those copied files once the build completes.
Note on Sphinx/MyST Documentation Warnings:
When building the documentation with Sphinx and MyST, you may see warnings like:
Document headings start at H2, not H1 [myst.header]
document isn't included in any toctree [toc.not_included]
duplicate object description ... use :no-index: for one of them
WARNING: Field list ends without a blank line; unexpected unindent. [docutils]
'myst' cross-reference target not found: ... [myst.xref_missing]
These warnings are common with Sphinx/Autosummary/MyST and do not affect the rendered documentation if the HTML output looks correct. In particular, myst.xref_missing means that a Markdown/MyST cross-reference (e.g., target) could not be resolved. You can safely ignore all of these warnings unless the formatting in the HTML output is incorrect or a critical feature is missing.
Preview the Sphinx documentation by opening docs/_build/html/index.html in a Web browser.
To deploy the Sphinx Documentation run:
make gh-pages-deploy
This should deploy/update the gh-pages branch on your origin Fork. If your Fork uses the gh-pages branch for GitHub Pages, then the documentation should update there.
If it all looks good, please submit a PR to incorporate documentation changes into main. A maintainer will subsequently update the upstream repository gh-pages branch.
Sphinx API Documentation and Autosummary
GABM uses Sphinx with the autosummary extension to generate API documentation for all major modules. To document new modules:
Import the new module in the appropriate __init__.py file so Sphinx can discover it (e.g., add from .my_module import *).
Add the module to the API Reference toctree in docs/index.md (as an _autosummary/ entry) to ensure it appears in the documentation navigation.
You do not need to manually maintain individual .rst files for each module; autosummary will generate them automatically during make docs.
If you remove or rename modules, update both the __init__.py imports and the API Reference list in docs/index.md.
You may delete stale .rst files in _autosummary if they are no longer referenced.
Packaging and Deployment
The following files and directories are essential for building, testing, and distributing the GABM package:
pyproject.toml: Declares build system requirements and project metadata. Required for modern Python packaging (PEP 517/518).
setup.cfg: Contains static package metadata and configuration for setuptools, such as:
Package name, version, author, and description
Python version requirements
Entry points (e.g., console_scripts)
Classifiers and other options
This file is preferred over setup.py for static, declarative configuration.
MANIFEST.in: Informs setuptools which additional files (beyond Python modules) to include in the source distribution (sdist). This should be updated so users who install from source get all necessary files.
requirements.txt: Lists pinned dependencies for end users (used by pip install -r requirements.txt).
requirements-dev.txt: Lists development dependencies (testing, linting, docs) with version ranges for contributors.
dist/: Output directory for built distributions (.tar.gz and .whl files) after running the build process.
src/gabm.egg-info/: Metadata directory created by setuptools during build. Contains information about the package (version, dependencies, etc.). Safe to delete; will be recreated as needed.
venv-build-test/: Temporary virtual environment created by make build-test for testing the built package in isolation. This can be safely deleted after testing.
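For orientation, the declarative metadata in setup.cfg described above might look roughly like the following sketch. All values are placeholders, and the console_scripts entry point shown is an assumption rather than GABM's actual configuration; see the real file for the authoritative contents.

```ini
# Hypothetical setup.cfg fragment; values are placeholders only.
[metadata]
name = gabm
version = 0.1.0
author = Example Author
description = Example description

[options]
python_requires = >=3.9

[options.entry_points]
console_scripts =
    gabm = gabm.__main__:main
```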
Branch Protection
GitHub Actions workflows are used to help manage Pull Requests (PRs):
.github/workflows/test.yml automatically runs make test on PRs to the main branch.
.github/workflows/gh-pages-deploy.yml builds documentation for the gh-pages branch and is for automated docs deployment.
PRs to the GABM Repository main branch must pass the test workflow before merging.
The gh-pages branch is protected from deletion.
Maintainer Guide
This section is aimed at developers who are maintainers. Developers who are not maintainers are requested to refrain from publishing releases to PyPI so that maintainers can ensure project integrity and security.
PyPI Release Process
To release a new version of GABM to PyPI, follow these steps:
Update Version: Run
make bump-version to update the version everywhere (including setup.cfg, pyproject.toml, src/gabm/__init__.py, requirements-dev.txt, and all occurrences in the User Guide). Use make bump-version part=minor or part=major for non-patch bumps. Commit and push the changes before continuing.
Build the Package:
make build
This creates .tar.gz and .whl files in the dist/ directory.
Test the Build:
make build-test
This installs the built package in a clean environment and runs the test suite.
Upload to TestPyPI (optional but recommended):
make testpypi-release
This uploads the package to TestPyPI. Test installation from TestPyPI in a clean environment:
pip install --index-url https://test.pypi.org/simple/ gabm
You will need to have an account on https://test.pypi.org and have created an API key for this to work.
Upload to PyPI:
make pypi-release
This uploads the package to the official PyPI repository. You will need to have an account on https://pypi.org and have created an API key for this to work.
Verify Release:
Check the PyPI page and test installation as a user, ensuring the instructions in the User Guide work.
To help recover from accidental corruption, template copies of the main packaging metadata files are provided:
setup.cfg.template
pyproject.toml.template
If you need to reset setup.cfg or pyproject.toml, copy these templates over the originals.
Note: These templates should be updated if you make structural changes to the originals. Bear in mind that the originals are processed to insert the dependency requirements from requirements.txt.
For more details, see the Python Packaging User Guide.
Branch and Documentation Deployment
Sphinx documentation is built and deployed using make gh-pages-deploy. If the upstream gh-pages branch is out of sync or needs to be replaced, use git push --force upstream gh-pages to overwrite it with the correct local version. This should be done with care, as it replaces the branch history.
When deploying documentation with make gh-pages-deploy, you may encounter an error like:
fatal: 'gh-pages' is already used by worktree at '/tmp/gh-pages-xxxx...'
This occurs if a previous deployment left a lingering worktree directory for the gh-pages branch. The deployment script should automatically check for and remove any existing worktree before adding a new one. If you encounter this error, manually remove the worktree:
git worktree remove /tmp/gh-pages-xxxx...
For more details, see the comments in the Makefile and the deployment script.
GitHub Copilot
Using GitHub Copilot for vibe coding can help with understanding workflows and developing documentation, code, and tests.
GitHub Copilot uses limited context and the chat context does not currently persist between sessions. As a result, it is good to develop/update documentation along with changes. GitHub Copilot can be asked to read the README and Developer Guide at the start of a session so as to be more context aware and provide better support.
Managing Logs and Caches
Logs and caches (including prompt/response caches for LLM services) are generated during development and use. These files can become large. Typically they are not committed to the repository.
Project Python scripts use Python logging and write logs to:
data/logs/docs/ # Documentation logs
data/logs/llm/ # LLM module logs (OpenAI, DeepSeek, GenAI, etc.)
Each Python script and LLM module has its own log file to help:
Debug issues with documentation builds, asset copying, or LLM API calls
Review script actions and errors
Logging levels can be adjusted as needed.
If you encounter problems, please check relevant log files for details.
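The per-module logging pattern described above can be sketched as follows; the logger name, log file name, and format string are assumptions, and real modules may configure this differently.

```python
import logging
from pathlib import Path

# Hypothetical per-module log setup; names and format are illustrative.
log_dir = Path("data/logs/llm")
log_dir.mkdir(parents=True, exist_ok=True)

logger = logging.getLogger("gabm.llm.example")
logger.setLevel(logging.DEBUG)  # adjust the logging level as needed
handler = logging.FileHandler(log_dir / "example.log")
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)

logger.info("LLM module initialised")
```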
Files and Directories Excluded from Version Control
Certain files and directories are intentionally excluded from the repository via .gitignore to keep the project clean and secure:
data/logs/ — All log files generated by scripts and modules (can be large and environment-specific)
data/io/llm/*/prompt_response_cache.pkl — LLM response cache files
data/api_key.csv — API keys (never commit secrets)
LLM Service Architecture
GABM uses a class architecture for integrating LLM services. All LLM modules (OpenAI, GenAI, DeepSeek, PublicAI, etc.) subclass a shared LLMService base class, which provides:
Generic caching and logging for prompt/response pairs
Consistent file naming and path management
Centralized error handling and environment variable setup
Model list writing utilities
To add a new LLM service:
Create a new class (e.g., MyLLMService) that subclasses LLMService.
Implement the send and list_available_models methods.
Use the base class helpers:
_pre_send_check_and_cache for API key, cache, and env setup
_call_and_cache_response for error handling and caching
_write_model_list for model list output
Example:
class MyLLMService(LLMService):
    SERVICE_NAME = "myllm"

    def send(self, api_key, message, model="default-model"):
        cached = self._pre_send_check_and_cache(api_key, message, model)
        if cached is not None:
            return cached
        cache_key = (message, model)

        def api_call():
            # Actual LLM API call here
            return myllm_client.send(model=model, prompt=message)

        return self._call_and_cache_response(api_call, cache_key, message, model, api_key)
Additional Resources
Apertus LLM Setup and Usage — for details on obtaining and using local Apertus LLM models