Models API Reference

Auto-generated API documentation for VBAgent Pydantic models.

Classification Models (v1)

classification

Classification result data model.

ClassificationResult

Bases: BaseModel

Result from the Classifier Agent.

Contains metadata extracted from a question image. Used as the output_type for structured outputs with the openai-agents SDK.

fix_classified_at classmethod

fix_classified_at(v)

Ensure classified_at is a timestamp

Classification Models (v2)

classification_v2

Enhanced classification models for multi-agent pipeline.

Version 2.0 of the classification system, with support for:

- Multiple input modalities (image, LaTeX, ideas, combinations)
- Detailed diagram analysis
- Context-aware difficulty assessment
- Problem generation and combination
- TikZ validation

ClassificationResult

Bases: BaseModel

Complete classification result combining all agents.

This is the unified result that can be built incrementally as different agents complete their work.

fix_classified_at classmethod

fix_classified_at(v)

Ensure classified_at is a timestamp, not a string like 'image'
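
The validator's fallback behaviour is not documented on this page. As a rough sketch of the idea, assuming that a value which cannot be parsed as a timestamp (for example a modality name such as 'image') is replaced with the current UTC time, the logic might resemble the plain function below. The name `coerce_classified_at` and the fallback policy are assumptions made for illustration; in the model this logic would sit inside the Pydantic validator.

```python
from datetime import datetime, timezone


def coerce_classified_at(value):
    """Return a datetime for `value`; fall back to 'now' (assumed policy)."""
    if isinstance(value, datetime):
        return value
    if isinstance(value, str):
        try:
            # Accept ISO 8601 strings such as "2024-01-02T03:04:05".
            return datetime.fromisoformat(value)
        except ValueError:
            pass  # e.g. "image": not a timestamp at all
    # Assumption: anything unusable is replaced by the current UTC time.
    return datetime.now(timezone.utc)


parsed = coerce_classified_at("2024-01-02T03:04:05")
fallback = coerce_classified_at("image")
```

Here `parsed` is the datetime encoded in the string, while `fallback` is a fresh timezone-aware timestamp.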

from_agents classmethod

from_agents(primary: PrimaryClassification, diagram: Optional[DiagramAnalysis] = None, difficulty: Optional[DifficultyAssessment] = None) -> ClassificationResult

Combine results from multiple agents

from_primary classmethod

from_primary(primary: PrimaryClassification) -> ClassificationResult

Create from primary classification only

CombinedProblem

Bases: BaseModel

Output from Agent 6: Multi-Problem Combiner

DiagramAnalysis

Bases: BaseModel

Output from Agent 2: Diagram Analyzer

DiagramFeatures

Bases: BaseModel

Visual features of the diagram

DifficultyAssessment

Bases: BaseModel

Output from Agent 3: Difficulty Assessor

DifficultyFactors

Bases: BaseModel

Factors contributing to difficulty

ExamRelevance

Bases: BaseModel

Relevance to different exams

GeneratedProblem

Bases: BaseModel

Output from Agent 5: Idea-to-Problem Generator

PrimaryClassification

Bases: BaseModel

Output from Agent 1 (Image Classifier) or Agent 4 (LaTeX Classifier)

fix_classified_at classmethod

fix_classified_at(v)

Ensure classified_at is a timestamp, not a string like 'image'

ProblemStructure

Bases: BaseModel

Structure of the problem

TikZError

Bases: BaseModel

A single TikZ error

TikZFix

Bases: BaseModel

A fix applied to TikZ code

TikZRequirements

Bases: BaseModel

TikZ generation requirements

TikZValidation

Bases: BaseModel

Output from Agent 7: TikZ Checker/Fixer

Scan Models

scan

Scan result data model.

ScanResult

Bases: BaseModel

Result from the Scanner Agent.

Contains extracted LaTeX and diagram information.

Idea Models

idea

Idea result data model.

IdeaResult

Bases: BaseModel

Result from the Idea Agent.

Contains extracted physics concepts and problem-solving ideas. Used as the output_type for structured outputs with the openai-agents SDK.

Review Models

review

Review data models for QA Review Agent.

Pydantic models for structured review suggestions and results.

ReviewIssueType

Bases: str, Enum

Type of issue found during QA review.

ReviewResult

Bases: BaseModel

Result from reviewing a problem.

Contains the overall review status and any suggestions.

ReviewStats

Bases: BaseModel

Statistics from review sessions.

Aggregated metrics across review sessions.

Suggestion

Bases: BaseModel

A suggested edit from the QA Review Agent.

Contains the issue details, reasoning, and a unified diff representing the proposed change.

Pipeline Models

pipeline

Pipeline result data model.

PipelineResult

Bases: BaseModel

Result from the full processing pipeline.

Contains all outputs from the pipeline stages.

Batch Models

batch

Batch processing database models.

SQLite-based tracking for batch image processing with resume capability.

BatchDatabase

BatchDatabase(base_dir: str = '.')

SQLite database for batch processing state.

Initialize database connection.

Parameters:

Name      Type  Description                           Default
base_dir  str   Directory to store the database file  '.'

add_image

add_image(image_path: str) -> int

Add an image to the processing queue.

Returns the image ID.

close

close()

Close database connection.

get_alternates

get_alternates(image_id: int) -> list[str]

Get all alternates for an image.

get_config

get_config() -> Optional[dict]

Get batch configuration.

get_image

get_image(image_id: int) -> Optional[ImageRecord]

Get an image record by ID.

get_image_by_path

get_image_by_path(image_path: str) -> Optional[ImageRecord]

Get an image record by path.

get_pending_images

get_pending_images() -> list[ImageRecord]

Get all images that need processing.

get_stats

get_stats() -> dict

Get processing statistics.

get_variants

get_variants(image_id: int) -> dict[str, str]

Get all variants for an image.

reset_failed

reset_failed()

Reset failed images to pending status.
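
To illustrate how SQLite-backed state allows a batch to resume, here is a small self-contained sketch. It is not the BatchDatabase implementation: the class name `MiniBatchDB`, the table layout, the status strings, and the fact that `reset_failed` returns a count are all assumptions chosen for the example.

```python
import sqlite3


class MiniBatchDB:
    """Toy stand-in for BatchDatabase; schema and statuses are assumptions."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            """CREATE TABLE IF NOT EXISTS images (
                   id INTEGER PRIMARY KEY AUTOINCREMENT,
                   image_path TEXT UNIQUE NOT NULL,
                   status TEXT NOT NULL DEFAULT 'pending',
                   error_message TEXT
               )"""
        )

    def add_image(self, image_path):
        # INSERT OR IGNORE keeps re-runs idempotent: a known path keeps its row.
        self.conn.execute(
            "INSERT OR IGNORE INTO images (image_path) VALUES (?)", (image_path,)
        )
        row = self.conn.execute(
            "SELECT id FROM images WHERE image_path = ?", (image_path,)
        ).fetchone()
        return row[0]

    def update_status(self, image_id, status, error=None):
        self.conn.execute(
            "UPDATE images SET status = ?, error_message = ? WHERE id = ?",
            (status, error, image_id),
        )

    def get_pending_images(self):
        rows = self.conn.execute(
            "SELECT image_path FROM images WHERE status = 'pending' ORDER BY id"
        )
        return [row[0] for row in rows]

    def reset_failed(self):
        cur = self.conn.execute(
            "UPDATE images SET status = 'pending', error_message = NULL "
            "WHERE status = 'failed'"
        )
        return cur.rowcount  # number of rows put back in the queue


db = MiniBatchDB()
first = db.add_image("q1.png")
second = db.add_image("q2.png")
db.update_status(first, "completed")
db.update_status(second, "failed", error="timeout")
pending_before = db.get_pending_images()
reset_count = db.reset_failed()
pending_after = db.get_pending_images()
```

After the reset, only the previously failed image is pending again, so a restarted run would skip the completed one.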

save_alternate

save_alternate(image_id: int, latex: str)

Save an alternate solution.

save_classification

save_classification(image_id: int, classification_json: str)

Save classification result.

save_config

save_config(images_dir: str, output_dir: str, variant_types: list[str], generate_alternates: bool, use_context: bool = True)

Save batch configuration.

save_ideas

save_ideas(image_id: int, ideas_json: str)

Save ideas JSON.

save_latex

save_latex(image_id: int, latex: str)

Save scanned LaTeX.

save_tikz

save_tikz(image_id: int, tikz_code: str)

Save TikZ code.

save_variant

save_variant(image_id: int, variant_type: str, latex: str)

Save a variant.

update_status

update_status(image_id: int, status: ProcessingStatus, stage: Optional[str] = None, error: Optional[str] = None)

Update image processing status.

ImageRecord dataclass

ImageRecord(id: int, image_path: str, status: ProcessingStatus, current_stage: Optional[str], error_message: Optional[str], created_at: datetime, updated_at: datetime, completed_at: Optional[datetime], classification_json: Optional[str] = None, latex: Optional[str] = None, tikz_code: Optional[str] = None, ideas_json: Optional[str] = None)

Record of an image in the batch processing queue.

ProcessingStatus

Bases: str, Enum

Status of an image in the processing pipeline.
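
The `str, Enum` bases mean each member is also a plain string, which is convenient for SQLite columns and JSON. The sketch below shows that behaviour with a stand-in class; `ExampleStatus` and its member names are hypothetical, since the actual members of ProcessingStatus are not listed on this page.

```python
import json
from enum import Enum


class ExampleStatus(str, Enum):
    # Hypothetical members, for illustration only.
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# A value read back from a TEXT column converts straight to a member.
restored = ExampleStatus("failed")

# Members compare equal to their string value and serialize as plain strings.
matches_plain_string = ExampleStatus.PENDING == "pending"
encoded = json.dumps({"status": ExampleStatus.PENDING})
```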

Diff Models

diff

Diff utilities for QA Review Agent.

Functions for generating, parsing, and applying unified diffs. Uses Python's difflib for diff generation.

DiffError

Bases: Exception

Base exception for diff-related errors.

DiffErrorType

Bases: str, Enum

Types of errors that can occur during diff operations.

DiffResult dataclass

DiffResult(success: bool, error_type: Optional[DiffErrorType] = None, error_message: Optional[str] = None, original_preserved: bool = True)

Result of a diff application operation.

Attributes:

Name                Type                     Description
success             bool                     Whether the operation succeeded
error_type          Optional[DiffErrorType]  Type of error if failed, None if successful
error_message       Optional[str]            Human-readable error message if failed
original_preserved  bool                     Whether the original file was preserved on failure

apply_diff

apply_diff(file_path: str, diff: str) -> bool

Apply a unified diff to a file.

Parameters:

Name       Type  Description                   Default
file_path  str   Path to the file to modify    required
diff       str   Unified diff string to apply  required

Returns:

Type  Description
bool  True if successful, False otherwise

Note

For detailed error information, use apply_diff_safe() instead.

apply_diff_safe

apply_diff_safe(file_path: str, diff: str) -> DiffResult

Apply a unified diff to a file with detailed error handling.

This function provides comprehensive error handling and ensures the original file is preserved on failure.

Parameters:

Name       Type  Description                   Default
file_path  str   Path to the file to modify    required
diff       str   Unified diff string to apply  required

Returns:

Type        Description
DiffResult  DiffResult with success status and error details if failed
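
One common way to keep the original file intact on failure is to write the new content to a temporary file and swap it in only once the write has succeeded. The sketch below shows that pattern under stated assumptions: `apply_safely`, the `transform` callable (standing in for the diff application step), and the DiffErrorType member names are all hypothetical; only the DiffResult fields come from this page.

```python
import os
import tempfile
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class DiffErrorType(str, Enum):
    # Hypothetical members; the real enum's values are not documented here.
    READ_FAILED = "read_failed"
    DIFF_MISMATCH = "diff_mismatch"
    WRITE_FAILED = "write_failed"


@dataclass
class DiffResult:
    success: bool
    error_type: Optional[DiffErrorType] = None
    error_message: Optional[str] = None
    original_preserved: bool = True


def apply_safely(file_path: str, transform: Callable[[str], Optional[str]]) -> DiffResult:
    """Rewrite file_path via transform; the original survives any failure."""
    try:
        with open(file_path, encoding="utf-8") as handle:
            original = handle.read()
    except OSError as exc:
        return DiffResult(False, DiffErrorType.READ_FAILED, str(exc))

    updated = transform(original)
    if updated is None:  # the change does not apply to this content
        return DiffResult(False, DiffErrorType.DIFF_MISMATCH, "diff does not apply")

    # Write next to the target, then swap the finished file into place.
    directory = os.path.dirname(os.path.abspath(file_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(updated)
        os.replace(tmp_path, file_path)
    except OSError as exc:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        return DiffResult(False, DiffErrorType.WRITE_FAILED, str(exc))
    return DiffResult(True)


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "problem.tex")
    with open(target, "w", encoding="utf-8") as handle:
        handle.write("y = 2\n")
    ok = apply_safely(target, lambda text: text.replace("y = 2", "y = 3"))
    rejected = apply_safely(target, lambda text: None)
    with open(target, encoding="utf-8") as handle:
        final_text = handle.read()
```

The rejected call leaves the file exactly as the successful call wrote it, which is what `original_preserved` reports.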

apply_diff_to_content

apply_diff_to_content(original: str, diff: str) -> str | None

Apply a unified diff to content string.

This is useful for testing without file I/O.

Parameters:

Name      Type  Description                   Default
original  str   Original content string       required
diff      str   Unified diff string to apply  required

Returns:

Type        Description
str | None  Modified content if successful, None if diff doesn't apply
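
For readers who want to see what applying a unified diff to a string involves, a minimal applier is sketched below. It is not the library's code: `apply_unified_diff` is a hypothetical name, and it assumes a single-file diff with exact (non-fuzzy) matching, ignores "\ No newline at end of file" markers, and keeps the original's trailing-newline state.

```python
import re
from typing import Optional

# Matches hunk headers such as "@@ -1,3 +1,3 @@".
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def apply_unified_diff(original: str, diff: str) -> Optional[str]:
    """Apply a single-file unified diff; return None if a hunk does not match."""
    src = original.splitlines()
    out = []
    pos = 0  # index of the next unconsumed line in src
    in_hunk = False
    for line in diff.splitlines():
        header = HUNK_RE.match(line)
        if header:
            start = int(header.group(1))
            count = 1 if header.group(2) is None else int(header.group(2))
            # A zero-length source range means "insert after line `start`".
            target = start - 1 if count else start
            if target < pos or target > len(src):
                return None
            out.extend(src[pos:target])  # copy untouched lines before the hunk
            pos = target
            in_hunk = True
            continue
        if not in_hunk or line.startswith("\\"):
            continue  # file headers and "\ No newline..." markers
        tag, text = line[:1], line[1:]
        if tag == "+":
            out.append(text)
        elif tag in (" ", "-"):
            # Context and removed lines must match the source exactly.
            if pos >= len(src) or src[pos] != text:
                return None
            if tag == " ":
                out.append(text)
            pos += 1
        else:
            return None  # unexpected line inside a hunk
    out.extend(src[pos:])
    result = "\n".join(out)
    return result + "\n" if original.endswith("\n") else result


sample_diff = (
    "--- a/file.tex\n"
    "+++ b/file.tex\n"
    "@@ -1,3 +1,3 @@\n"
    " x = 1\n"
    "-y = 2\n"
    "+y = 20\n"
    " z = 3\n"
)
patched = apply_unified_diff("x = 1\ny = 2\nz = 3\n", sample_diff)
stale = apply_unified_diff("x = 1\ny = 5\nz = 3\n", sample_diff)
```

The second call returns None because the line the diff expects to remove is no longer present.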

check_file_modified

check_file_modified(file_path: str, expected_hash: str) -> bool

Check if a file has been modified since a hash was computed.

Parameters:

Name           Type  Description                            Default
file_path      str   Path to the file to check              required
expected_hash  str   Expected MD5 hash of the file content  required

Returns:

Type  Description
bool  True if file has been modified (hash doesn't match), False otherwise

compute_file_hash

compute_file_hash(file_path: str) -> Optional[str]

Compute MD5 hash of a file's content.

Parameters:

Name       Type  Description       Default
file_path  str   Path to the file  required

Returns:

Type           Description
Optional[str]  MD5 hash string, or None if file cannot be read
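
The two hashing helpers can be approximated with `hashlib` as follows. This is a sketch of the documented behaviour, not the module's source: it assumes the hash is the hex digest of the file's raw bytes and that an unreadable file counts as modified.

```python
import hashlib
import os
import tempfile
from typing import Optional


def compute_file_hash(file_path: str) -> Optional[str]:
    """Hex MD5 of the file's bytes, or None if the file cannot be read."""
    try:
        with open(file_path, "rb") as handle:
            return hashlib.md5(handle.read()).hexdigest()
    except OSError:
        return None


def check_file_modified(file_path: str, expected_hash: str) -> bool:
    """True when the current hash differs from the one recorded earlier."""
    return compute_file_hash(file_path) != expected_hash


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "problem.tex")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("y = 2\n")
    baseline = compute_file_hash(path)
    unchanged = check_file_modified(path, baseline)

    with open(path, "a", encoding="utf-8") as handle:
        handle.write("z = 3\n")
    changed = check_file_modified(path, baseline)
    missing = compute_file_hash(os.path.join(workdir, "absent.tex"))
```

Recording the hash when a suggestion is created and checking it again before applying the diff guards against editing a file that changed in the meantime.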

generate_diff

generate_diff(original: str, modified: str, filename: str = 'file.tex') -> str

Generate a unified diff between original and modified content.

Convenience wrapper around generate_unified_diff with sensible defaults.

Parameters:

Name      Type  Description                                   Default
original  str   Original file content                         required
modified  str   Modified file content                         required
filename  str   Filename for diff header (default: file.tex)  'file.tex'

Returns:

Type  Description
str   Unified diff string, or empty string if no changes

generate_unified_diff

generate_unified_diff(original: str, modified: str, file_path: str, context_lines: int = 3) -> str

Generate unified diff between original and modified content.

Parameters:

Name           Type  Description                             Default
original       str   Original content                        required
modified       str   Modified content                        required
file_path      str   Path to the file (used in diff header)  required
context_lines  int   Number of context lines around changes  3

Returns:

Type  Description
str   Unified diff string
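
Since the module is described as using Python's difflib, a function with this signature can be sketched on top of `difflib.unified_diff`. The function name `unified_diff_text` and the `a/` and `b/` header prefixes are assumptions for the example; the real header format may differ.

```python
import difflib


def unified_diff_text(original: str, modified: str, file_path: str,
                      context_lines: int = 3) -> str:
    """Build a unified diff string; returns "" when the inputs are identical."""
    diff_lines = difflib.unified_diff(
        original.splitlines(keepends=True),
        modified.splitlines(keepends=True),
        fromfile=f"a/{file_path}",  # assumed header convention
        tofile=f"b/{file_path}",
        n=context_lines,
    )
    return "".join(diff_lines)


diff_text = unified_diff_text("x = 1\ny = 2\n", "x = 1\ny = 3\n", "file.tex")
no_change = unified_diff_text("x = 1\n", "x = 1\n", "file.tex")
```

`difflib.unified_diff` yields nothing for identical inputs, which matches the "empty string if no changes" behaviour documented for generate_diff.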

parse_diff

parse_diff(diff: str) -> tuple[str, str] | None

Parse a unified diff to extract original and modified content.

Parameters:

Name  Type  Description          Default
diff  str   Unified diff string  required

Returns:

Type                    Description
tuple[str, str] | None  Tuple of (original_content, modified_content), or None if diff is empty
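
A unified diff only carries the lines inside its hunks, so the two sides can be rebuilt by sorting hunk lines by their prefix. The sketch below illustrates this; `split_diff_sides` is a hypothetical name, and it assumes a single-file diff and returns only the content covered by the hunks.

```python
from typing import Optional


def split_diff_sides(diff: str) -> Optional[tuple[str, str]]:
    """Return (original, modified) text covered by the hunks, or None if empty."""
    if not diff.strip():
        return None
    original, modified = [], []
    in_hunk = False
    for line in diff.splitlines():
        if line.startswith("@@"):
            in_hunk = True  # everything before the first hunk is file headers
            continue
        if not in_hunk:
            continue
        if line.startswith("-"):
            original.append(line[1:])  # present only before the change
        elif line.startswith("+"):
            modified.append(line[1:])  # present only after the change
        elif line.startswith(" "):
            original.append(line[1:])  # context appears on both sides
            modified.append(line[1:])
        # Other lines (e.g. "\ No newline at end of file") are skipped.
    return "\n".join(original), "\n".join(modified)


sample = (
    "--- a/file.tex\n"
    "+++ b/file.tex\n"
    "@@ -1,2 +1,2 @@\n"
    " x = 1\n"
    "-y = 2\n"
    "+y = 3\n"
)
sides = split_diff_sides(sample)
empty = split_diff_sides("")
```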

Version Store

version_store

Version Store for QA Review Agent.

SQLite-based storage for suggestions and review history with version tracking.

ProblemCheckStatus

Bases: str, Enum

Status of a problem in the check workflow.

StoredSuggestion dataclass

StoredSuggestion(id: int, version: int, problem_id: str, file_path: str, issue_type: str, description: str, reasoning: str, confidence: float, original_content: str, suggested_content: str, diff: str, status: SuggestionStatus, created_at: datetime, session_id: Optional[str])

A suggestion stored in the version store.

Contains all data needed to retrieve and apply a stored suggestion.

from_dict classmethod

from_dict(data: dict) -> StoredSuggestion

Create from dictionary (JSON deserialization).

to_dict

to_dict() -> dict

Convert to dictionary for JSON serialization.
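
The `datetime` and enum fields are the parts of such a record that need explicit conversion for JSON. A minimal round-trip sketch follows, using a trimmed stand-in: `SuggestionRecord` carries only a subset of the StoredSuggestion fields, and the SuggestionStatus member names and the ISO 8601 timestamp encoding are assumptions.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class SuggestionStatus(str, Enum):
    # Hypothetical members, for illustration only.
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class SuggestionRecord:
    """Trimmed stand-in for StoredSuggestion (subset of its fields)."""
    id: int
    problem_id: str
    confidence: float
    status: SuggestionStatus
    created_at: datetime
    session_id: Optional[str] = None

    def to_dict(self) -> dict:
        data = asdict(self)
        data["status"] = self.status.value  # enum -> plain string
        data["created_at"] = self.created_at.isoformat()  # datetime -> ISO 8601
        return data

    @classmethod
    def from_dict(cls, data: dict) -> "SuggestionRecord":
        fields = dict(data)
        fields["status"] = SuggestionStatus(data["status"])
        fields["created_at"] = datetime.fromisoformat(data["created_at"])
        return cls(**fields)


record = SuggestionRecord(
    id=1,
    problem_id="prob-001",
    confidence=0.9,
    status=SuggestionStatus.REJECTED,
    created_at=datetime(2024, 1, 2, 3, 4, 5),
)
payload = json.dumps(record.to_dict())
restored = SuggestionRecord.from_dict(json.loads(payload))
```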

SuggestionStatus

Bases: str, Enum

Status of a suggestion in the review workflow.

VersionStore

VersionStore(base_dir: str = '.')

SQLite-based storage for suggestions and review history.

Provides version tracking for rejected suggestions and statistics tracking for review sessions.

Initialize database connection.

Parameters:

Name      Type  Description                           Default
base_dir  str   Directory to store the database file  '.'

clear_problem_checks

clear_problem_checks(output_dir: str) -> int

Clear all problem check entries for a directory.

Parameters:

Name        Type  Description       Default
output_dir  str   Output directory  required

Returns:

Type  Description
int   Number of entries deleted

close

close()

Close database connection.

create_session

create_session() -> str

Create a new review session.

Returns:

Type  Description
str   The session ID

delete_session

delete_session(session_id: str) -> bool

Delete a session and its associated suggestions.

Parameters:

Name        Type  Description               Default
session_id  str   The session ID to delete  required

Returns:

Type  Description
bool  True if session was deleted, False if not found

get_checked_files

get_checked_files(checker_type: str, output_dir: str) -> set[str]

Get all files that have been checked by a specific checker.

Parameters:

Name          Type  Description               Default
checker_type  str   Type of checker           required
output_dir    str   Output directory context  required

Returns:

Type      Description
set[str]  Set of file paths that have been checked

get_checker_stats

get_checker_stats(checker_type: str, output_dir: str) -> dict

Get statistics for a specific checker in a directory.

Parameters:

Name          Type  Description               Default
checker_type  str   Type of checker           required
output_dir    str   Output directory context  required

Returns:

Type  Description
dict  Dictionary with total checked, passed, and failed counts

get_incomplete_sessions

get_incomplete_sessions() -> list[dict]

Get all incomplete (interrupted) sessions.

Returns:

Type        Description
list[dict]  List of session data dictionaries for sessions without completed_at
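
The "sessions without completed_at" rule can be pictured with a small SQLite sketch. The table layout, the UUID-based session IDs, and the reduced `update_session` signature below are assumptions for illustration, not the VersionStore schema.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows convert to dictionaries
conn.execute(
    """CREATE TABLE sessions (
           id TEXT PRIMARY KEY,
           started_at TEXT NOT NULL,
           completed_at TEXT,
           problems_reviewed INTEGER NOT NULL DEFAULT 0
       )"""
)


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


def create_session() -> str:
    session_id = uuid.uuid4().hex  # assumed ID format
    conn.execute(
        "INSERT INTO sessions (id, started_at) VALUES (?, ?)", (session_id, _now())
    )
    return session_id


def update_session(session_id, problems_reviewed=None, completed=False):
    if problems_reviewed is not None:
        conn.execute(
            "UPDATE sessions SET problems_reviewed = ? WHERE id = ?",
            (problems_reviewed, session_id),
        )
    if completed:
        # Setting completed_at is what marks a session as finished.
        conn.execute(
            "UPDATE sessions SET completed_at = ? WHERE id = ?", (_now(), session_id)
        )


def get_incomplete_sessions():
    rows = conn.execute("SELECT * FROM sessions WHERE completed_at IS NULL")
    return [dict(row) for row in rows]


finished = create_session()
interrupted = create_session()
update_session(finished, problems_reviewed=5, completed=True)
update_session(interrupted, problems_reviewed=2)
incomplete = get_incomplete_sessions()
```

Only the interrupted session is returned, which is the list a resume prompt would be built from.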

get_pending_problems

get_pending_problems(output_dir: str, limit: Optional[int] = None) -> list[str]

Get list of pending problem IDs.

Parameters:

Name        Type           Description                    Default
output_dir  str            Output directory to filter by  required
limit       Optional[int]  Maximum number to return       None

Returns:

Type       Description
list[str]  List of problem IDs with pending status

get_problem_check_stats

get_problem_check_stats(output_dir: str) -> dict

Get statistics for problem checks in a directory.

Parameters:

Name        Type  Description                        Default
output_dir  str   Output directory to get stats for  required

Returns:

Type  Description
dict  Dictionary with counts by status

get_problems_by_status

get_problems_by_status(output_dir: str, status: ProblemCheckStatus) -> list[str]

Get problem IDs with a specific status.

Parameters:

Name        Type                Description                    Default
output_dir  str                 Output directory to filter by  required
status      ProblemCheckStatus  Status to filter by            required

Returns:

Type       Description
list[str]  List of problem IDs

get_session

get_session(session_id: str) -> Optional[dict]

Get session details.

Parameters:

Name        Type  Description                 Default
session_id  str   The session ID to retrieve  required

Returns:

Type            Description
Optional[dict]  Session data as a dictionary, or None if not found

get_stats

get_stats(days: Optional[int] = None) -> dict

Get review statistics.

Parameters:

Name  Type           Description                       Default
days  Optional[int]  Filter to last N days (optional)  None

Returns:

Type  Description
dict  Dictionary with review statistics

get_suggestion

get_suggestion(suggestion_id: int) -> Optional[StoredSuggestion]

Get a specific suggestion by ID.

Parameters:

Name           Type  Description                           Default
suggestion_id  int   The ID of the suggestion to retrieve  required

Returns:

Type                        Description
Optional[StoredSuggestion]  The stored suggestion, or None if not found

get_versions

get_versions(problem_id: Optional[str] = None, file_path: Optional[str] = None) -> list[StoredSuggestion]

Get version history for a problem or file.

Parameters:

Name        Type           Description                      Default
problem_id  Optional[str]  Filter by problem ID (optional)  None
file_path   Optional[str]  Filter by file path (optional)   None

Returns:

Type                    Description
list[StoredSuggestion]  List of stored suggestions matching the criteria

init_problem_checks

init_problem_checks(problem_ids: list[str], output_dir: str, reset: bool = False) -> int

Initialize problem check tracking for a list of problems.

Parameters:

Name         Type       Description                                 Default
problem_ids  list[str]  List of problem IDs to track                required
output_dir   str        Output directory containing the problems    required
reset        bool       If True, reset existing entries to pending  False

Returns:

Type  Description
int   Number of problems initialized

is_file_checked

is_file_checked(file_path: str, checker_type: str, output_dir: str) -> bool

Check if a file has already been checked by a specific checker.

Parameters:

Name          Type  Description               Default
file_path     str   Path to the file          required
checker_type  str   Type of checker           required
output_dir    str   Output directory context  required

Returns:

Type  Description
bool  True if the file has been checked, False otherwise

mark_file_checked

mark_file_checked(file_path: str, checker_type: str, output_dir: str, passed: bool = False) -> None

Mark a file as checked by a specific checker.

Parameters:

Name          Type  Description                                                     Default
file_path     str   Path to the file that was checked                               required
checker_type  str   Type of checker (solution/grammar/clarity/tikz/alternate/idea)  required
output_dir    str   Output directory context                                        required
passed        bool  Whether the file passed the check without issues                False
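
The per-checker progress methods (mark_file_checked, is_file_checked, get_checker_stats) can be pictured as one table keyed by file, checker, and output directory. The sketch below uses an assumed schema and assumed result keys ("total", "passed", "failed"); the real VersionStore may store and name things differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE checker_progress (
           file_path TEXT NOT NULL,
           checker_type TEXT NOT NULL,
           output_dir TEXT NOT NULL,
           passed INTEGER NOT NULL DEFAULT 0,
           PRIMARY KEY (file_path, checker_type, output_dir)
       )"""
)


def mark_file_checked(file_path, checker_type, output_dir, passed=False):
    # REPLACE semantics: re-checking a file overwrites its earlier outcome
    # instead of adding a second row.
    conn.execute(
        "INSERT OR REPLACE INTO checker_progress VALUES (?, ?, ?, ?)",
        (file_path, checker_type, output_dir, int(passed)),
    )


def is_file_checked(file_path, checker_type, output_dir):
    row = conn.execute(
        "SELECT 1 FROM checker_progress "
        "WHERE file_path = ? AND checker_type = ? AND output_dir = ?",
        (file_path, checker_type, output_dir),
    ).fetchone()
    return row is not None


def get_checker_stats(checker_type, output_dir):
    total, passed = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(passed), 0) FROM checker_progress "
        "WHERE checker_type = ? AND output_dir = ?",
        (checker_type, output_dir),
    ).fetchone()
    return {"total": total, "passed": passed, "failed": total - passed}


mark_file_checked("out/p1.tex", "grammar", "out", passed=True)
mark_file_checked("out/p2.tex", "grammar", "out")
mark_file_checked("out/p2.tex", "grammar", "out")  # repeat run, still one row
grammar_stats = get_checker_stats("grammar", "out")
checked_by_grammar = is_file_checked("out/p1.tex", "grammar", "out")
checked_by_tikz = is_file_checked("out/p1.tex", "tikz", "out")
```

Because progress is tracked per checker type, a file that passed the grammar check still counts as unchecked for the tikz checker.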

reset_checker_progress

reset_checker_progress(checker_type: str, output_dir: str, file_paths: Optional[list[str]] = None) -> int

Reset checker progress for specific files or all files.

Parameters:

Name          Type                 Description                           Default
checker_type  str                  Type of checker                       required
output_dir    str                  Output directory context              required
file_paths    Optional[list[str]]  Specific files to reset (None = all)  None

Returns:

Type  Description
int   Number of entries deleted

reset_problem_checks

reset_problem_checks(output_dir: str, problem_ids: Optional[list[str]] = None) -> int

Reset problem checks to pending status.

Parameters:

Name         Type                 Description                              Default
output_dir   str                  Output directory                         required
problem_ids  Optional[list[str]]  Specific problems to reset (None = all)  None

Returns:

Type  Description
int   Number of problems reset

save_session_state

save_session_state(session_id: str, output_dir: str, remaining_problems: list[str]) -> None

Save session state for potential resume.

Parameters:

Name                Type       Description                           Default
session_id          str        The session ID to update              required
output_dir          str        The output directory being reviewed   required
remaining_problems  list[str]  List of problem IDs not yet reviewed  required

save_suggestion

save_suggestion(suggestion: Suggestion, problem_id: str, status: SuggestionStatus, session_id: Optional[str] = None) -> int

Save a suggestion and return its ID.

Parameters:

Name        Type              Description                       Default
suggestion  Suggestion        Suggestion object from review.py  required
problem_id  str               ID of the problem being reviewed  required
status      SuggestionStatus  Status to set for the suggestion  required
session_id  Optional[str]     Optional session ID for tracking  None

Returns:

Type  Description
int   The ID of the saved suggestion

update_problem_check

update_problem_check(problem_id: str, output_dir: str, status: ProblemCheckStatus, suggestion_count: int = 0) -> None

Update the check status of a problem.

Parameters:

Name              Type                Description                  Default
problem_id        str                 The problem ID               required
output_dir        str                 Output directory             required
status            ProblemCheckStatus  New status                   required
suggestion_count  int                 Number of suggestions found  0

update_session

update_session(session_id: str, problems_reviewed: Optional[int] = None, suggestions_made: Optional[int] = None, approved_count: Optional[int] = None, rejected_count: Optional[int] = None, skipped_count: Optional[int] = None, completed: bool = False) -> None

Update session statistics.

Parameters:

Name               Type           Description                               Default
session_id         str            The session ID to update                  required
problems_reviewed  Optional[int]  Number of problems reviewed               None
suggestions_made   Optional[int]  Number of suggestions made                None
approved_count     Optional[int]  Number of approved suggestions            None
rejected_count     Optional[int]  Number of rejected suggestions            None
skipped_count      Optional[int]  Number of skipped suggestions             None
completed          bool           Whether to mark the session as completed  False

update_status

update_status(suggestion_id: int, status: SuggestionStatus) -> None

Update the status of a suggestion.

Parameters:

Name           Type              Description                         Default
suggestion_id  int               The ID of the suggestion to update  required
status         SuggestionStatus  The new status                      required