# Sanity Check
## Quick Start
The `sanity_check(...)` function performs a series of sanity checks to verify the integrity of a dataset.
For the definitions of the rules, please refer to `requirement.md`.
`main.py`

```python
from t4_devkit.common import save_json, serialize_dataclass
from t4_devkit.sanity import sanity_check, print_sanity_result

result = sanity_check("<path/to/dataset>")

# display detailed results and summary
print_sanity_result(result)

# save result to JSON file if you want
save_json(serialize_dataclass(result), "result.json")
```
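
If you want to post-process the report outside of this script, the saved JSON can be read back with the standard library. This is a minimal sketch; the exact structure of the serialized result depends on the result dataclass, so inspect `result.json` to see which fields are available.

```python
import json

# Read the saved sanity report back, e.g. in a later step of a pipeline.
# NOTE: the exact keys depend on how the result dataclass is serialized,
# so inspect result.json for the available fields.
with open("result.json", encoding="utf-8") as f:
    report = json.load(f)

print(report)
```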
## How to Add New Checkers
All checkers must follow these rules (see the example `str000.py` below):

- Implement a class that inherits from the `Checker` class.
- Its ID must be unique and belong to one of the `RuleGroup` enum members.
- Override the `check() -> list[Reason] | None` method to perform the specific check.
- Register the checker using the `CHECKERS.register()` decorator.
`str000.py`

```python
from __future__ import annotations

from typing import TYPE_CHECKING

from t4_devkit.sanity.checker import Checker, RuleID, RuleName, Severity
from t4_devkit.sanity.registry import CHECKERS
from t4_devkit.sanity.result import Reason

if TYPE_CHECKING:
    from t4_devkit.sanity.context import SanityContext


@CHECKERS.register()
class STR000(Checker):
    """This is a custom checker."""

    id = RuleID("STR000")
    name = RuleName("my-custom-checker")
    severity = Severity.ERROR
    description = "This is a custom checker."

    def check(self, context: SanityContext) -> list[Reason] | None:
        # Return a list of reasons if the check fails, or None if it passes.
        return None
```
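
Once the module defining `STR000` is imported, the `@CHECKERS.register()` decorator adds it to the registry, so it runs alongside the built-in checkers on the next call to `sanity_check`. The sketch below assumes the checker lives in a module named `str000` on your import path and that registered checkers are executed automatically, which is what the registration step above implies; adjust the import to wherever you place the file.

```python
# Importing the module runs the @CHECKERS.register() decorator, which adds
# STR000 to the registry (hypothetical module name: str000).
import str000  # noqa: F401

from t4_devkit.sanity import print_sanity_result, sanity_check

# The custom checker now runs alongside the built-in ones.
result = sanity_check("<path/to/dataset>")
print_sanity_result(result)
```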