Overview
Jinba Modules provide powerful data processing capabilities including extraction, parsing, and validation. These tools use advanced AI and machine learning techniques to handle complex data transformation tasks with high accuracy and flexibility.
Key Features
JINBA_MODULES_EXTRACT
- AI-powered data extraction from various sources
- Configurable extraction modes (FAST, BALANCED, QUALITY)
- User-defined JSON schema support
- Intelligent content recognition and parsing
JINBA_MODULES_PARSE
- Advanced document and data parsing
- Structure recognition and preservation
- Multi-format support
- Context-aware content interpretation
JINBA_MODULES_CHECKER_V2
- Enhanced data validation using JSON rules
- Complex rule engine with multiple validation types
- Detailed validation reporting
- Improved performance and accuracy
Authentication
No authentication is required for Jinba Modules tools.
Example: Intelligent Document Extraction
Example: Batch Document Processing
Extraction Modes
FAST Mode
- Speed: Fastest processing
- Accuracy: Good for simple documents
- Use cases: High-volume, simple document processing
- Processing time: ~1-3 seconds per document
BALANCED Mode (Recommended)
- Speed: Moderate processing speed
- Accuracy: High accuracy for most documents
- Use cases: General-purpose document processing
- Processing time: ~3-8 seconds per document
QUALITY Mode
- Speed: Slower but thorough processing
- Accuracy: Highest accuracy for complex documents
- Use cases: Critical documents, complex layouts
- Processing time: ~8-15 seconds per document
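The trade-offs between the three modes can be sketched as a small helper. This is only an illustration: `choose_mode` and `estimate_batch_seconds` are hypothetical names that are not part of Jinba, the selection heuristic is an assumption, and the timing table simply restates the approximate per-document figures listed above.

```python
# Approximate per-document processing times in seconds, taken from the
# mode descriptions above. These are rough ranges, not guarantees.
MODE_SECONDS = {
    "FAST": (1, 3),
    "BALANCED": (3, 8),
    "QUALITY": (8, 15),
}


def choose_mode(complex_layout: bool, critical: bool, high_volume: bool) -> str:
    """Hypothetical heuristic that maps document traits to an extraction mode."""
    if complex_layout or critical:
        return "QUALITY"   # critical documents, complex layouts
    if high_volume:
        return "FAST"      # high-volume, simple documents
    return "BALANCED"      # general-purpose default


def estimate_batch_seconds(mode: str, documents: int) -> tuple[int, int]:
    """Return a (best case, worst case) estimate, assuming sequential processing."""
    low, high = MODE_SECONDS[mode]
    return low * documents, high * documents
```

For example, 100 general-purpose documents in BALANCED mode would take roughly 300 to 800 seconds if processed one after another.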
Data Schema Design
Basic Schema Structure
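A basic schema might look like the sketch below. It assumes the extraction tool accepts standard JSON Schema keywords (`type`, `properties`, `required`, `description`). The exact dialect Jinba supports is not stated here, and the invoice field names are hypothetical.

```python
import json

# Hypothetical invoice schema written with standard JSON Schema keywords.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {
            "type": "string",
            "description": "Identifier printed on the invoice",
        },
        "issue_date": {
            "type": "string",
            "format": "date",
            "description": "Date the invoice was issued (YYYY-MM-DD)",
        },
        "total_amount": {
            "type": "number",
            "description": "Grand total including tax",
        },
    },
    "required": ["invoice_number", "total_amount"],
}

# Serialize to JSON text so it can be supplied wherever a schema is expected.
schema_json = json.dumps(invoice_schema, indent=2)
```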
Advanced Schema Features
- Nested objects: Complex data structures
- Arrays: Multiple items of the same type
- Conditional fields: Fields dependent on other values
- Pattern matching: Regex validation
- Format validation: Date, email, URL formats
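The features listed above can be combined in a single schema. The following sketch uses standard JSON Schema keywords (`items`, `pattern`, `format`, `enum`, and the draft-07 `if`/`then` pair). Whether Jinba honors every one of these keywords is an assumption, and the order fields are hypothetical.

```python
import json
import re

order_schema = {
    "type": "object",
    "properties": {
        # Nested object: a structured sub-record.
        "customer": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "email": {"type": "string", "format": "email"},  # format validation
            },
            "required": ["name"],
        },
        # Array: multiple items of the same type.
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    # Pattern matching: regex constraint on the value.
                    "sku": {"type": "string", "pattern": "^[A-Z]{3}-[0-9]{4}$"},
                    "quantity": {"type": "integer", "minimum": 1},
                },
                "required": ["sku", "quantity"],
            },
        },
        "payment_method": {"type": "string", "enum": ["card", "bank_transfer"]},
        "iban": {"type": "string"},
    },
    "required": ["customer", "line_items"],
    # Conditional field: iban becomes required only for bank transfers.
    "if": {
        "properties": {"payment_method": {"const": "bank_transfer"}},
        "required": ["payment_method"],
    },
    "then": {"required": ["iban"]},
}

# The pattern can be checked locally before the schema is used.
sku_pattern = order_schema["properties"]["line_items"]["items"]["properties"]["sku"]["pattern"]
schema_json = json.dumps(order_schema, indent=2)
```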
Validation Rules
Field-level Validation
- Type checking: String, number, boolean, array, object
- Range validation: Min/max values for numbers
- Length validation: Min/max length for strings
- Format validation: Email, date, URL patterns
- Enum validation: Allowed values from a list
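To make the field-level checks concrete, here is a minimal stand-alone sketch in plain Python. The rule keys (`type`, `min`, `max`, `min_length`, `max_length`, `format`, `enum`) are hypothetical and do not reflect the actual rule syntax of JINBA_MODULES_CHECKER_V2. The sketch also assumes range and length rules are always paired with a matching `type`.

```python
import re
from typing import Any

TYPE_MAP = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}

# Deliberately loose email check, for illustration only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_field(value: Any, rule: dict) -> list[str]:
    """Return a list of error messages for one value against one rule."""
    expected = rule.get("type")
    if expected is not None:
        # bool is a subclass of int in Python, so reject it for "number".
        if expected == "number" and isinstance(value, bool):
            return ["expected number"]
        if not isinstance(value, TYPE_MAP[expected]):
            return [f"expected {expected}"]

    errors = []
    if "min" in rule and value < rule["min"]:
        errors.append(f"must be >= {rule['min']}")
    if "max" in rule and value > rule["max"]:
        errors.append(f"must be <= {rule['max']}")
    if "min_length" in rule and len(value) < rule["min_length"]:
        errors.append(f"length must be >= {rule['min_length']}")
    if "max_length" in rule and len(value) > rule["max_length"]:
        errors.append(f"length must be <= {rule['max_length']}")
    if rule.get("format") == "email" and not EMAIL_RE.match(value):
        errors.append("invalid email format")
    if "enum" in rule and value not in rule["enum"]:
        errors.append(f"must be one of {rule['enum']}")
    return errors
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem with a field at once.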
Document-level Validation
- Required fields: Mandatory data presence
- Cross-field validation: Rules spanning multiple fields
- Business logic: Custom validation rules
- Consistency checks: Data coherence validation
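Document-level rules look at the record as a whole. The sketch below illustrates a required-field check, one cross-field rule, and one consistency check. The field names (`start_date`, `end_date`, `line_items`, `total`) and the function itself are hypothetical and are not the Jinba checker's API.

```python
def check_document(doc: dict, required: list[str]) -> list[str]:
    """Return document-level validation errors for an extracted record."""
    # Required fields: the value must be present and non-empty.
    errors = [
        f"missing required field: {name}"
        for name in required
        if doc.get(name) in (None, "")
    ]

    # Cross-field rule: ISO 8601 date strings (YYYY-MM-DD) sort chronologically,
    # so a plain string comparison is enough here.
    start, end = doc.get("start_date"), doc.get("end_date")
    if start and end and end < start:
        errors.append("end_date must not precede start_date")

    # Consistency check: line item amounts should add up to the stated total.
    items, total = doc.get("line_items"), doc.get("total")
    if items is not None and total is not None:
        computed = sum(item["amount"] for item in items)
        if abs(computed - total) > 0.005:  # tolerate sub-cent rounding
            errors.append(f"line items sum to {computed}, but total is {total}")

    return errors
```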
Use Cases
- Invoice Processing: Automated invoice data extraction and validation
- Document Digitization: Convert paper documents to structured data
- Data Migration: Extract data from legacy systems
- Compliance Checking: Validate documents against regulations
- Research Data: Extract structured data from research documents
- Form Processing: Automate form data extraction
- Contract Analysis: Extract key terms from contracts
- Financial Processing: Process financial statements and reports
Best Practices
Schema Design
- Keep schemas simple and focused
- Use clear, descriptive field names
- Include comprehensive descriptions
- Test schemas with sample data
- Version your schemas for consistency
Extraction Optimization
- Choose appropriate extraction mode for your use case
- Provide high-quality input documents
- Use consistent document formats when possible
- Monitor extraction accuracy and adjust as needed
Validation Strategy
- Implement layered validation (field → document → business)
- Provide clear error messages
- Log validation results for analysis
- Continuously improve validation rules based on results
Performance Considerations
- Batch similar documents together
- Use FAST mode for simple, high-volume processing
- Monitor processing times and adjust extraction modes
- Implement error handling for failed extractions
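The batching and error-handling advice above could be sketched as follows. The `extract` callable stands in for whatever actually invokes the extraction tool, and `doc_type`, `id`, and `mode_by_type` are hypothetical names. None of them are part of Jinba.

```python
from collections import defaultdict
from typing import Callable


def process_batch(
    documents: list[dict],
    extract: Callable[[dict, str], dict],
    mode_by_type: dict[str, str],
) -> dict:
    """Group similar documents, extract each one, and collect failures."""
    # Batch similar documents together by their declared type.
    groups = defaultdict(list)
    for doc in documents:
        groups[doc["doc_type"]].append(doc)

    results, failures = [], []
    for doc_type, group in groups.items():
        # Fall back to BALANCED when no mode is configured for this type.
        mode = mode_by_type.get(doc_type, "BALANCED")
        for doc in group:
            try:
                results.append(extract(doc, mode))
            except Exception as exc:
                # One failed extraction should not abort the whole batch.
                failures.append({"id": doc["id"], "error": str(exc)})

    return {"results": results, "failures": failures}
```

Collecting failures separately makes it possible to retry only those documents, for instance in a slower but more thorough mode.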