A centralized documentation hub that automatically aggregates, organizes, and maintains documentation from multiple source repositories.
This repository is a Documentation Aggregator that:
- Pulls documentation from multiple source repositories
- Organizes files by category (API, user guides, setup, etc.)
- Adds/updates YAML front matter for better metadata
- Maintains quality through automated checks and validation
- Provides a centralized, searchable documentation repository
Currently maintaining 2,700+ documentation files from multiple source repositories.
- Python 3.12+
- Git
- GitHub account (for Actions automation)
# Clone the repository
git clone https://github.com/bamr87/README.git
cd README
# Install dependencies
python3 -m pip install -r requirements.txt
# Configure source repositories
# Edit repos.txt and add your repository URLs (one per line)
nano repos.txt
# Run aggregation manually
bash scripts/aggregate.sh
.
├── docs/                          # Organized documentation (2700+ files)
│   ├── api/                       # API documentation
│   ├── architecture/              # Architecture guides
│   ├── development/               # Development guides
│   ├── misc/                      # Miscellaneous docs
│   ├── results/                   # Test and analysis results
│   ├── setup/                     # Setup and installation guides
│   └── user-guides/               # User guides and tutorials
├── scripts/                       # Processing and utility scripts
│   ├── aggregate.py               # Main aggregation logic
│   ├── aggregate.sh               # Bash orchestration script
│   ├── process.py                 # Document processing
│   ├── check_frontmatter.py       # Validate YAML front matter
│   ├── clean_frontmatter.py       # Normalize front matter
│   ├── lint_docs.py               # Markdown linting
│   ├── fix_h1.py                  # Fix heading issues
│   ├── fix_whitespace.py          # Clean whitespace
│   └── generate_docs_report.py    # Generate reports
├── tests/                         # Comprehensive test suite
│   ├── unit/                      # Unit tests
│   ├── integration/               # Integration tests
│   └── fixtures/                  # Test data and fixtures
├── .github/workflows/             # GitHub Actions automation
│   ├── aggregate-docs.yaml        # Daily aggregation
│   ├── docs-quality-check.yaml    # Quality validation
│   └── docs-apply-fixes.yaml      # Auto-fix issues
├── repos.txt                      # Source repository list
├── requirements.txt               # Python dependencies
└── README.md                      # This file
- GitHub Actions triggers daily (or manually)
- aggregate.sh reads repos.txt and clones each repository
- Python scripts discover and process documentation files:
  - Extract markdown files
  - Categorize content by topic
  - Generate/update YAML front matter
  - Organize into appropriate directories
- Quality checks validate the processed documentation
- Auto-fixes apply corrections where possible
- Changes are committed back to the repository
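The "extract markdown files" step above can be illustrated with a short sketch. This is not the code in scripts/aggregate.py; the function name `find_markdown_files` and the `SKIP_DIRS` list are assumptions made for the example.

```python
from pathlib import Path

# Directories assumed not to hold documentation worth aggregating (hypothetical list).
SKIP_DIRS = {".git", "node_modules", ".venv"}


def find_markdown_files(repo_root: Path) -> list[Path]:
    """Return all .md files under repo_root, sorted, skipping SKIP_DIRS."""
    found = []
    for path in repo_root.rglob("*.md"):
        # Ignore files that live inside any skipped directory.
        if SKIP_DIRS.isdisjoint(path.relative_to(repo_root).parts):
            found.append(path)
    return sorted(found)
```

Calling `find_markdown_files(Path("some-cloned-repo"))` on a cloned checkout would return the candidate files for the later categorization and front matter steps.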
Documents are automatically categorized based on content analysis:
- api: API references, endpoints, REST documentation
- user-guides: Tutorials, how-tos, user documentation
- setup: Installation, configuration, getting started guides
- development: Development guides, contributor docs
- architecture: System architecture, design documents
- misc: Other documentation
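As a rough illustration of content-based categorization, the sketch below scores a document against keyword lists and falls back to misc. The keyword lists and the `categorize` function are hypothetical; the actual analysis in this repository may work differently (for example, it may use nltk).

```python
# Hypothetical keyword lists; the real categorization rules may differ.
CATEGORY_KEYWORDS = {
    "api": ["api", "endpoint", "rest"],
    "user-guides": ["tutorial", "how-to", "user guide"],
    "setup": ["install", "configuration", "getting started"],
    "development": ["contributing", "contributor", "development"],
    "architecture": ["architecture", "design"],
}


def categorize(text: str) -> str:
    """Pick the category whose keywords occur most often; fall back to misc."""
    lowered = text.lower()
    scores = {
        category: sum(lowered.count(word) for word in words)
        for category, words in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)  # ties resolve to the first category listed
    return best if scores[best] > 0 else "misc"
```

Substring counting is deliberately naive here ("rest" also matches "restore"); a production version would likely tokenize the text first.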
Edit repos.txt to add repository URLs:
# Add repositories (one per line)
https://github.com/username/repo1
https://github.com/username/repo2
https://github.com/org/another-repo
Examples from current configuration:
https://github.com/bamr87/scripts
https://github.com/bamr87/barodybroject
https://github.com/bamr87/bashcrawl
https://github.com/bamr87/ai-evolution-engine-seed
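A minimal sketch of how a repos.txt file in this format could be read, assuming that blank lines and lines starting with # are ignored (the example above contains a comment line, but the exact parsing rules of aggregate.sh are not shown here). The function `parse_repo_list` is hypothetical.

```python
from urllib.parse import urlparse


def parse_repo_list(text: str) -> list[tuple[str, str]]:
    """Return (url, "owner/name") pairs, skipping blank lines and # comments."""
    repos = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # The URL path looks like "/owner/name" or "/owner/name.git".
        slug = urlparse(line).path.strip("/").removesuffix(".git")
        repos.append((line, slug))
    return repos
```

The "owner/name" slug has the same shape as the source_repo field used in the front matter of processed documents.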
The repository includes three automated workflows:
File: .github/workflows/aggregate-docs.yaml
- Runs daily at midnight (GitHub Actions schedules are evaluated in UTC)
- Can be triggered manually
- Clones source repos and processes documentation
bamr87 includes an integrated Wiki.js instance for modern, collaborative documentation management.
# Navigate to README directory
cd README
# Copy environment template
cp .env.example .env
# Start Wiki.js with Docker Compose
docker-compose up -d
# Access Wiki.js at http://localhost:3000
- Modern Wiki Platform: Beautiful, responsive interface with real-time editing
- Multi-format Support: Markdown, HTML, AsciiDoc, and visual editor
- Full-text Search: Advanced search with filtering and indexing
- Version Control: Built-in Git integration and page history
- Access Control: User management with granular permissions
- Extensibility: Plugin system and GraphQL API
For complete setup instructions, configuration options, and usage guides, see:
- Wiki.js Setup Guide
- Official Docs: https://docs.requarks.io/
- Python: Scripts, libraries, and frameworks
- JavaScript/TypeScript: Web development, Node.js, React
- Go: Systems programming and cloud-native development
- Rust: Performance-critical applications
- Java: Enterprise applications and Android
File: .github/workflows/docs-quality-check.yaml
- Validates YAML front matter
- Lints markdown files
- Checks for formatting issues
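The front matter validation can be pictured with the following sketch, which reports required fields missing from a document. It assumes the five fields shown in this README's front matter example are all required, and it uses a simple line scan rather than a real YAML parser; the actual check_frontmatter.py may instead rely on PyYAML (for example `yaml.safe_load`). The function name is hypothetical.

```python
# Assumed required fields, taken from the front matter example in this README.
REQUIRED_FIELDS = ("title", "tags", "category", "summary", "source_repo")


def missing_front_matter_fields(markdown: str) -> list[str]:
    """Return required fields absent from the leading front matter block."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return list(REQUIRED_FIELDS)  # no front matter at all
    present = set()
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the block
            break
        key, sep, _ = line.partition(":")
        # Only unindented "key: value" lines count as top-level fields.
        if sep and key and not key[0].isspace():
            present.add(key.strip())
    return [field for field in REQUIRED_FIELDS if field not in present]
```

An empty result means every assumed field is present; a non-empty list is what a quality check could report as a failure.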
File: .github/workflows/docs-apply-fixes.yaml
- Automatically fixes common issues
- Normalizes front matter
- Cleans whitespace
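A whitespace cleanup of the kind listed above might look like the sketch below. This is an assumption about what "cleans whitespace" means (trailing spaces and tabs removed, exactly one newline at end of file), not the behavior of fix_whitespace.py itself.

```python
def clean_whitespace(text: str) -> str:
    """Strip trailing spaces/tabs per line and end the file with one newline."""
    lines = [line.rstrip(" \t") for line in text.splitlines()]
    # Drop blank lines at the end of the file.
    while lines and lines[-1] == "":
        lines.pop()
    return "\n".join(lines) + "\n" if lines else ""
```

Note that two trailing spaces are a hard line break in Markdown, so a real fixer may need to preserve them where they are intentional.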
# Run full aggregation workflow
bash scripts/aggregate.sh
# Process documents with Python
python scripts/process.py
# Check YAML front matter
python scripts/check_frontmatter.py
# Fix missing front matter fields
python scripts/check_frontmatter.py --fix
# Lint markdown files
python scripts/lint_docs.py
# Clean and normalize front matter
python scripts/clean_frontmatter.py
# Fix H1 heading issues
python scripts/fix_h1.py
# Fix whitespace issues
python scripts/fix_whitespace.py
# Generate documentation report
python scripts/generate_docs_report.py
# Run all documentation checks
bash scripts/run_doc_checks.sh
The repository includes a comprehensive testing framework:
# Run all tests
python tests/test_runner.py
# Run unit tests only
python tests/test_runner.py --type unit
# Run integration tests only
python tests/test_runner.py --type integration
# Test specific repositories
python tests/test_runner.py --type integration --repos https://github.com/user/repo
See tests/README.md for detailed testing documentation.
- Total Files: 2,700+ markdown documents
- Categories: 7 main categories
- Source Repositories: Multiple GitHub repositories
- Update Frequency: Daily automated aggregation
- Quality Checks: Automated linting and validation
Each processed document includes YAML front matter:
---
title: Document Title
tags: [tag1, tag2, tag3]
category: api
summary: Brief description of the document
source_repo: username/repository-name
---
- Add your repository to repos.txt
- Run aggregation: bash scripts/aggregate.sh
- Review organized documentation in docs/
- Submit a pull request
- Modify scripts in the scripts/ directory
- Add/update tests in tests/
- Run tests: python tests/test_runner.py
- Submit a pull request
- Follow existing code style
- Add tests for new functionality
- Update documentation
- Ensure all tests pass before submitting
Python packages (from requirements.txt):
- pyyaml>=6.0 - YAML parsing and generation
- requests>=2.31.0 - HTTP requests
- nltk>=3.8.1 - Natural language processing
- pytest>=7.0 - Testing framework
Permission Errors
chmod +x scripts/*.sh
Dependency Issues
pip install -r requirements.txt
Repository Access
- Ensure repositories in repos.txt are public or you have access
- Check GitHub token permissions for private repos
Empty Results
- Verify repository URLs are correct
- Check that source repos contain markdown files
- Review logs in GitHub Actions
- Product Requirements Document - Detailed MVP specifications
- Documentation Checks Guide - Quality tools overview
- Testing Framework - Comprehensive testing guide
- Testing Framework Details - Test architecture
This repository is personal infrastructure. Source documentation retains original licenses from respective repositories.
- Issues: Open a GitHub issue
- Documentation: Check the docs/ directory
- Questions: Use GitHub Discussions
README - Centralized documentation aggregation and organization system.