bamr87/README

Documentation Aggregator

A centralized documentation hub that automatically aggregates, organizes, and maintains documentation from multiple source repositories.

🎯 What is this?

This repository is a Documentation Aggregator that:

  • Pulls documentation from multiple source repositories
  • Organizes files by category (API, user guides, setup, etc.)
  • Adds/updates YAML front matter for better metadata
  • Maintains quality through automated checks and validation
  • Provides a centralized, searchable documentation repository

The repository currently holds 2,700+ documentation files aggregated from multiple source repositories.

🚀 Quick Start

Prerequisites

  • Python 3.12+
  • Git
  • GitHub account (for Actions automation)

Setup

# Clone the repository
git clone https://github.com/bamr87/README.git
cd README

# Install dependencies
python3 -m pip install -r requirements.txt

# Configure source repositories
# Edit repos.txt and add your repository URLs (one per line)
nano repos.txt

# Run aggregation manually
bash scripts/aggregate.sh

📁 Repository Structure

.
├── docs/                    # Organized documentation (2,700+ files)
│   ├── api/                # API documentation
│   ├── architecture/       # Architecture guides
│   ├── development/        # Development guides
│   ├── misc/               # Miscellaneous docs
│   ├── results/            # Test and analysis results
│   ├── setup/              # Setup and installation guides
│   └── user-guides/        # User guides and tutorials
├── scripts/                # Processing and utility scripts
│   ├── aggregate.py        # Main aggregation logic
│   ├── aggregate.sh        # Bash orchestration script
│   ├── process.py          # Document processing
│   ├── check_frontmatter.py    # Validate YAML front matter
│   ├── clean_frontmatter.py    # Normalize front matter
│   ├── lint_docs.py        # Markdown linting
│   ├── fix_h1.py           # Fix heading issues
│   ├── fix_whitespace.py   # Clean whitespace
│   └── generate_docs_report.py # Generate reports
├── tests/                  # Comprehensive test suite
│   ├── unit/              # Unit tests
│   ├── integration/       # Integration tests
│   └── fixtures/          # Test data and fixtures
├── .github/workflows/      # GitHub Actions automation
│   ├── aggregate-docs.yaml         # Daily aggregation
│   ├── docs-quality-check.yaml     # Quality validation
│   └── docs-apply-fixes.yaml       # Auto-fix issues
├── repos.txt              # Source repository list
├── requirements.txt       # Python dependencies
└── README.md             # This file

🔄 How It Works

Automated Workflow

  1. A GitHub Actions workflow runs daily (or when triggered manually)
  2. aggregate.sh reads repos.txt and clones each repository
  3. Python scripts discover and process documentation files:
    • Extract markdown files
    • Categorize content by topic
    • Generate/update YAML front matter
    • Organize into appropriate directories
  4. Quality checks validate the processed documentation
  5. Auto-fixes apply corrections where possible
  6. Changes are committed back to the repository
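The discovery and organization steps above can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/aggregate.py: the function names and the docs/<category>/<repo>/ target layout are assumptions made for the example.

```python
from pathlib import Path


def discover_markdown(repo_dir: Path) -> list[Path]:
    """Return markdown files under repo_dir, skipping anything inside .git."""
    return sorted(
        path
        for path in repo_dir.rglob("*.md")
        if ".git" not in path.relative_to(repo_dir).parts
    )


def destination_for(
    source: Path, repo_dir: Path, category: str, docs_root: Path = Path("docs")
) -> Path:
    """Build a target path of the (assumed) form docs/<category>/<repo>/<relative path>."""
    return docs_root / category / repo_dir.name / source.relative_to(repo_dir)
```

Keeping the repository name in the target path is one way to avoid collisions when two source repositories both contain a file such as README.md.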

Categorization

Documents are automatically categorized based on content analysis:

  • api: API references, endpoints, REST documentation
  • user-guides: Tutorials, how-tos, user documentation
  • setup: Installation, configuration, getting started guides
  • development: Development guides, contributor docs
  • architecture: System architecture, design documents
  • results: Test and analysis results
  • misc: Other documentation
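One simple way to approximate this kind of content-based categorization is keyword counting. The sketch below is illustrative only: the keyword lists are hypothetical, loosely derived from the category descriptions above, and the actual analysis in the repository's scripts may work differently.

```python
import re

# Hypothetical keyword lists, not the rules used by the real scripts.
CATEGORY_KEYWORDS = {
    "api": ("api", "endpoint", "rest"),
    "user-guides": ("tutorial", "how-to", "guide"),
    "setup": ("install", "installation", "configuration", "getting started"),
    "development": ("contributing", "contributor", "development"),
    "architecture": ("architecture", "design"),
}


def categorize(text: str) -> str:
    """Return the category whose keywords occur most often, or 'misc' if none match."""
    lowered = text.lower()
    scores = {
        category: sum(
            len(re.findall(r"\b" + re.escape(word) + r"\b", lowered))
            for word in words
        )
        for category, words in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "misc"
```

For example, categorize("REST API endpoint reference") selects api, while text that matches no keyword falls through to misc.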

⚙️ Configuration

Adding Source Repositories

Edit repos.txt to add repository URLs:

# Add repositories (one per line)
https://github.com/username/repo1
https://github.com/username/repo2
https://github.com/org/another-repo

Examples from the current configuration:

https://github.com/bamr87/scripts
https://github.com/bamr87/barodybroject
https://github.com/bamr87/bashcrawl
https://github.com/bamr87/ai-evolution-engine-seed
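A file in this format is straightforward to read programmatically. The sketch below assumes that blank lines and lines starting with # are ignored, as the commented example above suggests; parse_repo_list and repo_slug are hypothetical helper names, not functions from this repository.

```python
from urllib.parse import urlparse


def parse_repo_list(text: str) -> list[str]:
    """Return repository URLs from repos.txt content, skipping blanks and # comments."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        urls.append(line)
    return urls


def repo_slug(url: str) -> str:
    """Turn https://github.com/owner/name (optionally ending in .git) into 'owner/name'."""
    path = urlparse(url).path.strip("/")
    return path.removesuffix(".git")
```

The owner/name slug has the same shape as the source_repo value shown in the front matter example later in this README.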

GitHub Actions

The repository includes three automated workflows:

1. Documentation Aggregation

File: .github/workflows/aggregate-docs.yaml

  • Runs daily at midnight (GitHub Actions evaluates cron schedules in UTC)
  • Can be triggered manually
  • Clones source repos and processes documentation

2. Quality Checks

File: .github/workflows/docs-quality-check.yaml

  • Validates YAML front matter
  • Lints markdown files
  • Checks for formatting issues

3. Auto-Fixes

File: .github/workflows/docs-apply-fixes.yaml

  • Automatically fixes common issues
  • Normalizes front matter
  • Cleans whitespace

Wiki.js Integration

This repository includes an integrated Wiki.js instance for collaborative documentation management.

Quick Start

# Navigate to the README directory
cd README

# Copy environment template
cp .env.example .env

# Start Wiki.js with Docker Compose
docker-compose up -d

# Access Wiki.js at http://localhost:3000

Features

  • Modern Wiki Platform: Beautiful, responsive interface with real-time editing
  • Multi-format Support: Markdown, HTML, AsciiDoc, and visual editor
  • Full-text Search: Advanced search with filtering and indexing
  • Version Control: Built-in Git integration and page history
  • Access Control: User management with granular permissions
  • Extensibility: Plugin system and GraphQL API

Documentation

For complete setup instructions, configuration options, and usage guides, see:

Programming Languages

  • Python: Scripts, libraries, and frameworks
  • JavaScript/TypeScript: Web development, Node.js, React
  • Go: Systems programming and cloud-native development
  • Rust: Performance-critical applications
  • Java: Enterprise applications and Android

🛠️ Available Scripts

Core Processing

# Run full aggregation workflow
bash scripts/aggregate.sh

# Process documents with Python
python scripts/process.py

Quality Tools

# Check YAML front matter
python scripts/check_frontmatter.py

# Fix missing front matter fields
python scripts/check_frontmatter.py --fix

# Lint markdown files
python scripts/lint_docs.py

# Clean and normalize front matter
python scripts/clean_frontmatter.py

# Fix H1 heading issues
python scripts/fix_h1.py

# Fix whitespace issues
python scripts/fix_whitespace.py

# Generate documentation report
python scripts/generate_docs_report.py
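To give a sense of what a whitespace fix of this kind involves, here is a minimal sketch. It is not the implementation in scripts/fix_whitespace.py; clean_whitespace is a hypothetical function, and the three rules it applies (strip trailing whitespace, collapse repeated blank lines, end with a single newline) are assumptions about what such a cleanup typically does.

```python
import re


def clean_whitespace(text: str) -> str:
    """Strip trailing whitespace, collapse runs of blank lines, end with one newline."""
    lines = [line.rstrip() for line in text.splitlines()]
    cleaned = "\n".join(lines)
    # Three or more consecutive newlines mean two or more blank lines; keep one.
    cleaned = re.sub(r"\n{3,}", "\n\n", cleaned)
    return cleaned.strip("\n") + "\n"
```

Note that this naive version also removes the two trailing spaces Markdown uses for hard line breaks and touches whitespace inside code blocks, so a production script would likely need to be more selective.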

Batch Operations

# Run all documentation checks
bash scripts/run_doc_checks.sh

🧪 Testing

The repository includes a comprehensive testing framework:

# Run all tests
python tests/test_runner.py

# Run unit tests only
python tests/test_runner.py --type unit

# Run integration tests only
python tests/test_runner.py --type integration

# Test specific repositories
python tests/test_runner.py --type integration --repos https://github.com/user/repo

See tests/README.md for detailed testing documentation.

📊 Documentation Statistics

  • Total Files: 2,700+ markdown documents
  • Categories: 7 main categories
  • Source Repositories: Multiple GitHub repositories
  • Update Frequency: Daily automated aggregation
  • Quality Checks: Automated linting and validation

🔍 Front Matter Structure

Each processed document includes YAML front matter:

---
title: Document Title
tags: [tag1, tag2, tag3]
category: api
summary: Brief description of the document
source_repo: username/repository-name
---

🤝 Contributing

Adding Documentation

  1. Add your repository to repos.txt
  2. Run aggregation: bash scripts/aggregate.sh
  3. Review organized documentation in docs/
  4. Submit a pull request

Improving Scripts

  1. Modify scripts in scripts/ directory
  2. Add/update tests in tests/
  3. Run tests: python tests/test_runner.py
  4. Submit a pull request

Development Guidelines

  • Follow existing code style
  • Add tests for new functionality
  • Update documentation
  • Ensure all tests pass before submitting

📋 Requirements

Python packages (from requirements.txt):

  • pyyaml>=6.0 - YAML parsing and generation
  • requests>=2.31.0 - HTTP requests
  • nltk>=3.8.1 - Natural language processing
  • pytest>=7.0 - Testing framework

🐛 Troubleshooting

Common Issues

Permission Errors

chmod +x scripts/*.sh

Dependency Issues

python3 -m pip install -r requirements.txt

Repository Access

  • Ensure repositories in repos.txt are public or you have access
  • Check GitHub token permissions for private repos

Empty Results

  • Verify repository URLs are correct
  • Check that source repos contain markdown files
  • Review logs in GitHub Actions

📖 Additional Documentation

📄 License

This repository is personal infrastructure. Aggregated documentation retains the original license of the repository it came from.

🆘 Support

  • Issues: Open a GitHub issue
  • Documentation: Check the docs/ directory
  • Questions: Use GitHub Discussions

README - Centralized documentation aggregation and organization system.
