Overview

Beta Notice: Bulk Data Access is currently in beta. To learn more about this feature and discuss your requirements, please get in touch with us.

OpenRegisters Bulk Data Access lets you run our API software on your own infrastructure, giving you control over scaling and data privacy. It is suited to organizations that:

  • Need to process large volumes of company data
  • Have strict data privacy requirements
  • Want to integrate with internal datasets
  • Require unlimited API requests

Features

Self-Hosted Infrastructure

  • Deploy on your own servers or cloud infrastructure
  • Full control over scaling and performance
  • Keep sensitive data within your security perimeter
  • Integrate with existing systems seamlessly

Data Updates

  • Regular data updates from authoritative sources
  • Flexible update schedules
  • Delta updates to minimize bandwidth
  • Historical data preservation
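
As an illustration of how a delta update can be applied without re-downloading the full dataset, here is a minimal sketch. The delta layout used here (an `upserted` list of company records and a `deleted` list of company ids) is a hypothetical format chosen for the example, not the documented OpenRegisters delta format. The function returns a new snapshot and leaves the previous one untouched, which is one simple way to keep historical data.

```python
from typing import Any

# A company record keyed by field name, e.g. {"id": "...", "name": "..."}.
Record = dict[str, Any]


def apply_delta(current: dict[str, Record], delta: dict[str, list]) -> dict[str, Record]:
    """Return a new snapshot with the delta applied.

    `current` maps company id -> record. `delta` uses a HYPOTHETICAL layout:
      {"upserted": [record, ...], "deleted": [company_id, ...]}
    The input snapshot is not mutated, so it can be archived as history.
    """
    updated = dict(current)  # shallow copy: records themselves are shared
    for record in delta.get("upserted", []):
        updated[record["id"]] = record  # insert new or replace existing
    for company_id in delta.get("deleted", []):
        updated.pop(company_id, None)  # ignore ids that are already gone
    return updated
```

Only the changed records travel over the network, so the cost of an update scales with the size of the delta rather than the size of the dataset.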

Open Source Matching

  • Access to our company matching algorithms
  • Customize matching rules for your needs
  • Integrate with internal company databases
  • Maintain data quality standards

Getting Started

System Requirements

  • Linux/Unix-based operating system
  • Minimum 16GB RAM
  • At least 100GB of storage (actual needs vary with data volume)
  • PostgreSQL 13+
  • Docker support
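
Before installing, it can help to verify a host against the list above. The following is a rough pre-flight sketch using only the Python standard library. It assumes that "16GB" and "100GB" mean decimal gigabytes, that the storage figure refers to free space on the chosen path, and that Docker support means a `docker` executable on `PATH`. It does not check the PostgreSQL version. The function names are illustrative.

```python
import os
import shutil

GB = 10 ** 9  # decimal gigabyte; an assumption about how the requirements are stated


def find_problems(ram_bytes, free_disk_bytes, docker_path,
                  min_ram_gb=16, min_disk_gb=100):
    """Compare measured values against the minimums and return a list of problems."""
    problems = []
    if ram_bytes < min_ram_gb * GB:
        problems.append(f"RAM below {min_ram_gb}GB")
    if free_disk_bytes < min_disk_gb * GB:
        problems.append(f"free disk space below {min_disk_gb}GB")
    if docker_path is None:
        problems.append("docker executable not found on PATH")
    return problems


def detect_host(path="/"):
    """Measure total RAM, free disk space at `path`, and the docker location.

    Uses os.sysconf, which is available on Linux and most Unix-like systems.
    """
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    free_disk_bytes = shutil.disk_usage(path).free
    docker_path = shutil.which("docker")
    return ram_bytes, free_disk_bytes, docker_path
```

Calling `find_problems(*detect_host())` on the target machine returns an empty list when all three checks pass.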

Installation

  1. Download the OpenRegisters Docker images:
     docker pull openregisters/api
     docker pull openregisters/matcher
     docker pull openregisters/updater
  2. Set up the database:
     docker run openregisters/db-setup
  3. Configure your environment:
     cp .env.example .env
     # Edit .env with your configuration
  4. Start the services:
     docker-compose up -d

Data Schema

The bulk data follows the same schema as our API responses:

{
  "companies": [
    {
      "id": "string",
      "name": "string",
      "register_number": "string",
      "jurisdiction": "string",
      "status": "string",
      "registered_address": {
        "street": "string",
        "city": "string",
        "postal_code": "string",
        "country": "string"
      },
      "registration_date": "string",
      "legal_form": "string",
      "business_purpose": "string",
      "directors": [...],
      "shareholders": [...]
    }
  ]
}
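
To show how the schema above might be loaded into typed objects, here is a small sketch using Python dataclasses. The class and function names are illustrative. Because the structure of `directors` and `shareholders` is elided in the schema, the sketch keeps their items as opaque values rather than guessing at their fields.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Address:
    street: str
    city: str
    postal_code: str
    country: str


@dataclass
class Company:
    id: str
    name: str
    register_number: str
    jurisdiction: str
    status: str
    registered_address: Address
    registration_date: str
    legal_form: str
    business_purpose: str
    # Item structure is not specified in the schema, so items stay untyped.
    directors: list[Any] = field(default_factory=list)
    shareholders: list[Any] = field(default_factory=list)


def parse_companies(payload: dict[str, Any]) -> list[Company]:
    """Convert a decoded JSON payload of the form {"companies": [...]} into Company objects.

    Raises TypeError if a record has missing required fields or unknown fields.
    """
    companies = []
    for raw in payload["companies"]:
        fields = dict(raw)  # copy so the caller's payload is not modified
        fields["registered_address"] = Address(**fields["registered_address"])
        companies.append(Company(**fields))
    return companies
```

The parser is deliberately strict: an unexpected field fails loudly instead of being dropped, which makes schema drift between releases easier to notice.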

Best Practices

  1. Regular Updates

    • Schedule updates during off-peak hours
    • Implement delta updates for efficiency
    • Maintain data freshness
  2. Performance Optimization

    • Index frequently queried fields
    • Scale horizontally for high availability
    • Cache common queries
  3. Data Privacy

    • Implement access controls
    • Encrypt sensitive data
    • Run regular security audits
  4. Integration

    • Use the matcher for deduplication
    • Maintain data lineage
    • Document custom implementations
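
As one way to approach "Cache common queries" from the Performance Optimization items, the sketch below shows a small in-process cache with a time-to-live, so cached results expire on their own and can also be cleared after a data update. The `TTLCache` class is illustrative and not part of the OpenRegisters software. A shared cache such as Redis would be the usual choice once several API instances run side by side.

```python
import time
from typing import Any, Callable, Hashable


class TTLCache:
    """In-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[Hashable, tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        """Return the cached value for `key`, calling `compute` on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = compute()
        self._entries[key] = (now, value)
        return value

    def clear(self) -> None:
        """Drop all entries, for example right after a data update has been applied."""
        self._entries.clear()
```

A typical call wraps the expensive lookup in a lambda, for example `cache.get_or_compute(("company", company_id), lambda: load_company(company_id))`, where `load_company` stands in for whatever query function your deployment uses.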

Support

Enterprise customers with Bulk Data Access receive:

  • Dedicated support channel
  • Implementation assistance
  • Custom feature development
  • Regular technical reviews

Contact our enterprise team for pricing and implementation details.