Bulk Data Access (Beta) - OpenRegisters Docs

Overview

Beta Notice: Bulk Data Access is currently in beta. To learn more about this feature and discuss your requirements, please get in touch with us.

OpenRegisters Bulk Data Access lets you run our API software on your own infrastructure, giving you full control over scaling and data privacy. It is a good fit for organizations that:

Need to process large volumes of company data
Have strict data privacy requirements
Want to integrate with internal datasets
Require unlimited API requests

Features

Self-Hosted Infrastructure

Deploy on your own servers or cloud infrastructure
Full control over scaling and performance
Keep sensitive data within your security perimeter
Integrate with existing systems seamlessly

Data Updates

Regular data updates from authoritative sources
Flexible update schedules
Delta updates to minimize bandwidth
Historical data preservation
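
To make the idea of a delta update concrete: instead of reloading the full dataset, a delta carries only the records that changed since the last sync. The sketch below applies such a delta to an in-memory store keyed by company `id`. The delta shape used here (`upserts` and `deletes` keys) and the function name `apply_delta` are assumptions for illustration only; the format actually produced by the updater is not described on this page.

```python
def apply_delta(store: dict, delta: dict) -> dict:
    """Return a new store with the delta applied; the input store is not modified.

    `store` maps company id -> record.
    `delta` uses a HYPOTHETICAL shape: {"upserts": [record, ...], "deletes": [id, ...]}.
    """
    updated = dict(store)
    # New or changed records replace whatever was stored under the same id.
    for record in delta.get("upserts", []):
        updated[record["id"]] = record
    # Deleted ids are removed; ids that are already absent are ignored.
    for company_id in delta.get("deletes", []):
        updated.pop(company_id, None)
    return updated
```

Returning a new dictionary rather than mutating the old one makes it easy to keep the previous snapshot around if you also want to preserve history.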

Open Source Matching

Access to our company matching algorithms
Customize matching rules for your needs
Integrate with internal company databases
Maintain data quality standards

Getting Started

System Requirements

Linux/Unix-based operating system
Minimum 16GB RAM
100GB storage (varies based on data volume)
PostgreSQL 13+
Docker support
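
Before installing, it can help to check a host against the list above. The following is a minimal sketch using only the Python standard library. The names `HostFacts`, `probe_host`, and `check_requirements` are illustrative and not part of OpenRegisters. It does not check the PostgreSQL version, and it only confirms that a `docker` executable is on the `PATH`, not that the daemon is running.

```python
import os
import shutil
from dataclasses import dataclass

GB = 10**9  # decimal gigabytes; a host sold as "16GB" usually reports a bit under 16 GiB


@dataclass
class HostFacts:
    is_posix: bool
    ram_bytes: int
    free_disk_bytes: int
    has_docker: bool


def probe_host(data_dir: str = ".") -> HostFacts:
    """Collect facts about the current machine."""
    ram = 0
    if hasattr(os, "sysconf"):
        try:
            ram = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
        except (ValueError, OSError):
            ram = 0  # physical memory size is not exposed on this platform
    return HostFacts(
        is_posix=(os.name == "posix"),
        ram_bytes=ram,
        free_disk_bytes=shutil.disk_usage(data_dir).free,
        has_docker=shutil.which("docker") is not None,
    )


def check_requirements(host: HostFacts, min_ram_gb: int = 16, min_disk_gb: int = 100) -> list:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if not host.is_posix:
        problems.append("a Linux/Unix-based operating system is required")
    if host.ram_bytes < min_ram_gb * GB:
        problems.append(f"at least {min_ram_gb}GB RAM is required")
    if host.free_disk_bytes < min_disk_gb * GB:
        problems.append(f"at least {min_disk_gb}GB free storage is required")
    if not host.has_docker:
        problems.append("docker was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_requirements(probe_host()):
        print("WARNING:", problem)
```

Separating `probe_host` from `check_requirements` keeps the thresholds easy to adjust, since the storage figure varies with data volume.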

Installation

Download the OpenRegisters Docker images:

docker pull openregisters/api
docker pull openregisters/matcher
docker pull openregisters/updater

Set up the database:

docker run openregisters/db-setup

Configure your environment:

cp .env.example .env
# Edit .env with your configuration

Start the services:

docker-compose up -d

Data Schema

The bulk data follows the same schema as our API responses:

{
  "companies": [
    {
      "id": "string",
      "name": "string",
      "register_number": "string",
      "jurisdiction": "string",
      "status": "string",
      "registered_address": {
        "street": "string",
        "city": "string",
        "postal_code": "string",
        "country": "string"
      },
      "registration_date": "string",
      "legal_form": "string",
      "business_purpose": "string",
      "directors": [...],
      "shareholders": [...]
    }
  ]
}
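
As an illustration of consuming this schema, the sketch below loads a document of the shape shown above into typed Python records. It rests on a few assumptions that this page does not confirm: missing or null text fields are treated as empty strings, and `directors` and `shareholders` are kept as opaque lists because their item structure is elided above. The class and function names are illustrative.

```python
import json
from dataclasses import dataclass, field

ADDRESS_FIELDS = ("street", "city", "postal_code", "country")
TEXT_FIELDS = (
    "id", "name", "register_number", "jurisdiction", "status",
    "registration_date", "legal_form", "business_purpose",
)


@dataclass
class Address:
    street: str = ""
    city: str = ""
    postal_code: str = ""
    country: str = ""


@dataclass
class Company:
    id: str
    name: str
    register_number: str
    jurisdiction: str
    status: str
    registration_date: str
    legal_form: str
    business_purpose: str
    registered_address: Address = field(default_factory=Address)
    # Item structure is not shown in the schema, so entries are stored untouched.
    directors: list = field(default_factory=list)
    shareholders: list = field(default_factory=list)


def parse_company(raw: dict) -> Company:
    """Build a Company from one entry of the "companies" array."""
    addr_raw = raw.get("registered_address") or {}
    address = Address(**{key: addr_raw.get(key) or "" for key in ADDRESS_FIELDS})
    text = {key: raw.get(key) or "" for key in TEXT_FIELDS}  # assumption: absent -> ""
    return Company(
        registered_address=address,
        directors=list(raw.get("directors") or []),
        shareholders=list(raw.get("shareholders") or []),
        **text,
    )


def parse_bulk(document: str) -> list:
    """Parse a JSON document with a top-level "companies" array."""
    payload = json.loads(document)
    return [parse_company(item) for item in payload.get("companies", [])]
```

For example, `parse_bulk(open("companies.json").read())` would return a list of `Company` objects, where `companies.json` is a placeholder file name.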

Best Practices

Regular Updates

- Schedule updates during off-peak hours
- Implement delta updates for efficiency
- Maintain data freshness

Performance Optimization

- Index frequently queried fields
- Scale horizontally for high availability
- Cache common queries

Data Privacy

- Implement access controls
- Encrypt sensitive data
- Run regular security audits

Integration

- Use the matcher for deduplication
- Maintain data lineage
- Document custom implementations
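
As one way to act on the "cache common queries" advice, the sketch below wraps a slow lookup in an in-process LRU cache using Python's `functools.lru_cache`. The helper name `make_cached_lookup` and the `fetch(jurisdiction, register_number)` signature are assumptions for illustration; substitute whatever function queries your own deployment.

```python
from functools import lru_cache


def make_cached_lookup(fetch, maxsize: int = 1024):
    """Wrap `fetch(jurisdiction, register_number)` in an LRU cache.

    `fetch` is a placeholder for your own query function. Both arguments
    must be hashable, which is true for plain strings.
    """

    @lru_cache(maxsize=maxsize)
    def cached(jurisdiction: str, register_number: str):
        return fetch(jurisdiction, register_number)

    return cached
```

Repeated calls with the same arguments are answered from memory. Because the cache hands back the same object each time, treat results as read-only, and call `cache_clear()` on the returned function after each data update so that stale records are not served.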

Support

Enterprise customers with bulk data access receive:

Dedicated support channel
Implementation assistance
Custom feature development
Regular technical reviews

Contact our enterprise team for pricing and implementation details.
