DataHub gives you simple, scalable access to distributed data by bringing the data close to the computing resources that use it. It also lets you publish a dataset and make it available to a specific community, or worldwide, across federated sites. DataHub is based on the Onedata technology.
You can access the service for evaluation using the PLAYGROUND shared space, or request support to publish your data and have dedicated storage assigned.
- Discovery of data via a central portal.
- Access to data conforming to required policies, which may be:
  - unauthenticated open access;
  - access after user registration; or
  - access restricted to members of a Virtual Organization (VO).
- Access to data via GUI, POSIX, and CDMI (Cloud Data Management Interface).
- Replication of data from data providers for resiliency and availability purposes. Replication may take place either on-demand or automatically.
- Integration of the Authentication and Authorization Infrastructure (AAI) between the EGI DataHub, other EGI components, and the existing infrastructure of user communities.
- Metadata and shares management.
- Data import and data caching based on file popularity.
- Support for many storage backends (Ceph, S3, GlusterFS, POSIX, etc.).
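The three access policies listed above (open, registered, VO-restricted) can be illustrated with a short sketch. This is not DataHub or Onedata code: the names Policy, User, and can_access are hypothetical, and the sketch assumes that VO membership alone is enough to satisfy a VO-restricted policy. It only shows how the three levels relate to one another.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    """Hypothetical labels for the three access policies in the list above."""
    OPEN = "open"              # unauthenticated open access
    REGISTERED = "registered"  # access after user registration
    VO_MEMBERS = "vo_members"  # restricted to members of a Virtual Organization


@dataclass(frozen=True)
class User:
    """Hypothetical user record; not part of any DataHub API."""
    registered: bool = False
    vos: frozenset = frozenset()  # names of the VOs the user belongs to


def can_access(policy, user=None, required_vo=None):
    """Return True if `user` satisfies `policy`; `user=None` means anonymous."""
    if policy is Policy.OPEN:
        return True                      # anyone, including anonymous users
    if user is None:
        return False                     # every other policy needs an identity
    if policy is Policy.REGISTERED:
        return user.registered
    if policy is Policy.VO_MEMBERS:
        # Assumption: membership in the required VO is sufficient on its own.
        return required_vo is not None and required_vo in user.vos
    return False
```

For example, can_access(Policy.VO_MEMBERS, User(True, frozenset({"vo.example.org"})), "vo.example.org") is accepted, while an anonymous request against any policy other than Policy.OPEN is rejected. The VO name here is a placeholder.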
TRL 8: Actual system proven in operational environment.
Integrated data access services, computing and domain-specific processing tools for big data analysis. EGI Solution....