Structuring your data in a data warehouse, streaming platform, or data processing engine is a challenge, but an even greater challenge is adding a new data source to an existing ingestion infrastructure.
At Vertiv, we have addressed this by building platform components that let you plug and play any source into a canonical model. These canonical models are built from ontologies for the entities, so you do not have to worry about complex mapping schemes to get your data stream ingested.
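As an illustration, here is a minimal sketch of what plugging a source into a canonical model can look like: a canonical entity plus a per-source field mapping of the kind an ontology could generate. The entity, source names, and fields below are hypothetical examples, not our actual platform API.

```python
# A minimal sketch, assuming a hypothetical canonical "Asset" entity and a
# simple field-level mapping derived from an ontology. All names are illustrative.
from dataclasses import dataclass

@dataclass
class Asset:
    """Canonical entity shared by all ingested sources."""
    asset_id: str
    site: str
    power_kw: float

# Per-source mapping: raw source field name -> canonical field name.
# In practice these mappings would be generated from the ontology.
SOURCE_MAPPINGS = {
    "bms_feed":   {"deviceId": "asset_id", "location": "site", "kw": "power_kw"},
    "sensor_csv": {"id": "asset_id", "plant": "site", "power": "power_kw"},
}

def to_canonical(source: str, record: dict) -> Asset:
    """Translate a raw source record into the canonical Asset entity."""
    mapping = SOURCE_MAPPINGS[source]
    fields = {canonical: record[raw] for raw, canonical in mapping.items()}
    fields["power_kw"] = float(fields["power_kw"])
    return Asset(**fields)

# Plugging in a new source only requires adding its mapping entry.
print(to_canonical("bms_feed", {"deviceId": "A-17", "location": "DC-East", "kw": "12.5"}))
```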
With new data ingestion come new challenges around data quality. With real-time stream processing, Vertiv gives you a live dashboard on data quality and can fix data issues based on pre-defined business rules. With the recent addition of knowledge graphs and machine learning, our platform continuously learns new ways to fix data issues on the fly.
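To make the rule-driven fixing concrete, here is a minimal sketch of per-record quality rules applied to a stream, with issue counters of the kind that could feed a real-time quality dashboard. The rule names, thresholds, and fix actions are illustrative assumptions, not our production rule set.

```python
# A minimal sketch of rule-based quality checks applied per record in a stream.
# Rule names, thresholds, and fix actions are illustrative assumptions.
from typing import Callable, Optional

# Each rule: (description, predicate that flags a problem, optional fix)
QualityRule = tuple[str, Callable[[dict], bool], Optional[Callable[[dict], dict]]]

RULES: list[QualityRule] = [
    ("missing site",      lambda r: not r.get("site"),         lambda r: {**r, "site": "UNKNOWN"}),
    ("negative power",    lambda r: r.get("power_kw", 0) < 0,  lambda r: {**r, "power_kw": abs(r["power_kw"])}),
    ("implausible power", lambda r: r.get("power_kw", 0) > 1e4, None),  # flag only, no auto-fix
]

def apply_rules(record: dict, metrics: dict) -> dict:
    """Apply each rule; fix where possible, otherwise count the issue for the dashboard."""
    for name, failed, fix in RULES:
        if failed(record):
            metrics[name] = metrics.get(name, 0) + 1
            if fix is not None:
                record = fix(record)
    return record

metrics: dict = {}
stream = [{"asset_id": "A-17", "site": "", "power_kw": -3.2}]
cleaned = [apply_rules(r, metrics) for r in stream]
print(cleaned, metrics)  # the counters feed the real-time quality dashboard
```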
Some of the key questions we address before implementing a data ingestion process for a client:
What are the existing data sources? Are all data sources on-premise, in the cloud, or streams arriving in real time from external sources?
Is the data structured, unstructured, or semi-structured?
Are knowledge graphs already available for the data? Is there an existing ontology that can be applied to the entities?
What data volumes and throughput are expected?
What percentage of data will be processed through computations to generate analytical dashboards?
Does your organization want flexible deployment options: cloud, on-premise, or hybrid?
Based on the answers, we will build the right data ingestion framework and turn on the appropriate rules in our platform.
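As a rough illustration of that last step, the answers to the questions above might be captured as a declarative ingestion configuration like the sketch below. Every key, value, and source name here is a hypothetical assumption, not our platform's actual configuration schema.

```python
# A minimal sketch of an ingestion configuration derived from the assessment
# answers; all keys, values, and names are illustrative assumptions.
INGESTION_CONFIG = {
    "sources": [
        {"name": "bms_feed",   "kind": "stream", "format": "json", "location": "cloud"},
        {"name": "sensor_csv", "kind": "batch",  "format": "csv",  "location": "on_premise"},
    ],
    "canonical_model": {"ontology": "asset_ontology_v2", "entities": ["Asset", "Site"]},
    "throughput": {"peak_events_per_sec": 5000, "retention_days": 30},
    "quality_rules": ["missing site", "negative power", "implausible power"],
    "deployment": "hybrid",  # cloud, on_premise, or hybrid
}
```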