Preparing for big data without losing the legacy


“Organisations have this vision of what they want to do with data, which is increasingly to combine it with other external sources to get a bigger picture for their marketing, manufacturing, analysis and portfolio management, as well as risk, regulatory and pricing functions. Customer data is still the largest domain of what we see from a data quality perspective, but the objective is evolving from the traditional idea of the golden record and single customer view towards bringing in those other sources.”

So said Ed Wrazen, VP product management, big data, Trillium Software, in an interview with DataIQ shortly after the company’s customer conference in Stuttgart. Hosted by Porsche at its state-of-the-art headquarters and museum, it was a fitting venue to underline Trillium’s theme of continual investment and innovation in technology.

In February, the vendor launched Trillium Refine, a big data preparation solution that combines its deep heritage in data quality management with the new Hadoop and Spark-based environment in which data is being accessed, prepared, improved and linked, all wrapped in a data governance layer to support close monitoring of access and usage.

“A lot of those queries and the data sources they draw on are not structured, so they don’t fit the data model that exists within the data warehouse,” noted Wrazen. “There is also the increasing demand for complex data visualisation which is more dynamic and ad-hoc. So what we are seeing is the analytics and data science organisation wanting access to information in more complex, quicker and more dynamic ways than ever before.”

Decision makers are also increasingly looking to deploy in a more dynamic way using cloud, software-as-a-service or web-based solutions. Said Wrazen: “Different companies have different needs, but increasingly companies do not have the time to invest in a permanent infrastructure - they need an approach that can deliver what they need more flexibly, and that is easier to achieve through a cloud-based service.”

Trillium Refine was developed in part in response to feedback from the company’s quarterly BI/analyst user forum. Those users are struggling with data preparation issues in big data - access, consolidation, cleansing, integration - before they can get on to the work they want to do, such as segmentation or predictive modelling. With its launch last quarter, Trillium gives them the ability to see data sources, then search and pull data across within a common environment.

“Users told us they often can’t access data when they want it, they rely on IT for extracts. When they do get it, it might be missing key variables or not in the right format. So users have to spend a lot of time getting their data fit before they use it. So we are helping to put data into the hands of users who can use web-based tools to validate, cleanse and integrate it,” he said.

Trillium’s users can now pull in data from Twitter or Facebook straight to the Refine platform and parse it out into a format that can be used and analysed. That would be very difficult using conventional ETL tools. At the same time, the solution generates dashboard reports on these data flows which give an insight into compliance and governance.

“Chief data officers have been pushing for monitoring and control of access with privileges to view and use, as well as ensuring that users have complete and trusted sources,” explained Wrazen. This also helps to fix another problem that poor data quality or inconsistent integration can cause. “From the business side, they have lacked trust in data. When they get a report and compare it to an output from the data warehouse, they don’t correlate. We have seen that many times with self-service analytics - the numbers don’t stack up, they lack reliability and accuracy, so users lose trust.”

Trillium has partnerships with Qlik and Tableau to support data quality processes behind those data visualisation tools. Refine is also built to store, run and process all the integration and quality procedures on Hadoop environments like Cloudera and Hortonworks.

Despite responding to the demands of lines of business, analysts and business intelligence functions for better data preparation in big data environments, Wrazen believes the case for this new view may have been over-stated. “Big data has been slower than most analysts predicted. Organisations have huge investments in their IT and some are still not convinced of the benefit of moving off their existing infrastructure into a new environment like Hadoop. Often, it is because they do not have the hardcore knowledge of how to operate that in a large-scale system,” he said.

Any solution aimed at supporting fit-for-purpose data - whatever that purpose may be - needs to be flexible enough to cope with both the emerging big data sources and legacy systems. As Wrazen points out: “Companies have still got their traditional management information drawing from the data warehouse working to pre-defined data specifications. Those are run and supported by the IT department with ETL bringing in data feeds in a very static way.”

Trillium has responded to where the cutting edge of analytics and insight is now leading business, but it has a careful eye on its existing core business. Wrazen noted: “We see big data as just another platform. It has great potential, but not for everybody.”

DataIQ is a trading name of IQ Data Group Limited
10 York Road, London, SE1 7ND

We use cookies so we can provide you with the best online experience. By continuing to browse this site you are agreeing to our use of cookies. Click on the banner to find out more.

Cookie Settings