Informatica 9.1: Big Data

What Is It?

The term “Big Data” describes datasets that are too large to be handled with traditional data management tools. The boundary for what qualifies as Big Data is always moving as hardware and software capabilities change, so each organisation wishing to tackle its Big Data issues requires a specially tailored approach.

Where Does It Come From?

Big Data comes from anywhere that intensive data capture is being performed, be it servers generating access logs or atmospheric sensors taking constant measurements. Some of the most common sources of Big Data are output from scientific experiments, collations of internet-accessible materials, and archives of audio or visual data.

How Can We Handle It?

Traditional technologies, such as relational databases and single-threaded processing, can in some cases cope with Big Data through significant hardware upgrades, but this cannot be viewed as a long-term solution: the datasets are liable to grow more quickly than the hardware advances. The alternative is to use specialised systems, designed and built to handle Big Data, to augment or replace existing ones. This challenge comes in two parts: storing and providing access to the Big Data, and processing and integrating it.

Storage and Access

Although older technologies are perfectly capable of storing huge quantities of data, they often struggle when asked to provide access to just a small portion of a Big Data dataset, or to compare or correlate two Big Data datasets. There are many solutions to this issue, including distributed databases (Cassandra), massively parallel processing or MPP databases (Netezza), data virtualisation products (Informatica Data Services, Composite Information Server), and distributed file systems (Hadoop's HDFS). Many of these products can be layered if required, or used to augment existing systems rather than replacing them completely.
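
To illustrate why a distributed store can return a small portion of a large dataset quickly, here is a minimal sketch in Python. It assumes a simple hash-and-modulo scheme for choosing a node; this is an illustration only and not how Cassandra or any other product named above is implemented. The names ShardedStore and node_for_key are hypothetical.

```python
import hashlib


def node_for_key(key: str, node_count: int) -> int:
    """Pick a node index for a key using a stable hash (illustrative scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap to a node.
    return int.from_bytes(digest[:8], "big") % node_count


class ShardedStore:
    """Hypothetical in-memory stand-in for a cluster of storage nodes."""

    def __init__(self, node_count: int):
        # Each dict plays the role of one node's local storage.
        self.nodes = [dict() for _ in range(node_count)]

    def put(self, key: str, value):
        self.nodes[node_for_key(key, len(self.nodes))][key] = value

    def get(self, key: str):
        # Only the single node that owns the key is consulted.
        return self.nodes[node_for_key(key, len(self.nodes))].get(key)


if __name__ == "__main__":
    store = ShardedStore(node_count=4)
    store.put("sensor-17", 21.5)
    print(store.get("sensor-17"))
```

Because the key alone determines which node holds a record, a lookup touches one node rather than scanning the whole dataset. Real systems add replication and rebalancing, which this sketch leaves out.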

Processing and Integration

When it comes to processing Big Data, there are several companies offering solutions: IBM, EMC, Microsoft, Pentaho, Ab Initio, and of course Informatica. As a partner to Netezza, Symantec, and EMC; as the leading data integration software provider; and with provisions already in place for dealing with Big Data, Informatica is the first pick for this task. With push-down optimisation to minimise unnecessary data transfer, session partitioning to make the best use of available processing power, and the ability to spread processing across grids made up of multiple servers, Informatica is well equipped to handle Big Data.
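
The idea behind push-down optimisation can be sketched outside Informatica. The Python example below uses the standard sqlite3 module as a stand-in database and compares aggregating rows in the client with asking the database to aggregate them. It is a conceptual illustration under assumed names (the access_log table and both functions are hypothetical), not Informatica's implementation.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a small in-memory table standing in for a large source system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE access_log (server TEXT, bytes_sent INTEGER)")
    rows = [("web1", 100), ("web1", 250), ("web2", 400), ("web2", 50), ("web3", 75)]
    conn.executemany("INSERT INTO access_log VALUES (?, ?)", rows)
    return conn


def total_without_pushdown(conn):
    """Fetch every row and aggregate on the client side."""
    totals = {}
    rows_transferred = 0
    for server, bytes_sent in conn.execute("SELECT server, bytes_sent FROM access_log"):
        rows_transferred += 1
        totals[server] = totals.get(server, 0) + bytes_sent
    return totals, rows_transferred


def total_with_pushdown(conn):
    """Let the database aggregate, so only one row per server is returned."""
    rows = conn.execute(
        "SELECT server, SUM(bytes_sent) FROM access_log GROUP BY server"
    ).fetchall()
    return dict(rows), len(rows)


if __name__ == "__main__":
    conn = build_demo_db()
    print(total_without_pushdown(conn))
    print(total_with_pushdown(conn))
```

Both functions produce the same totals, but the pushed-down version returns one row per server instead of one row per log entry. On a Big Data source, that difference in transferred rows is where the saving comes from.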

Informatica:

  • See Informatica empowering the Data-Centric Enterprise.
  • Agile Solutions offer a range of Informatica services and advice 
    tailored to suit your needs.

Would you like to discuss your requirements?

Contact us today or call us on 0141 332 9785 for a no-obligation chat.