Classroom IBM


Available dates

Mon 22 Mar 21 to Tue 23 Mar 21


Tech Data ILO UK
Connection details will be communicated separately
Instructor Led


Mon 21 Jun 21 to Tue 22 Jun 21


Tech Data ILO UK
Connection details will be communicated separately
Instructor Led




The IBM InfoSphere Big Match on Hadoop course will introduce students to the Probabilistic Matching Engine (PME) and how it can be used to resolve and discover entities across multiple data sets in Hadoop.
Students will learn the basics of a PME algorithm including data model configuration, standardization, comparison and bucketing functions, weight generation, and threshold.
During the exercises, students will work on a large use case in which they apply their knowledge of Big Match to discover relationships between two data sets that can be used to understand the full view of the member data.
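
As a rough illustration of the concepts named above (standardization, bucketing, comparison, weights, and a threshold), the following minimal Python sketch links records in a small in-memory list. It is not Big Match code and does not use any Big Match or Hadoop API: the function names, the bucketing rule, the weights, and the threshold value are all assumptions made up for this example, whereas in the product the weights come from the Weight Generation process.

```python
from collections import defaultdict
from itertools import combinations


def standardize(record):
    """Lower-case values and collapse whitespace so equivalent values compare equal."""
    return {key: " ".join(str(value).lower().split()) for key, value in record.items()}


def bucket_key(record):
    """Hypothetical bucketing rule: first three letters of surname plus birth year."""
    return record["surname"][:3] + "|" + record["dob"][:4]


# Illustrative (agreement, disagreement) weights per attribute.
# These numbers are invented for the sketch, not generated from data.
WEIGHTS = {"surname": (4.0, -2.0), "given": (3.0, -1.5), "dob": (5.0, -3.0)}
THRESHOLD = 8.0  # assumed linking threshold


def score(a, b):
    """Sum the agreement or disagreement weight for each compared attribute."""
    total = 0.0
    for attr, (agree, disagree) in WEIGHTS.items():
        total += agree if a[attr] == b[attr] else disagree
    return total


def link(records):
    """Return index pairs of records that share a bucket and score at or above THRESHOLD."""
    std = [standardize(r) for r in records]
    buckets = defaultdict(list)
    for index, record in enumerate(std):
        buckets[bucket_key(record)].append(index)
    matches = []
    for members in buckets.values():
        # Only records in the same bucket are compared, which limits the pair count.
        for i, j in combinations(members, 2):
            if score(std[i], std[j]) >= THRESHOLD:
                matches.append((i, j))
    return matches


records = [
    {"surname": "Smith", "given": "Anna", "dob": "1980-02-01"},
    {"surname": " SMITH ", "given": "anna", "dob": "1980-02-01"},
    {"surname": "Smithson", "given": "Anne", "dob": "1980-07-15"},
    {"surname": "Jones", "given": "Anna", "dob": "1980-02-01"},
]

print(link(records))  # prints [(0, 1)]
```

Here the first three records fall into the same bucket, but only the first two agree on every attribute after standardization, so only that pair clears the threshold. The fourth record lands in a different bucket and is never compared with the others, which is the trade-off that bucket analysis is meant to examine.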


The course is designed for a technical audience that will be setting up a custom algorithm for the Probabilistic Matching Engine to use Big Match on Apache Hadoop to compare, match and/or search member records across multiple data sets.


This course has no pre-requisites.


  1. Understand the capabilities of the Probabilistic Matching Engine
  2. Understand how the Probabilistic Matching Engine is used with BigInsights to solve certain use cases.
  3. Understand the technical framework of the Big Match solution and how member data is derived, bucketed and compared to produce a complete entity from multiple data sets.
  4. Create a project and data model using the Big Match Console
  5. Configure the HBase tables that will be used in a Big Match solution
  6. Configure an algorithm using the Big Match console that includes Standardization, Comparison and Bucketing functions.
  7. Set up strings for anonymous values, equivalency values, frequency values, and character maps using the Big Match console
  8. Set up and run the Weight Generation process
  9. Evaluate and set thresholds for the algorithm
  10. Deploy a new algorithm to Big Match
  11. Evaluate entity results and reconfigure the algorithm based on the evaluation (e.g. large buckets, large entities, members not belonging to any bucket)

Course Outline

1. Introduction to Big Match for Apache Hadoop
- What is Big Match
- How Big Match Works
- Big Match Components
- Big Match Architecture
2. Big Match Data Model Definition
- Members
- Attribute Types
- Member Attributes
- Sources
- Information Sources
3. PME Algorithm
- Standardization
- Bucketing
- Comparison Functions
4. Bucket Analysis
- Bucket Optimization
- Bucket Concerns
5. Weights
- String Weights
- Numeric Weights
- Multi-dimensional Weights
- Troubleshooting Weights
6. HBase Tables
- HBase concepts
- Big Match commands
- Big Match Tables (.pmebktidx, .pmemdmidx, .pmeentidx)
- Best Practices
7. Big Match Applications
- PME Derive
- PME Compare
- PME Link
- PME Analysis


What do I need to bring with me to my public class?

All required learning materials and equipment are provided in the classroom.





When do public training course fees have to be paid?

For public training classes, payment must be received no later than three business days before the first day of class in order to confirm your seat and remain in the class. Failure to provide payment by this date may result in removal from the class and/or late cancellation fees. You can submit payment in the form of a Purchase Order or credit card.





On-site (private) Course Pricing:

To find out more about On-site training e-mail us at enablement@agilesolutions.co.uk or call one of our offices.





What is the cancellation policy?

Requests for cancellations or date transfers must be received at least ten (10) business days before the event start date in order to receive a full refund. If a cancellation or reschedule request is received fewer than ten (10) business days before the start date, a penalty of 100% of the course cost applies and no part of the fee is refunded. Refunds are not given for “no-shows” in our public training or IVA courses. This cancellation policy is strictly enforced.





What happens if Agile Solutions needs to cancel or reschedule a course?

Agile Solutions reserves the right to cancel events for any reason at any time. If Agile Solutions cancels the course, its cancellation liability is limited to the return of the course payment ONLY. Agile Solutions will not reimburse registrants for any other costs, including but not limited to travel cancellation fees or penalties such as airfare and hotel costs. PLEASE NOTE: If your registration status is either “Approved” or “Pending Payment”, you have not been confirmed for the class, and it is recommended that you do not make any travel arrangements until you have received a confirmation e-mail letting you know that the class and your registration are confirmed.





How will I know if my course has been rescheduled?

Agile Solutions reserves the right to reschedule or cancel a course due to low enrollment or if necessitated by other circumstances. Agile Solutions will contact you via e-mail or phone to inform you of the change of schedule. Once you have been notified you may reschedule or receive a full credit. Agile Solutions shall not be liable for any other costs including but not limited to any non-refundable travel arrangements if a course is rescheduled or cancelled.