I’m speaking today about Intuit’s Commercial Graph at the Strata + Hadoop World Conference. Slides: Commercial Graph: A Map of Financial Relationships (pptx format).
Imagine the social graph where personal relationships are replaced by commercial relationships based on real financial data. Imagine the possibilities for small businesses to grow, connect, transact and prosper.
Intuit is uniquely qualified to achieve just this. We are entrusted with the collective data of 50 million consumers and small businesses. It is a unique pool of data that covers the financial spectrum – ranging from individual purchase history to business inventories.
At Intuit, we are building the Commercial Graph with the consumer and small business data from products like Mint.com, Quicken, and QuickBooks.
We take millions of user-entered, and hence unstructured, business descriptions and billions of transactions, then apply Hadoop-based deduplication algorithms for normalization and machine learning for categorization. To better understand the graph, we compute metrics such as connected components, centrality, and commercial PageRank.
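To make the pipeline concrete, here is a minimal single-machine sketch in Python of the same three ideas: normalizing business names, building a graph from transactions, and computing connected components and a PageRank-style score. It is an illustration under stated assumptions, not the production system: the suffix list, the function names, and the choice to weight PageRank by transaction amount are all hypothetical, and the simple string cleanup stands in for the Hadoop-based deduplication described above.

```python
import re
from collections import defaultdict

# Hypothetical legal suffixes to drop; a real deduplication step would be far richer.
SUFFIXES = {"inc", "llc", "co", "corp", "ltd"}

def normalize_name(raw):
    """Lowercase, strip punctuation, and drop common legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", raw.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)

def build_graph(transactions):
    """Build {payer: {payee: total_amount}} from (payer, payee, amount) tuples."""
    graph = {}
    for payer, payee, amount in transactions:
        src, dst = normalize_name(payer), normalize_name(payee)
        graph.setdefault(src, {})
        graph.setdefault(dst, {})  # make sure pure payees are nodes too
        graph[src][dst] = graph[src].get(dst, 0.0) + amount
    return graph

def connected_components(graph):
    """Weakly connected components via union-find (edge direction ignored)."""
    parent = {node: node for node in graph}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for src, edges in graph.items():
        for dst in edges:
            parent[find(src)] = find(dst)

    groups = defaultdict(set)
    for node in graph:
        groups[find(node)].add(node)
    return list(groups.values())

def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank with edges weighted by transaction amount.

    Assumption: "commercial PageRank" is approximated here by letting each
    business pass rank to its payees in proportion to the money it paid them.
    """
    nodes = list(graph)
    n = len(nodes)
    out_total = {v: sum(graph[v].values()) for v in nodes}
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Nodes with no outgoing payments spread their rank evenly.
        dangling = sum(rank[v] for v in nodes if out_total[v] <= 0)
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            if out_total[src] <= 0:
                continue
            for dst, amount in graph[src].items():
                new_rank[dst] += damping * rank[src] * amount / out_total[src]
        rank = new_rank
    return rank
```

Calling `build_graph` on a list of `(payer, payee, amount)` tuples and passing the result to `connected_components` and `pagerank` gives per-business cluster membership and scores. At the scale described here, the same computations would run on a distributed graph platform rather than in a single process.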
We will examine several applications of the Commercial Graph, including finding more customers like your best customers, optimizing your vendors, and delivering relevant offers and recommendations that help our customers make and save money.
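As one possible illustration of "finding more customers like your best customers", the sketch below ranks other businesses by how much their set of counterparties overlaps with that of a chosen group of best customers, using Jaccard similarity. This is an assumed approach for illustration only; the talk does not specify the method, and the function names and the `neighbors` input format are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a ∩ b| / |a ∪ b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similar_businesses(neighbors, best, top_k=3):
    """Rank businesses outside `best` by counterparty overlap with `best`.

    neighbors: dict mapping business -> set of businesses it transacts with
    best:      set of businesses considered the "best customers"
    Returns up to top_k (business, score) pairs with a non-zero score.
    """
    # Combined counterparty profile of the best customers.
    profile = set().union(*(neighbors[b] for b in best))
    scored = [
        (name, jaccard(counterparties, profile))
        for name, counterparties in neighbors.items()
        if name not in best
    ]
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [(name, score) for name, score in scored if score > 0][:top_k]
```

A production recommender would likely combine many more signals, such as categories and transaction volumes, but the overlap score shows how graph neighborhoods alone can surface lookalike businesses.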
A deep dive into the technical architecture will cover the use of Giraph as a Hadoop-based, large-scale graph processing platform and Neo4j as a real-time graph datastore.