Source One's Data Scientist Explains Some of the Hurdles Encountered During the Development of www.SpendConsultant.com
Without a doubt, the launch of Spend Consultant has been exciting for Source One, but developing the unique Spend Analysis as a Service platform has not been without its fair share of challenges. In part one of our interview with Source One’s Senior Data Scientist and Lead Architect, James Patounas, we discussed his role in the project as well as major milestones in the development. In this second installment, Patounas explains some of the obstacles he and his team had to overcome to deliver a spend analysis tool that is built by procurement professionals for procurement professionals.
What were some of the hurdles that your team had to overcome?
In all honesty, the biggest hurdle in developing our Spend Analysis as a Service platform was not technological. It was developing the standard supplier taxonomy. We spent a lot of time addressing the nitty-gritty of spend classification to ensure the tool would (1) cleanse spend data quickly and (2) mirror how our spend analysis experts classify spend when preparing for a comprehensive go-to-market strategy. We wanted to make sure the taxonomy we developed was truly tailored to how the spend categories would be sourced.
Why not use an already developed taxonomy? Wouldn’t that have been easier?
There are quite a few popular industry-standard taxonomies. These include, but are not limited to, the SIC (Standard Industrial Classification) system, NAICS (North American Industry Classification System), UNSPSC (United Nations Standard Products and Services Code®), eCl@ss, eOTD (ECCMA Open Technical Dictionary), and RUS (Requisite Unified Schema). While they each have their advantages and disadvantages, which I won’t get into here, I will say that our team has found that these systems generally do not suit our purposes, because they are either too granular or do not approach classification in a manner that effectively facilitates procurement.
So, how did you go about addressing this hurdle?
After we decided to develop our own taxonomy, I aggregated and cleansed all the categorizations that we had used in spend analyses since 1992. We underwent an interactive process that involved extensive internal discussions with folks on our team who have been conducting spend analyses for years, as well as category subject matter experts. We also had frequent conversations with external industry experts. Through all of that, we were able to identify what we felt was the most logical structure.
Of course, as is often the case with theoretical frameworks when they are applied to the real world, we found that our initial classification schema was far from ideal once we tested it against existing data. In particular, there were too many companies that caused exceptions within our decision tree at both the automated level and the manual level. Over an extended period of time, we continued to refine the tree until we ultimately arrived at the one we have today, one that we are very proud of.
What made this process unique?
A lot of the elements included in Spend Consultant originated simply because we liked the idea of them. Unfortunately, because one of our stated goals was that everything should be verifiable and repeatable, we eventually had to define the concepts behind them. For instance, we have a metric included in our assessments called “Ease of Implementation”. Defining Ease of Implementation as a standardized notion turned out to be very difficult. One consideration that goes into quantifying the Ease of Implementation is the complexity of a project. Ironically, coming up with a definition for the complexity of something is actually quite complex.
In the next installment of our interview series with Source One’s Senior Data Scientist, we’ll review some of the key differentiators of Spend Consultant.