<question>
I'd like to hear your opinion on the subject of metadata management in the context of production of SOA artifacts.
It is clear that defining upfront a process that becomes executable removes the burden of decomposing it into messages and behaviours to be allocated to the existing IT silos, which would imply a strong risk of losing track of the process itself. That is the mission of BPMN when it is translated to an executable representation such as BPEL or XPDL.
This approach allows us to progressively refine the process's activities until we get to a level of detail at which activities can be mapped to simple logic to be performed by the BPM engine itself or to calls to specific external services.
But, what happens to data?
Data exists in the form of XML schemas or database table relationships. All of this is very low level and comes into play at the very end of the development process. In a huge enterprise, I guess, some sort of common data model abstraction should be created so that the early versions of the BPMN diagram refer to data at the correct level of abstraction (e.g. the notion of "Customer"), to be refined afterwards into specific domains and data formats (a customer of type prepaid, identified by an MSISDN of format xxx/xxxxxxx, stored in application XYZ as a row in a specific db, to be exposed as a data service).
How would you address the need to store this vertical relationship of data concepts (from business data, to specific domain data, down to application instance data schemas) in a way that allows the business process modeler to refer to the customer and the domain analyst to narrow down on the specific type of customer and the developer to map that to the correct schema?
Where would you put this sort of relationship and shouldn't it be integrated with the development tools somehow?
I hope I have made myself clear.
Regards.
</question>
Good and huge question!
Your general approach is correct. You are also right that there is a vertical relationship (business objects, business data and application data) and not every BPM tool provides good tools for handling such a relationship.
Please, have a look at the illustration below -- this is a multi-layer model from my book.
http://www.improving-bpm-systems.com/book/2-the-architectural-framework-for-improving-bpm/approach-framework.png/image_view_fullscreen
Each layer is a level of abstraction of the business and addresses some particular concerns. Let us discuss the bottom four layers here.
• The business execution layer carries out the business processes. Any process comprises one or more business activities, which are a mixture of human and automated activities. For example, the following is a sequence of three activities on a product: sign off the product (obviously this is carried out by a human), put it into a web-store and announce its availability. At this level of abstraction the business is a systematic set of coordinated activities. We can say that this is the layer where unit managers and super-users work. This layer is usually expressed in BPMN.
• The business routines (or regulations) layer comprises the actions which must be carried out on the business objects to perform the business activities. For example, to announce the availability of a product one has to find out which clients are interested in the product, to collect their contact e-mails and to distribute an announcement. At this level of abstraction the business comprises some modifications (including the adding of value) to the objects. Most enterprise employees work in this layer.
• The business objects layer comprises the many objects specific to a particular business, e.g. a business partner, a product, etc. This layer hides the complexity of manipulating the objects, which are actually collections of data together with any dependencies between them. The level of abstraction is increased: the business is represented by the objects, irrespective of the repositories in which their data is stored.
• The business data layer comprises many pieces of information — names, dates, files, etc., which are stored in existing repositories, e.g. databases, document management systems, Web portals, directories, e-mail servers, etc. This layer's role is to access data. In this layer the business is considered in a very primitive way.
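To make the four layers a bit more tangible, here is a minimal Python sketch of the product announcement example. All names and data in it (CLIENT_ROWS, Client, load_clients, announce_availability, release_product) are made up for illustration; it only shows how each layer builds on the one below and is not how any particular BPM tool does it.

```python
from dataclasses import dataclass, field

# Business data layer: raw records as they might sit in a repository
# (hypothetical sample data).
CLIENT_ROWS = [
    {"id": 1, "email": "ann@example.com", "interests": "books,music"},
    {"id": 2, "email": "bob@example.com", "interests": "garden"},
]


# Business objects layer: a repository-independent view of a client.
@dataclass
class Client:
    client_id: int
    email: str
    interests: set = field(default_factory=set)


def load_clients():
    """Hide how client data is stored behind business objects."""
    return [
        Client(row["id"], row["email"], set(row["interests"].split(",")))
        for row in CLIENT_ROWS
    ]


# Business routines layer: actions carried out on business objects.
def announce_availability(category):
    """Find interested clients and collect their contact e-mails."""
    recipients = [c.email for c in load_clients() if category in c.interests]
    # A real routine would now hand the list to a mail service.
    return recipients


# Business execution layer: the process as a coordinated set of activities.
def release_product(category, signed_off_by):
    log = {"signed_off_by": signed_off_by}                  # human activity
    log["in_web_store"] = True                              # automated activity
    log["announced_to"] = announce_availability(category)   # automated activity
    return log
```

Note that release_product knows nothing about rows or e-mail columns; it only sees activities, which is the point of the layering.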
The business objects are expressed in XSD in a way that is convenient for _transporting_ information between layers (actually, between the services that present the layers), mainly between the business execution and business routines layers. You will need to define such XSDs to be independent of any repository or application. If you have an enterprise data dictionary then it is a good place to start, but keep the transportation of information in mind.
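As a rough illustration of what "repository independent" means for transport, the following Python sketch serialises a business-level Customer to XML and back. The element names (Customer, Id, Name, Type) and the class itself are assumptions of mine, not a real schema; in practice the shape would be governed by your XSD.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Customer:
    """Business-level customer: no table names, no application-specific keys."""
    customer_id: str
    name: str
    customer_type: str  # e.g. "prepaid"


def to_xml(customer):
    """Serialise the business object for transport between services."""
    root = ET.Element("Customer")
    ET.SubElement(root, "Id").text = customer.customer_id
    ET.SubElement(root, "Name").text = customer.name
    ET.SubElement(root, "Type").text = customer.customer_type
    return ET.tostring(root, encoding="unicode")


def from_xml(payload):
    """Rebuild the business object on the receiving side."""
    root = ET.fromstring(payload)
    return Customer(
        root.findtext("Id"), root.findtext("Name"), root.findtext("Type")
    )
```

Nothing in the payload reveals whether the customer lives in a CRM database or a billing system, so the process layer can pass it around without knowing either.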
The business data layer is usually very straightforward and is based on the existing API of a particular repository. Usually that API follows the CRUD pattern.
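For comparison, a plain CRUD wrapper over one repository could look like the sketch below. It uses an in-memory SQLite table; the table, its columns and the class name CustomerTable are hypothetical and only stand in for whatever API your repository already offers.

```python
import sqlite3


class CustomerTable:
    """CRUD access to one repository; knows rows and columns, nothing more."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE customer (msisdn TEXT PRIMARY KEY, name TEXT)"
        )

    def create(self, msisdn, name):
        self.db.execute("INSERT INTO customer VALUES (?, ?)", (msisdn, name))

    def read(self, msisdn):
        row = self.db.execute(
            "SELECT msisdn, name FROM customer WHERE msisdn = ?", (msisdn,)
        ).fetchone()
        return None if row is None else {"msisdn": row[0], "name": row[1]}

    def update(self, msisdn, name):
        self.db.execute(
            "UPDATE customer SET name = ? WHERE msisdn = ?", (name, msisdn)
        )

    def delete(self, msisdn):
        self.db.execute("DELETE FROM customer WHERE msisdn = ?", (msisdn,))
```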
Interfaces for the business objects layer should be more comprehensive than CRUD; for example, some business objects may be versionable, and the handling of many versions should be exposed. A particular business object may be composed from business data coming from different repositories. Also, a particular business object may be stored in several domains (or repositories); for example, one domain (repository) is the master and the others are just copies. Sometimes I create explicit business sub-objects, e.g. Customer in CRM (master), Customer in Billing (copy), etc. There are also issues like the following: a modification of a data element has to start a process, e.g. changing a customer’s address may lead to changing his/her insurance contract.
In general, the business objects layer is a “virtual data hub” or “master data management facility”, but with an interface richer than just CRUD.
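A very small Python sketch of such a richer interface is given here. Everything in it is an assumption for illustration: the class CustomerService, the two dictionaries standing in for the CRM (master) and Billing (copy) repositories, and the on_address_change hook that stands in for starting a process.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerService:
    crm: dict = field(default_factory=dict)       # master: id -> name, address
    billing: dict = field(default_factory=dict)   # copy: id -> tariff, address
    history: dict = field(default_factory=dict)   # id -> earlier versions
    on_address_change: list = field(default_factory=list)  # process starters

    def get(self, cid):
        """Compose one business object from two repositories."""
        tariff = self.billing.get(cid, {}).get("tariff")
        return {**self.crm[cid], "tariff": tariff}

    def change_address(self, cid, new_address):
        # Keep the previous version before touching the master.
        self.history.setdefault(cid, []).append(dict(self.crm[cid]))
        self.crm[cid]["address"] = new_address
        # Propagate the change to the copy.
        self.billing.setdefault(cid, {})["address"] = new_address
        # A data change may have to start a business process.
        for start_process in self.on_address_change:
            start_process(cid, new_address)

    def versions(self, cid):
        """All known versions of the master record, oldest first."""
        return self.history.get(cid, []) + [dict(self.crm[cid])]
```

A caller would register something like a "review insurance contract" starter in on_address_change; the business objects layer itself does not need to know what that process does.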
Hope this helps.
Thanks,
AS