Here we keep the records of what we will do with Terraphim and Atomic Server.
Webinar “Personalized Semantic Search using the Digital Systems Engineering Process Model” June 28, 2023
Webinar abstract
First, we describe the problem with staffing an engineering project that we can solve by using the digital systems engineering process model as a semantic core for the systems engineering body of knowledge. Then, we propose an improved human resource process where we actively leverage existing systems engineering knowledge assets. Next, we extend this idea to the broader context of searching for the right people, not limiting ourselves to hiring and looking for contractors. We introduce the concept of Pareto-best work-to-skills matching that helps us navigate multiple skills catalogs and unify real-world evidence of candidates’ experience to support their resume claims. Finally, we explain how Terraphim.AI, a privacy-first AI assistant, implements all of the above.
Part 1. The problem with staffing an engineering project
Overview of the problem with staffing an engineering project
The introduction of the digital system engineering process model proposed by the German chapter of INCOSE.
The authors believe that this model could be used as a semantic core for the system engineering body of knowledge.
The idea of building a search and personalization engine based on this semantic core.
The focus is on searching for and identifying people with system engineering skills.
The contrast between the proposed simple process of matching plans with people and the more complex process outlined in the Human Resource Management process of the handbook.
The potential for streamlining human resource management processes for organizations like INCOSE and engineering companies.
The emphasis on privacy and security in the development of an AI assistant.
The authors invite collaboration in building the product to solve these problems.
A few months back, the German chapter of INCOSE presented the digital systems engineering process model. They proposed a specific use case for it, but it was the most apparent use case, not the most useful one. They suggested the model as a device that helps you navigate the Systems Engineering Handbook instead of inconveniently looking for the necessary information in a paperback version. But when I was looking at it, I could not help but think that this is a semantic core for the systems engineering body of knowledge. And what can you do with a semantic core? You can build a search and personalization engine. We at Applied Knowledge Systems decided to build that engine. The only question was what this search and recommendation engine should look like and what we are looking for. That, too, was obvious: we need to look for systems engineering skills and for people who have these skills.
How does one do it? To me, it looks like this. For example, we have a systems engineering management plan on the input. It’s a part of project planning. And once we have a plan, we can look for people with the skills we need to execute this plan. It should be as simple as that – plans on the input, candidates on the output. The reality, though, according to the process model presented in the Human Resource Management process of the handbook, has a cumbersome, very tricky, and very time-consuming structure. It starts with project planning, which results in a Project Portfolio and Project Human Resources Needs. Then, we identify, develop, acquire, and provide skills. On the output, we have a human resource management plan, qualified personnel, a human resource management report, and a human resource management record. Nothing that looks like the process I have in mind. It’s time-consuming, and we can eliminate much waste in this area thanks to the systems engineering process model and the role-based search capability we built.
If we are successful, INCOSE as an organization, Indra, or other engineering companies that have systems engineering as part of their operations can streamline their human resource management process. That will not require getting tons of approvals, building connectors, and changing information security profiles, because we built a privacy-first AI assistant that does not send your data outside the already approved network or your workstation. I will explain how it works and show you the prototype. My ask from you is simple. So far, we have done all the difficult jobs of building the platform and the data processing pipeline. We can now create a product that solves your problems. It’s a fun task you can all participate in.
Part 2. How searching for skills should look
Overview of how searching for skills should look.
The distinction between how CEOs and project managers search for information and the limitations of existing search engines in supporting these distinct roles.
The proposal of a scenario where a machine-generated project description leads to the identification of required skills for a task.
Refining and specifying skills using the WAND taxonomy to arrive at a concrete project proposal with specific skills requirements.
Creating a machine-generated resume for potential candidates based on their work products and activities.
The benefits of using generative AI and semantic search to find better candidates and eliminate over-secured searches for overqualified personnel.
The ability to verify each statement in a candidate’s resume with evidence from GitHub, Notion, and other sources and using controlling questions to assess the validity of claims.
The explanation of semantic search as a process of looking for confirmed skills and organizing them into taxonomies.
I want to thank the Spanish chapter of INCOSE, specifically Anabel Fraga, who helped me organize this webinar. I also want to thank Alexander Mikhalev, the CEO and developer of Applied Knowledge Systems, who spent a lot of time building the Terraphim search engine. I am product manager, head of product, and project manager in this small company all at once, as it usually goes in start-ups. But in corporations, we also execute several roles related to this webinar’s subject.
When you search for something, you do it differently from the perspective of a CEO or a project manager. If I am a project manager preparing an outline project plan and looking for some topic, I need a clear and concise explanation that I can use to formulate project tasks that will go well with the project management office. If I am a CEO using the exact same keywords in the query, for example, “Forms of contracting for product companies,” I need practical guidelines and actual contract templates, preferably with comments from a lawyer for a specific jurisdiction. But search engines do not support that role distinction; maybe only Perplexity provides a decent experience. But even with it, you still need to think about how you will search as a CEO or as a project manager, and it only supports follow-up queries when you already know what you are looking for and why.
Imagine such a scenario. We have a project plan and a system management plan. I’ll take the actual task description from the Notion workspace of our Innovate UK project, where we have forecasted time sheets, registries, reports, deliverables, and milestones. I put this task description to my assistant. It investigates the system engineering digital process model and hints that, okay, I need project management skills to accomplish that task.
It gives me a description of this skill, so I understand what exactly stands behind it. This is a machine-generated project description of that task. It is not mine, and it’s good. Okay, I read it, and it seems legit. So this is a proposal to collaborate generated from the task, and I respond: okay, I approve this skill description for my project. What should be the next step? You should add some supporting skills descriptions, because project management doesn’t go alone. Project management skills also require risk management, stakeholder management, communication, and problem-solving skills. Again, these are all skills proposed by the machine based on that proposal to collaborate. It generated them, and I said, “Okay, can we be more specific?” And it goes to the curated WAND taxonomies – the WAND company has hundreds of them – and elaborates on the skills descriptions using these taxonomies. We have a project management taxonomy containing more than 1000 terms. I ask myself – what areas of project management am I most interested in to manage this task? It’s not completion, it’s not documents and records, it’s not execution – it’s planning and design. I selected it and realized I don’t need design; I need planning and methodology. Specifically, I need the critical chain method. Using the WAND taxonomy, I substituted abstract and broad project management skills with narrow and specific “critical chain” skills.
The assistant does the same with risk quantification, stakeholder analysis, and communication management plans. So now, from a generic task description that is just three lines of text, I have a pretty concrete project proposal with tangible skills. I want a candidate to have all of them. And I say: okay, find me a person that can be a good fit. It goes into my contact list, creates a new page in my Notion workspace for this project, sends me the link, and I see this resume. Again, this is a machine-generated resume, and it is pretty good. I look at the timestamp of the resume and understand that it was generated one minute ago. The person whose skills are described in this resume has spent not a minute – no time at all, zero effort – to write it specifically for this particular task proposal. The machine generated it, and it contains actual information, because this person has many personal GitHub commits, a lot of Notion documents, and Discord messages. We have a lot of information about this person, and the machine just fetched it and generated a resume from the original work products the candidate has created.
And what are the beauty and benefits of generating proposals to collaborate and matching resumes from actual work products, instead of the standard human resources process I described above, with a project portfolio and human resource needs on its inputs and acquired skills and qualified personnel on its outputs? Such an approach based on generative AI and semantic search finds better candidates for the job and removes the over-secured search for usually overqualified personnel. It also supports each statement in the resume with evidence: the resume is not just lines of text. Every sentence is a link, and I can verify the evidence in GitHub or Notion to check how valid each statement in the resume is. Is it a real thing or not? If unsure, I can ask, “Please generate a controlling question for this statement.” So if the person says that she made realistic project plans, the question would be, “How did you make sure your project plan is realistic?” That’s a legitimate question. Or you can ask, “How did you attend to the project’s performance baseline?” The project performance baseline should be maintained, so that is how it works. That’s how I can arrange the assessment of candidates.
But what is semantic search? How is all of the above related to the topic of the webinar? As you can see, we are not looking for strings or for resumes matching keywords. We are looking for confirmed skills the candidates have obtained. And when I am looking for skills, I am looking for things. And that is the definition of semantic search. What’s important here is that when you have things as the result of your search, you can organize them into taxonomies.
Part 3. Context of searching for skills
Overview of the context of searching for skills
The explanation of taxonomies and their use in enhancing the search experience draws parallels with online shopping websites like Amazon.
The observation is that current talent search platforms, like LinkedIn, often lack structured taxonomies for skills, making skill-based searches challenging.
The proposal is to use a process model, such as the systems engineering digital process model, as a proxy for skills taxonomy to improve the relevance of skills searches.
The importance of efficient skill searching in various organizational contexts, including team formation, project staffing, stakeholder engagement, and customer meetings.
The challenges of quickly finding the right people with the required skills for numerous tasks in complex projects, especially when dealing with novel technologies or high personnel turnover.
The cost and time implications of traditional skill searching methods, such as meetings and new hires, and the need for a more efficient approach.
The mention of a proposed experiment involving LinkedIn profiles to measure the difficulty of finding the right people based on skills and experience.
The introduction of Applied Knowledge Systems and its aim to develop a privacy-first personal AI assistant that helps users find context-specific information and recommendations.
The concern over the degradation of search quality and loss of privacy in existing search solutions and the motivation to build a more controlled and tailored AI assistant, Terraphim.
Acknowledging the limitations of mass-market solutions offered by big companies and the need for industry-specific, customized solutions that efficiently address niche market needs.
That’s quite an unusual turn, one can say. What’s a taxonomy? If you go to the Amazon website, you see them in the left sidebar. If you are looking for shoes, you will see shoe taxonomies on the left side, be that a taxonomy of shoe colors and sizes, brands, prices, designs, or purposes. It’s all organized in hierarchical structures. Such structures are taxonomies. They significantly improve the search experience, as we know from online stores.
Do we use taxonomies when we search for talent? Not very much. Take LinkedIn, for example, where skills appear in a plain list; you scroll them down, and no structure helps you navigate those skills. This is a consequence of not grounding skill descriptions in the real work of real people. On LinkedIn, you search for strings, which are very difficult to classify and put a taxonomy structure on. The SFIA framework or professional standards like the systems engineering competency framework provide more guidance, but real professional profiles usually have more specific names for skills that are not always aligned with the taxonomies described in the standard. And anyway, a significant share of companies use plain lists to manage their skills. But we have a promising workaround for this gap – we can use the process model as a proxy for skills.
Systems engineering has a process model that can be used as a proxy for the skills taxonomy, and we have this model in machine-readable form. We built a demonstration that uses the systems engineering digital process model to augment the systems engineering skills search. The model contains four groups of processes: technical, technical management, organizational project-enabling, and agreement processes. Each process has activities, and obviously, each activity and process requires relevant skills. It is a hierarchical structure that aligns very well with the systems engineering competency model as a reference catalog of professional skills. As I stated in the beginning, all the terms from this model comprise the semantic core of the systems engineering domain. That’s a sound semantic model that can augment searching for skills. Let’s return to the use case I started this talk with.
The idea of semantic search is straightforward. When you do a semantic search, the sequence of words makes a big difference. The “car factory” query produces very different results than the “factory car” query. In semantic search, we look for things, not for strings. And in systems engineering, we have many such term sequences with precise meanings, so we can search for many things relevant to this domain.
If we use a systems engineering management plan as an input to the semantic search for systems engineering skills, then such a taxonomy could prove helpful, increasing the skills search relevance. The task of searching for skills is often overlooked. For example, the famous teambuilding framework introduces five stages in the team life cycle – forming, storming, norming, performing, and adjourning. But before creating any team, you must discover who should be in it. You need to be able to search for relevant skills, generate proposals to collaborate, and, finally, onboard team members. The project manager, the chief systems engineer, or the technical leader should bring them together first. Also, there is more to staffing an organization, as one should bring together not only team members but also stakeholders.
This task can prove difficult: once the project proposal gets approved, the timeline to arrange everything, find the right people, build up connections with stakeholders, and establish proper organizational interfaces is usually tight. When they give you the money, they want results fast. That’s the way it works. And people are not always at our immediate disposal, so you need to find them quickly. The average project has hundreds of tasks, each requiring several skills, so we are talking about hundreds of skills necessary to complete the project. A qualified system developer or technical leader has over a hundred skills, with over a dozen active and 60 to 80 passive skills.
It’s impossible to know what everyone on your team does and to find the best match quickly for all the tasks, issues, risk containment plans, audits, and gate reviews, especially for projects using novel technologies or during high personnel churn. If you look into a typical task tracker, you will find a significant portion (from my experience, I’d say at least 15%) of tasks and issues without a responsible person assigned. You will also find that many duties are given to the same highly qualified engineers, leaving them overloaded and making them the bottleneck, while some others sometimes have little to do in the downstream development processes, especially at the beginning of the project. Finding proper and economically viable skills in an engineering organization is not a trivial task. We often tend to lean on a few highly qualified and well-compensated people.
But the problem extends far beyond just staffing the project. What if we need to find the right participants for the customer meeting? Or what if we hire a new system developer and are required to gather the interview panel and define the onboarding sequence? How do we find the people who are best fit for this job and have enough time? Because the costs of meetings and new hires with traditional means are staggering and are often well over a few thousand dollars. Is the agenda for such a meeting aligned with the competence profiles of the participants? Are they competent to make the decision? One of my favorite examples is responding to a request for a proposal or starting a new project. We usually have a very tight response timeline when we get an RFP. At best, it is a couple of weeks. That means you must find all the people you need to prepare the response in two days.
To quantify the problem of skill search, I propose you perform a small experiment with your LinkedIn profile. LinkedIn has a feature for measuring your social selling index, the SSI. This index has four components. If you are close to an average user, the lowest-value components in your profile would be “Find the right people” and “Engage with insights.” It is the hardest thing for all of us. In my professional network, there are more than a thousand people, most of whom I rarely communicate with, and I would appreciate assistance in learning their skills and experience better when I contact them.
That is precisely what we are building in Applied Knowledge Systems – an action-centric, privacy-first personal AI assistant that can fetch you proper context depending on your actions. You can feed it a project plan from your Notion, a product roadmap from GitHub, or a meeting invitation in Gmail, and it will generate your messages with invitations to collaborate and recommend the right people from your contacts list. Such a solution does not exist right now.
Also, we all have been facing search quality degradation for the last few years, and the loss of privacy is of high concern to us. That’s why we are building Terraphim as an AI assistant that is entirely under your control. And here, we cannot rely on the companies that have ignored privacy and degrading search quality for years to deliver a proper solution. Also, big companies usually build mass-market solutions that do not account for industry-specific needs and aspects. This is because niche market segments do not justify their development costs, and the users will never provide the company-specific data required to customize the solution. All tailoring that big companies can offer us comes in the form of prompt engineering, which, more often than not, is not sufficient and too time-consuming for customization.
Part 4. Pareto-best work-to-skills matching
Overview of Pareto-best work-to-skills matching.
The problem of mismatched work-to-skills matching in the workplace and its impact on job satisfaction, as highlighted by Gallup reports.
The concept of Pareto-best competence profiles, where candidates may excel in some critical skills while having secondary skills, and the need to optimize the search process to match as many relevant skills as possible.
The limitations of traditional search technology in ranking skills and candidates based on their proficiency in specific skills and how Terraphim can address these limitations.
The importance of high-quality data in engineering projects and its direct correlation with the effectiveness of chatbots and AI assistants in answering questions and providing relevant information.
The flexibility and ease of incremental training and language development with Terraphim allow users to customize their interactions with the AI assistant without requiring lengthy implementations or complex query languages.
Mapping skills descriptions to widely accepted standards, such as the SFIA framework and WAND taxonomies, enhances skill profile interoperability and provides a controlled vocabulary for AI interactions.
The ability to generate formal activity-centric models from standardized skill frameworks makes it easier for AI assistants to ask interview questions and match skills with user responses.
Using controlled semantics and vocabulary in skill standards ensures accurate tagging and categorization of skills within formal models, improving the reliability of AI-driven skill assessments and searches.
What can we expect to see if we successfully resolve the problem? According to Gallup reports, many people are miserable at work because they do not use their abilities to the full extent or are overloaded – precisely two things that share the same root cause: a biased work-to-skills matching process. The workplace dissatisfaction numbers are staggering. If we can match work to skills better, we will see an improvement in workplace satisfaction, as the most occupied professionals will be less busy, and people who are not using their talents to the full extent will apply more of their skills.
As I mentioned above, a task or a set of tasks usually requires several skills, so we are looking for multiple skills matches, and most candidates cannot be equally good at all of them, which still should be fine in most cases. For example, I am pretty good at requirements engineering and systems architecture, though my major is lifecycle and configuration management. Still, I can be a good fit for some tasks where my level of requirements engineering is sufficient, though many people know it much better than me; however, they do not know configuration management, which is also required to complete the job. Such competence profiles are called “Pareto-best.” We optimize the search to match as many skills as possible, giving the essential skills more weight than the secondary ones. If you use traditional search, such an approach is impossible. All keywords in the query have the same importance, or you can mark some keywords as mandatory with query syntax while the search engine can completely omit the others, reducing the search relevance. You cannot find and properly rank people with requirements engineering, systems architecture, and configuration management skills while showing those with more competence in configuration management first and filtering out those who do not know requirements engineering. The search query syntax doesn’t allow that. You cannot implement a Pareto-best skills search with traditional technology, but you can do that well in Terraphim.
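A minimal sketch of how such weighted, Pareto-style ranking could work (an illustrative assumption, not Terraphim’s actual ranking code): essential skills carry more weight, and candidates missing a mandatory skill are filtered out rather than merely down-ranked.

```rust
use std::collections::HashMap;

/// A required skill with a weight; essential skills weigh more than secondary ones.
struct Requirement {
    skill: &'static str,
    weight: f64,
    mandatory: bool,
}

/// Score a candidate's skill levels (0.0..=1.0) against the requirements.
/// Returns None when a mandatory skill is missing, filtering the candidate out.
fn pareto_score(levels: &HashMap<&str, f64>, reqs: &[Requirement]) -> Option<f64> {
    let mut score = 0.0;
    for r in reqs {
        match levels.get(r.skill) {
            Some(level) => score += r.weight * level,
            None if r.mandatory => return None, // hard filter, not just a lower rank
            None => {}
        }
    }
    Some(score)
}

fn main() {
    let reqs = [
        Requirement { skill: "configuration management", weight: 3.0, mandatory: true },
        Requirement { skill: "requirements engineering", weight: 2.0, mandatory: true },
        Requirement { skill: "systems architecture", weight: 1.0, mandatory: false },
    ];
    // A candidate strong in configuration management but average elsewhere.
    let candidate = HashMap::from([
        ("configuration management", 0.9),
        ("requirements engineering", 0.6),
        ("systems architecture", 0.7),
    ]);
    println!("{:?}", pareto_score(&candidate, &reqs));
}
```

Candidates are then sorted by this score, so someone strong in configuration management surfaces first even when others know requirements engineering better.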
The textbook quality of data in engineering projects makes our approach much more attractive for engineering companies. There is a direct relationship between dataset quality and the quality of chatbots. You can check how well ChatGPT answers systems engineering questions and compare that with its performance in the marketing or product management fields. Answering systems engineering questions will be much more reliable because the training data are so good. The only problem is that these data are distributed across multiple sources, and we need either federated search or robust, reliable, and private interoperability between data sources.
Another aspect of the proposed solution is improving the search results’ relevance in small increments. If you have 15 minutes, you can train Terraphim to understand you better. You don’t need a month-long implementation to build the next increment. You develop the language in which you talk to your AI assistant instead of studying proper prompt engineering, as you need to do with commercial LLMs, which use their own dialect and will not understand your terms. This is just how they are built. Such lengthy implementations and complicated query languages are not aligned with our workplace reality.
Also, you want to exchange information about skill profiles. To do that, you need to map skills descriptions to some widely accepted standards. In this demonstration, we use the SFIA framework (the global skills and competency framework for a digital world) and a few WAND taxonomies (project management, for example) to enrich the original systems engineering process model with more specifics and provide interoperability. The SFIA framework works miracles with chatbots. Try Bing Chat or ChatGPT and ask them about your job experience. For example, copy and paste a paragraph from your resume and ask what SFIA skills you have based on the text, and you will get a perfect starting point.
As all these standards for skills use controlled semantics and vocabulary, it is straightforward to produce formal activity-centric models from them with different tools. We apply the Apollo 4D activity modeler because it implements the best practices introduced and developed by Matthew West, and chatbots can easily fetch the resulting models. Of course, you can use machine-readable versions of the Object Management Group standards or WAND taxonomies without spending any time on serializing text descriptions of skills into proper JSON. Once you build a formal model for some skill, for example, for SFIA Project Management, Level 4, shown in the illustration, you can ingest it into the AI assistant, and it can generate dozens of interview questions on project management. For example, it can ask a user, “What Project Management Tools did you use to Manage Risks when you were working at Company X in 2019?” It takes the formal skill model and the user’s resume and generates a proper question, tagging it with domain terms – Project Management (SFIA skill), Project Management Tools (SFIA concept), and Manage Risks (SFIA concept). After it gets the user’s response, it transcribes it and tags it with the same domain terms. So, when you search for these keywords, you will get a match for confirmed skills correctly placed under the SFIA skills taxonomy.
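As an illustration of the tagging step, matching a transcribed answer against a controlled vocabulary can be as simple as the sketch below. This is a naive, case-insensitive substring matcher standing in for Terraphim’s real tagger; the function name and the sample vocabulary are hypothetical, and a real tagger would also handle stemming and synonyms.

```rust
/// Tag a free-text answer with the controlled-vocabulary terms it mentions.
/// Naive stand-in for a real tagger: plain case-insensitive substring match.
fn tag_response(text: &str, vocabulary: &[&str]) -> Vec<String> {
    let lower = text.to_lowercase();
    vocabulary
        .iter()
        .filter(|term| lower.contains(&term.to_lowercase()))
        .map(|term| term.to_string())
        .collect()
}

fn main() {
    let sfia_terms = ["Project Management Tools", "Manage Risks", "Budgeting"];
    let answer = "We used project management tools such as burn-down charts to manage risks.";
    // Prints the SFIA concepts confirmed by this answer.
    println!("{:?}", tag_response(answer, &sfia_terms));
}
```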
Part 5. Role-based semantic skills search
Overview of role-based semantic skills search.
Using a semantic model that combines formal structures, taxonomies, and ontologies with natural language responses from subject matter experts to enhance search relevance.
The tagging of content with formal terms from various taxonomies and controlled vocabularies to improve search relevance and discoverability without significant effort.
The challenge of mixed relevancy ranking is due to the use of different taxonomies and preferences by individuals in various roles within a project.
Implementing a role-based search mechanism in the Terraphim AI assistant to cater to different job roles and provide varying recommendations based on roles.
The explanation of the Terraphim role includes distinct data sources, hashtags, objects, role-specific activities, preferences in prompts, and search results history for each position.
The importance of fine-tuning the assistant relevancy function by allowing users to approve or ignore search results, providing a more customized experience.
The practical applications of Terraphim in streamlining project staffing by automating processes using controlled dictionaries and curated taxonomies from industry standards.
The invitation to visit the website and participate in testing the early version of Terraphim aims to build a privacy-first AI assistant tailored to users’ daily workflows.
So, on one side of such a semantic model, you will have formal structures, taxonomies, and ontologies. On the other side, there will be natural language responses from subject matter experts. You can keep your particular professional lexicon in your response to the AI-generated question, and it will still be searchable and discoverable because it is mapped into the controlled vocabulary of the systems engineering digital process model, the WAND taxonomies, or the SFIA terms. The AI assistant takes natural language input and tags it with skills taxonomies, which gives you the power of formal models for search but doesn’t require much effort to build such models.
But we went even further. Tagging the content significantly improves search relevance, but you will use several thousand formal terms from about 15-20 taxonomies in an average project. Every piece of content will be tagged with different terms from different domains. For example, the same project proposal will be classified at least from the project management, engineering, finance controlling, and corporate compliance perspectives. Even for a small project with a team of five people, we will have mixed relevancy ranking just because the project manager, developer, and finance controller will have different preferences for how to rank search results. Different roles imply different relevancy rankings and different taxonomies to augment the model. With role-based search, for the same search query, the project manager will find one person and the engineer another, because they do different jobs. Thus, with an action-centric approach, they will benefit from varying recommendations.
We implemented the role relevancy mechanism in the Terraphim AI assistant. Please see this demo of the search relevance changing once the user selects different roles. Terraphim performs a two-step search – first, we define domain relevance using keywords, and then we define further action relevancy using the role-based search mechanism. It may sound vague and complicated, but all these improvements come from your note-taking application: Notion, Fibery, Logseq, or Obsidian. We can also process GitHub repositories or Jira issue trackers. You don’t need to do much on top of what you are most likely already doing as part of your daily routine.
The Terraphim role combines three things: its own set of data sources; hashtags, objects, and role-specific activities; and, finally, preferences in prompts and the history of search results. Let’s dive into them. The combination of data sources means that when I am in the father’s role, I consider the class’s WhatsApp chat and the educational platform relevant sources of information and the project workspace irrelevant – I don’t want to get any results, even well-fitting ones, from my work workspaces. And vice versa, if I am searching for something from the project manager position, I don’t want to get results from my personal e-mail or family photos folder. Suppose I select the role of a systems engineer. In that case, I have specific high-priority hashtags such as life cycle model, system configuration, or concept of operations, while, for example, a project’s budget has low relevancy in my search. The situation changes when I switch to the project manager role – project budget becomes a highly relevant hashtag, as well as work breakdown structure or deliverables, but system requirements do not matter much to me, and that affects search results ranking. And finally, the user can approve or ignore search results, fine-tuning the assistant’s relevancy function.
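The role description above can be sketched as a data structure. The field names below are our illustrative assumptions, not the actual Terraphim data model: a role bundles its data sources, weighted hashtags, and the approve/ignore feedback used to tune relevancy, and ranking a result simply sums the weights of the role’s hashtags it carries.

```rust
use std::collections::HashMap;

/// Sketch of a role: data sources, weighted hashtags, and relevancy feedback.
struct Role {
    name: String,
    data_sources: Vec<String>,             // e.g. a Notion workspace or GitHub repo
    hashtag_weights: HashMap<String, f64>, // role-specific ranking weights
    approved: Vec<String>,                 // results the user approved
    ignored: Vec<String>,                  // results the user ignored
}

impl Role {
    /// Rank a search result by the weights of this role's hashtags it carries.
    fn rank(&self, result_tags: &[&str]) -> f64 {
        result_tags
            .iter()
            .filter_map(|tag| self.hashtag_weights.get(*tag))
            .sum()
    }
}

fn main() {
    let pm = Role {
        name: "project manager".into(),
        data_sources: vec!["notion://project-workspace".into()],
        hashtag_weights: HashMap::from([
            ("project budget".into(), 3.0),
            ("deliverables".into(), 2.0),
            ("system requirements".into(), 0.5),
        ]),
        approved: vec![],
        ignored: vec![],
    };
    // The same document ranks differently depending on the selected role.
    println!("{}", pm.rank(&["project budget", "system requirements"]));
}
```

Switching roles would swap in a different `hashtag_weights` map, which is what changes the ranking for the same query.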
And that’s it. This is the first Terraphim use case we are implementing now. We are streamlining project staffing with engineers by automating the steps in between, using a controlled dictionary and curated taxonomies that we ingest from industry standards and the catalog of the WAND company, and we are using the roles mechanism to augment search relevancy. Thank you very much for your time; please visit our website and participate in testing the early version of the product. Please help us build your personal, privacy-first AI assistant that fits your daily workflow.
While working on the last project, I discovered a new joint use case with Atomic Server. When working with business intelligence and reporting systems and using particular templates, reports, or dashboards, you select concepts relevant to your active role, such as accounts, customers, or transactions. You can build an Atomic Server integration layer that helps you collect your role-specific data into a personal repository and use it for semantic search. It looks like a straightforward implementation.
Atomic Server appears to be almost perfect for domain-driven development and rapid prototyping of process automation. We’re making a small project now as a proof-of-concept for that approach.
There is a clear path from the model-based concept development of an IT system to the domain-driven development using Atomic Server, especially if we identify curated domain taxonomies from the user interview, and use rapid prototyping techniques.
Terraphim searches over local Markdown files (the haystack) using the ripgrep command and maps each matched file into the Article struct. To present results to the user in the UI, I must create a hashmap of (article_id, Article) pairs, with a tag corresponding to the matched concept (from Logseq).
I want to save this hashmap as a collection of articles in Atomic using the Rust atomic-lib, where each collection is named after the Terraphim role. This will allow me not only to visualize results but also to replace the popup modal or the URL link with a link to the Atomic Server document editor for the articles.
So, as a user, I will be able to search over my local markdown files, automatically populate the atomic server with search results, and, once the article is ready, make it public.
For example, I have a set of articles tagged with #learning-rust. I can search over the tags, create a collection, and then make it public once it is in good shape. Then we only need to add a SvelteKit route for it to become available on the learning-rust.org domain. We plan to populate the systems.tf domain with our work.
Example of working code:
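The original snippet was not preserved in these notes, so below is a minimal sketch of the haystack-to-Article mapping described above. The `Article` fields, the `to_articles` helper, and the stubbed ripgrep hits are assumptions for illustration; the real code invokes ripgrep on the haystack and writes collections to Atomic Server through atomic-lib.

```rust
use std::collections::HashMap;

/// Sketch of the Article struct the notes mention; field names are
/// assumptions, not the real Terraphim definition.
#[derive(Debug, Clone)]
struct Article {
    id: String,
    title: String,
    body: String,
    /// matched concept, e.g. a Logseq tag
    tag: String,
}

/// Map ripgrep matches (file path, matched line) into the
/// (article_id -> Article) hashmap that the UI consumes.
fn to_articles(matches: &[(&str, &str)], concept: &str) -> HashMap<String, Article> {
    matches
        .iter()
        .map(|&(path, line)| {
            // Derive a flat id from the file path.
            let id = path.replace('/', "-");
            let article = Article {
                id: id.clone(),
                title: path.rsplit('/').next().unwrap_or(path).to_string(),
                body: line.to_string(),
                tag: concept.to_string(),
            };
            (id, article)
        })
        .collect()
}

fn main() {
    // In Terraphim the (path, line) pairs come from running ripgrep over
    // the haystack; two hits are stubbed here to keep the sketch self-contained.
    let hits = [
        ("notes/rust/ownership.md", "Ownership is the key idea #learning-rust"),
        ("notes/rust/lifetimes.md", "Lifetimes, once more #learning-rust"),
    ];
    let articles = to_articles(&hits, "learning-rust");
    assert_eq!(articles.len(), 2);
    assert!(articles.contains_key("notes-rust-ownership.md"));
    // Each Article could now be saved via atomic-lib into a collection
    // named after the active Terraphim role, and opened in the Atomic
    // Server document editor.
}
```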
A domain-driven development simulator is a promising area of application for Atomic Server and Terraphim. You watch any game simulator (it can be a very simple one for phones, like airport management, or pretty complicated, like the Soviet Republic or Dyson Sphere Program), and then model the game domain in the Atomic instance, using domain taxonomies. Each game bit becomes an exciting and challenging exercise to build a domain model, classes, and ontology. I am waiting to complete the current project, so I am free to do that in my spare time.
The Innovate UK use case “Generate a proposal to collaborate from the project scope” is a promising one. When you are delivering a complex project, you often need to identify which skills your team lacks and find a good contractor to fill that gap. It takes time to understand exactly what those skills are if you don’t follow a proper procedure supported by a good tool. That’s why it is promising: in small start-ups, everyone must be highly effective.
Interesting source:
The near-term impact of AI on the cyber threat - NCSC.GOV.UK
- Artificial intelligence (AI) will almost certainly increase the volume and heighten the impact of cyber attacks over the next two years. However, the impact on the cyber threat will be uneven (see table 1).
This is one more reason to always use personal AI assistants that follow security protocols. Otherwise, any slip in security measures can lead to breaches, loss of money, identity theft, and other dire consequences.
Over and over again, knowledge management and knowledge transfer between generations of engineers surfaces in discussions—this time, in the oil and gas industry. I remember an engineering archaeology case.
I’ll just put it here, anyway; too often, I need to refer to this text.
INSTITUTIONAL MEMORY AND REVERSE SMUGGLING
Institutional memory comes in two forms: people and documentation. People remember how things work and why. Sometimes, they write it down and store that information somewhere. Institutional amnesia works similarly. The people leave and the documents disappear, rot, or just become forgotten (as it were).
I worked for several decades at a large petrochemical company. In the early 1980s, we designed and built a plant that refines some hydrocarbon-type stuff into other hydrocarbon-type stuff. Over the next thirty years, the institutional memory of this plant faded to a dim recollection. Oh, it still operates and still makes money for the firm. Day-to-day maintenance is performed, and the skilled local crew is familiar with the controls, valves, safety systems, etc.
But the company has forgotten how it works.
A few things conspired to make this happen:
• The downturn in the oil industry through the 1980s and 1990s caused a moratorium on new hires. By the late 1990s, our group was a mix of people over 55 and under 35, with few in between.
• We gradually made the move to fully computer-based design.
• A series of group reorganizations physically moved our office several times.
• A major corporate merger several years after that completely dissolved us into a larger petrochemical firm, causing a significant institutional and personnel shakeup.
Institutional archaeology
In the early 2000s, several of my colleagues and I retired.
In the late 2000s, the company remembered that this plant existed and thought about doing something with it. Specifically, increase output by debottlenecking one unit and doing a feasibility study on the addition of a second unit.
Now they had a problem. How was it built? Why was it built like that? How does it work?
Institutional memory grows hazy at this point. The alien machinery hums along, producing polymers. The company knows how to service it but isn’t quite sure what arcane magic was employed in its construction. In fact, nobody is even sure how to start investigating.
It falls to some of the then-younger engineers, now the senior cohort, to dig up documentation. This is less like institutional memory and more like institutional archaeology. Nobody has any idea what documentation exists on this plant, if any, and if it exists, where it is, or what form it might take. It was designed by a group that no longer exists, in a company that has since merged, in an office that has been closed, using non-digital methods that are no longer employed.
The first step is finding out what the plant’s name is. It turns out that most engineers use a colloquial name based on its location, and it has another official name. Several of them, even. There is the name of the internal project that designed it, and the name of the joint venture under which it was built.
A unique ID was assigned in 1998 as part of a document-management revamp. There is another unique ID that was assigned in 2001 for digitization purposes. Incidentally, it’s not entirely clear which document management systems are current. Also, some of them point to other document-management systems.
No luck here. The 1998 ID points to documents located in a “library” at an address that hasn’t existed since long before 1998. This might explain why the 2001 ID doesn’t point to any digitized documents older than some recent reports on routine maintenance. At the time, I had naively hoped digitization would solve our problems forever. My manager was reading a dense book about it that I picked up out of curiosity. It had seemed persuasive.
But the old-fashioned phone and email tree worked a bit better. The old research division remains mostly intact, and its physical library exists. Someone there can find documentation on the plant’s polymer processes and copies of some engineering documents duplicated for the R&D library’s local records. Big paper blueprints and engineering drawings, as well as books of data, in dusty filing cabinets. The paper documents tauntingly sport IDs announcing that Big Digitization Corp had digitized them at some point. Who knows what happened to that archive?
Deciphering documentation
Some documents are assembled, and the engineers get to work trying to get a handle on organising a debottlenecking project. Unfortunately, the documents seem to be written partially in hieroglyphics and are only partly complete. They make some plodding progress. The manager half-jokes that engineering schools should teach a course in engineering archaeology, where students are given a pile of 30-year-old documents and asked to figure out what’s going on. I like the idea. Maybe even throw in an old engineering textbook, like collectors repairing old vacuum-tube electronics.
Some methods and notation are familiar, but others are long obsolete. Even where nothing has officially changed, cultural assumptions about what should be documented explicitly or can be assumed have changed, making interpretation difficult. And it would be nice to have a big-picture overview book. At the end of the project, someone should’ve been commissioned to write a book, “What This Goddamn Plant Is and How It Works”. That book is effectively being written now only by archaeologists.
Reverse corporate espionage
A former colleague and I were contacted sometime after this by another former colleague who now had a management role in this group. Would we be amenable to consulting part-time on a project relating to the old Plantname? I agreed. It sounded interesting, and I was offered an hourly rate several times my previous salary.
Thus, I landed the strange job of explaining to the company how its own plant worked.
I could draw on several kinds of personal memory for this job. I remembered how some things worked; the 30-year-ancient engineering practices were my own. More importantly, I knew what was important and how the pieces fit together.
Perhaps equally importantly, I unofficially had some documentation. During our office moves and reorganizations, the document situation became increasingly dire. I would wait days to get something mailed to me after tracking down a series of merged document libraries, some of which were halfway through the digitization processes. Paranoid corporate management also had rules about anything relating to trade secrets, which meant anything relating to the polymer process, making it hard to work while visiting contractors’ offices.
So, we developed a don’t-ask/don’t-tell policy of making private copies of documents and carrying them around with us. Engineers, to generalize, hate waiting around for stupid reasons, and having documents meant that we could get to work. It also made us look better since we got things done on time instead of having to send out lame excuses that we were late because we were waiting on a fax.
My job now was to smuggle these documents back into the company. I would be happy to hand them over. But that doesn’t make any sense to the company. The company officially has these documents (digitally managed!), and officially I don’t. The situation is the reverse, but who wants to hear that? God knows what official process would let me fix that.
No, the documents must be returned to where they ‘already were’ unofficially. Physical copies are made and added to the local group library. Eventually, they’ll probably work their way into the digital document management system the next time someone canvasses and notices some documents with no inventory control tags. I hope they aren’t lost this time because I won’t be around in another 30 years to smuggle them back in again.
Oh, and as an external consultant, I’m not allowed to know some of the trade secrets in the documents. The internal side of the team needs to handle the sensitive process information and be careful about how that information crosses boundaries when talking to external consultants. Unfortunately, the internal team doesn’t know the secrets, while I do. I even invented a few of them and have my name on related patents. Nonetheless, I must smuggle these trade secrets into the company so the internal side can handle them. They have to ensure they don’t accidentally get them from me.
We hear a lot about the spy-movie kind of corporate espionage. I’d love to read a study of reverse corporate espionage, where companies forget their secrets and employees must get them back unofficially. I’m convinced it happens more than you’d think.
A solvable problem?
I’m not sure what the moral of this story is.
Better organization and document management could solve some of the problems. However, attempts to fix corporate document management also cause some issues, so one has to be careful. We might’ve had better luck if more of the physical office libraries still existed. We only retained some of the documents because one of them did.
A memory of techniques and importance is even more challenging. Maintaining a continuous gradient of ages in the company probably helps you not fall off a memory cliff when one cohort of employees retires.
But maybe engineering archaeology will always exist. The more I look around, the more the engineering world looks like underground New York City once you go back more than a few years. A mass of strange engineering feats humming away from sight, produced by long-forgotten ancient peoples, leaving only fragmentary maps and diagrams.
—An engineer, 2011-12-04
Prepare your data for searches in Copilot for Microsoft 365 - Training | Microsoft Learn
Tips and advice on how to prepare the content:
Clean out redundant, outdated, and trivial (ROT) content. Perform an extensive audit of all organizational content, including documents, emails, chats, and wikis. Remove any outdated materials that are no longer accurate or relevant. For example, delete old product spec sheets from 5+ years ago, promotional emails for expired campaigns, and resolved IT ticket conversations.
I am not sure about that. There are many cases when you would need those materials, especially when you need to recover the context of some design or organisational decision or otherwise recover a history lesson. Cleaning should not mean removing and deleting; content should be versioned, stored, and deprioritised in search results, but never “cleaned”.
The domain-driven technique of meaningful conversations.
When you hear a long and complex question or reply in a meaningful discussion, you should always check to see if your understanding is correct. You need to reiterate it in your own words. However, you cannot spend too much time on it; you should simplify and extract a scheme of the argument. And it is always the same:
- Facts
- Logic
- Deductions
- The ontology and epistemology behind the logic and facts: how do you obtain the facts and come up with the logic, and what are the limits of those facts and of that logic?
No fact and no logic are universal, so before rebutting the arguments, you should align domains and ensure a proper understanding of facts, logic, and presuppositions behind the logic.
This gives you the reiteration schema: “Did I understand you correctly? You were talking about those facts, and you used the following logic to reach this conclusion.” If they agree with that, you can build your counter-argument by questioning the facts (are all the relevant ones included? Is something missing or, vice versa, extraneous?); you can go for the logic, which may contain weak links; or, finally, you can go for the ontological and epistemological grounds: after all, how have all those facts been established, and how sound is the logic?
You listen carefully to what other people say, populate this schema in a notebook, and then prioritise the main point you will go for. After that, it is a straightforward version of Minto’s pyramid of the central thesis/counter-thesis and arguments. It’s like musical improvisation: simple on the surface but fun and complex to execute.
This is a knowledge management process model from the systems engineering handbook. It’s interesting that knowledge management process outputs are not used anywhere; they are a dead end. No wonder it is so difficult to sell such solutions—there is no consensus on how to use process outputs. Shouldn’t we come up with some proposals on the subject?
First, last week the INCOSE knowledge management working group scheduled my report on the Terraphim approach to domain knowledge management in SE processes for April 19th. I will apply John Fitch’s decision patterns to the project/program stage-gate model. Knowledge is what helps make decisions, and a knowledge increment is what decreases uncertainty (not ambiguity; see Policy in 500 words: uncertainty versus ambiguity | Paul Cairney: Politics & Public Policy to understand the difference).
Second, I conducted an introductory session for La Universidad Autónoma de Bucaramanga on the Terraphim showcase and demo website. There, I explain how new roles emerge and are configured in Terraphim: how do we go from contract negotiations and terms to engineered implementation descriptions and terms?
The OODA loop in systems thinking: the “O”
- Victor Vakhstein, in his book “Imagining a City,” described a case from his early days in psychiatric practice, when he and his supervisor interviewed a person with a mental health condition. When they compared their notes, he discovered a fascinating thing. His notes were direct citations, a transcript of what the patient said, but the doctor’s notes were translated into the domain language, using only professional terms. This is what we often discover when we visit a doctor. We explain what we feel in plain language, but we never find exactly what we said in our medical chart. There will be only medical terms. Professionals see the world through their domain lenses, a deciphering grid, and encounter a completely different picture than people who do not have such knowledge.
- In science and technology studies, this is called “translation.” One narrative is translated into another using controlled vocabulary and logic. Usually, such concise translation is a compact schema. A typical example would be a dialogue between professionals in front of a flip chart, where a two-hour discussion is expressed on a one-page diagram with notes. Or just a 20-page contract that represents the results of two months of tight negotiations with hundreds of pages of transcripts for a software development project. Or, a fancy example of a large language model that represents a vast text corpus with a reasonably compact transformer model. One fundamental feature of such compact representations is that those translations are, to some extent, reversible. What one expert has written, another can read. Of course, there is a phenomenon of tacit, non-formal knowledge, but in most cases, we read contracts, drawings, and reports without problems.
- For a communication channel consisting of a transmitter, channel, receiver, feedback, and noise to function correctly, you need competent experts capable of correctly encoding and decoding the domain terms and correctly interpreting the objects, situations, and patterns at both ends of the channel. And here lies a problem. Systems engineering and project management are complicated disciplines requiring much real-world experience. Such real-world experience produces a grand variety of terms in use. An experienced professional can easily recognise such synonyms, find the proper concepts in the discipline, and codify or interpret the message using their professional process model; in our case, it is the SE Handbook.
- What you see defines how you act. If you see a specific symptom, there is a prescribed way to treat it. If you see a system requirement or system operations report, you refer to the SE handbook and act according to the process model. A lot of ambiguity in situations disappears when we know who we are, what role we play, and what we see is exactly what we see. The better both parts are prepared, the more efficient communication and collaboration are.
- Recognising objects is part of the observation step in the OODA loop. This essential skill is rarely trained by itself, which is an opportunity to improve the practical education of engineers. I use Grammarly, which helps me tremendously to master my writing skills in English, correcting my choice of words, grammar, and punctuation. The same can be done with systems engineering skills; we should be trained to see the domain concepts from the systems engineering process model and the handbook in real life. This is what we are building now in Terraphim - activating knowledge, not only managing it. Stay tuned.
[[Terraphim]] engineering checklist question generation
- First of all, I conceptualise checklists as a set of competency questions that help infer the situation in an engineering project or a system’s operation.
- people.cs.uct.ac.za/~mkeet/files/CLaROv1DemoSmall.mp4
- This demo shows the semantic patterns for 56 competency questions and the method of specifying those questions with domain terms and activities. mkeet/CLaRO: Competency question Language for specifying Requirements for an Ontology (github.com)
- More on the subject [1811.09529] Competency Questions and SPARQL-OWL Queries Dataset and Analysis (arxiv.org), kgswc23agoqs.pdf (meteck.org) and Steps toward automatically generating competency questions for ontologies | Keet blog (wordpress.com)
The Atomic Server ontology creation feature: a 3-minute video tutorial
Creating an ontology using AtomicServer (loom.com)
A new review from Yannic has clarified the way I think about one of the use cases:
- We find relevant sources with Terraphim.
- We tag them appropriately to the selected Terraphim role.
- We can add additional tags and structures using the Terraphim method https://systems.tf/article/New-Article-c77immvoso8/article/New-Article
- We pack it all together in an atomic resource, again, using the example above for the Terraphim method.
- We add it as an example of relevant information into retrieval augmented generation.
Thus, we have a textbook-quality RAG corpus that we can produce from conversations or unprocessed texts. This plays directly into our Innovate UK use case of matching a project description with a collaborator’s CV or GitHub repository.
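The five steps above can be sketched as packing role-tagged sources into a single resource that later serves as the retrieval context. This is a hedged illustration only: `TaggedSource`, `pack`, and the plain string keys are invented for the example; a real Atomic resource would use ontology property URLs and be written via atomic-lib.

```rust
use std::collections::BTreeMap;

/// One source found by Terraphim and tagged for the active role.
/// Field names are assumptions for this sketch.
struct TaggedSource {
    url: String,
    role_tags: Vec<String>,
    excerpt: String,
}

/// Pack tagged sources into one property map, a stand-in for the Atomic
/// resource that is later added as an example to retrieval-augmented
/// generation. Plain string keys replace real ontology property URLs.
fn pack(role: &str, sources: &[TaggedSource]) -> BTreeMap<String, String> {
    let mut resource = BTreeMap::new();
    resource.insert("role".to_string(), role.to_string());
    for (i, s) in sources.iter().enumerate() {
        resource.insert(format!("source-{i}-url"), s.url.clone());
        resource.insert(format!("source-{i}-tags"), s.role_tags.join(","));
        resource.insert(format!("source-{i}-excerpt"), s.excerpt.clone());
    }
    resource
}

fn main() {
    let sources = vec![TaggedSource {
        url: "notes/collaboration-proposal.md".into(),
        role_tags: vec!["work-breakdown-structure".into()],
        excerpt: "We lack embedded-firmware skills for one work package.".into(),
    }];
    let resource = pack("project manager", &sources);
    assert_eq!(resource["role"], "project manager");
    // `resource` would now be written to Atomic Server and cited as a
    // textbook-quality example in the RAG prompt.
}
```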

