To understand Microsoft’s document management strategy in Office 365, it’s more instructive to learn about Delve, MS Teams, and Project Cortex than SharePoint.
The move to the cloud had a massive impact on document management, although document management systems (like SharePoint) have changed relatively little.
What has changed is that cloud suites like Office 365 and G Suite have created a much closer relationship between document management systems and the other business systems that hold "unstructured" data, such as email systems, file shares, and IM/chat. This closer relationship is fuelling the development of the AI capabilities that the major cloud providers include in their offerings. Project Cortex, which should be available for Office 365 in 2020, is the latest example of an AI capability built on the ability to map the connections between content in the document management system and communication behavior in email and chat.
SharePoint in the on-premise era
In its on-premise days, SharePoint was in many ways a typical enterprise document management system. It was the type of system in which:
It was the type of system that an organization might call a "corporate file system", because documents within the system are likely to have better metadata and to be better managed than documents kept elsewhere.
SharePoint in the Cloud Era
In Office 365, SharePoint's role is different. Its primary role is to provide document management services (through its document libraries) and (small) data management services (through its lists) to the other applications in the Office 365 family, and especially to MS Teams.
SharePoint can still be configured to prompt users to add metadata to documents. However, users have three faster alternatives to transfer a document to a document library:
SharePoint can and should continue to have a logical business structure, but MS Teams can begin to reduce the coherence of this structure. Each new team in MS Teams must be linked to an Office 365 group. If there is no existing group for the team, a new group must be created. The creation of a new Office 365 group provisions a SharePoint site that stores the documents sent through that team's channels. Each time a private channel is created in the team, a further SharePoint site is created.
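To make the effect on the site structure concrete, here is a minimal sketch of the provisioning rule described above: one site per team (via its group) plus one per private channel. The `Team` class, the site naming, and the example teams are all hypothetical; this is a counting model, not a call to any Microsoft API.

```python
from dataclasses import dataclass, field


@dataclass
class Team:
    """Hypothetical model of a team and the SharePoint sites it causes to exist."""
    name: str
    private_channels: list[str] = field(default_factory=list)

    def site_names(self) -> list[str]:
        # One site comes with the team's Office 365 group...
        sites = [f"{self.name} (group site)"]
        # ...and each private channel adds a further site of its own.
        sites += [f"{self.name} - {channel} (private channel site)"
                  for channel in self.private_channels]
        return sites


def count_sites(teams: list[Team]) -> int:
    """Total number of SharePoint sites implied by a set of teams."""
    return sum(len(team.site_names()) for team in teams)


# Made-up teams for illustration.
teams = [
    Team("Finance", private_channels=["Audit", "Payroll"]),
    Team("Bid 2020"),
]
total_sites = count_sites(teams)
```

Two teams and two private channels already imply four sites, none of which was placed in the structure by an information architect.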
SharePoint still has a powerful enterprise search center, but it competes with Delve, a personalized search tool that sits within Office 365 but outside SharePoint. Delve searches not only documents in SharePoint but also documents in OneDrive for Business, and even attachments to emails.
SharePoint can still be configured to apply retention rules to its own content by applying policies to content types or directly to libraries. However, a simpler and more powerful way to apply retention rules to content in SharePoint is provided outside SharePoint, in the retention menu of the Office 365 Security and Compliance Center. This retention menu is equally effective at applying retention rules (through Office 365 retention policies and/or labels) to SharePoint sites and libraries, Exchange email accounts, Teams, Teams chat users, and other aggregations in the Office 365 environment.
Microsoft's Attitude to Metadata
Microsoft Office 365 is a juggernaut. It is evergreen software: updates are released regularly and take effect immediately. It faces intense competitive pressure from another giant (Google). It must win and hold a global mass customer base to achieve the economies of scale on which cloud computing business models depend.
Information architects of one type or another are part of the Office 365 ecosystem. Like every other part of that ecosystem, they are affected by shifts, advances, and changes in the capabilities of the evergreen, ever-changing Office 365. Suppliers in the Office 365 ecosystem look for gaps in the offering. They do not know how long a particular gap will exist, but they do know that there will always be a gap, because Microsoft is trying to meet the needs of a mass market, not the needs of that percentage of the market with particularly strong needs in a specific area (governance, information architecture, records management, etc.).
The niche that SharePoint information architects have previously occupied in the Office 365 environment is being changed (but not reduced) by Microsoft's strategy to promote:
Microsoft's need to attract and retain a mass customer base means that they need document management to work without an information architecture specialist, because there are not enough information architecture specialists to help more than a minority of their customers.
Microsoft's plan to make SharePoint a background rather than a foreground element of Office 365 will take some time to run its course. This gives us time to think about the next gap. What will be the gap for information architects once SharePoint has been reduced to a back-end library and list holder for Teams, Delve, Cortex and Microsoft Graph?
To find a suggested answer to this question, this post looks a little more closely at how and why Microsoft's document management model has changed between standalone on-premise SharePoint and SharePoint Online embedded in Office 365.
The model of the on-premise document management system for companies
On-premise enterprise document management systems, up to and including on-premise SharePoint, were built on the assumption that an enterprise document management system can stand apart from the systems (including email systems) that transport documents from person to person.
This assumption rested on the idea that good metadata about a document would be captured at the time it entered the system and updated with every subsequent revision. This metadata would provide enough context about the documents stored in the system to remove the need for medium- or long-term retention of the messages that accompanied those documents when they were sent from sender to recipient.
The model relied on very good information architecture to ensure the following:
The problem with this model is that it is not possible to design an information architecture for a company-wide stand-alone document management system that describes documents in such a way that documents are understandable and manageable across all the different activities of an organization. You can do this for some parts of the system, but not for the entire system.
There are two ways to set up an information architecture: top-down or bottom-up. Neither approach works company-wide:
There is a way to solve this information architecture problem. It includes:
This already looks like a graph: like the Facebook social graph that drives search on Facebook, the Google Knowledge Graph built into Google Search, and the Microsoft Graph that provides the enterprise social graph for Delve and Project Cortex in Office 365.
Enterprise social graphs
An enterprise social graph is an established set of connections between:
Providing a graph significantly reduces a system's dependency on metadata added by an end user (or a machine) at the time a document is uploaded. The mere fact that a particular end user uploaded a document to a specific location in the system already connects that document to the graph. The graph connects the document to the other people, topics, and entities associated with the person who uploaded it.
Graphs consist of nodes (people, objects and topics) and edges (the relationships between the nodes).
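As a rough illustration of nodes and edges, and of how a bare upload event is enough to connect a document to a person's existing context, here is a minimal in-memory sketch. The `Graph` class, the relationship labels, and the example names are assumptions made for illustration; they do not reflect how Microsoft Graph is actually stored.

```python
from collections import defaultdict


class Graph:
    """A tiny labelled graph: nodes are (kind, name) pairs, edges carry a relationship label."""

    def __init__(self):
        self.edges = defaultdict(set)  # node -> {(relationship, other_node)}

    def connect(self, source, relationship, target):
        # Store the edge in both directions so either node can be a starting point.
        self.edges[source].add((relationship, target))
        self.edges[target].add((relationship, source))

    def neighbours(self, node):
        return {other for _, other in self.edges[node]}

    def within_two_hops(self, node):
        """Everything reachable directly or through one intermediate node, excluding the start."""
        first = self.neighbours(node)
        second = set().union(*(self.neighbours(n) for n in first))
        return (first | second) - {node}


graph = Graph()
alice = ("person", "Alice")
report = ("document", "Site survey.docx")
graph.connect(alice, "works_with", ("person", "Bob"))
graph.connect(alice, "works_on", ("topic", "Bridge refurbishment"))

# The upload itself is the only "metadata" supplied for the document.
graph.connect(alice, "uploaded", report)

related = graph.within_two_hops(report)
```

After the single upload edge, `related` contains Alice, her colleague Bob, and the topic she works on, even though nobody typed any of that into a metadata form.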
The concept of the graph has enormous potential in information architecture. You can narrow the range of allowed values for each metadata field for each document that a person contributes to a system by ensuring that the system knows what role they are in at the time the document is uploaded.
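A minimal sketch of that narrowing, assuming a single "record type" field and a lookup from role to permitted values. The vocabulary, the roles, and the fallback behavior are all hypothetical choices made for the example.

```python
# Hypothetical controlled vocabulary: every value a "record type" field could take.
ALL_RECORD_TYPES = {"invoice", "contract", "site survey", "risk register", "payslip"}

# Hypothetical mapping from a person's current role to the values that make sense for it.
ROLE_VOCABULARY = {
    "accounts clerk": {"invoice", "payslip"},
    "project engineer": {"site survey", "risk register"},
}


def allowed_values(role: str) -> set[str]:
    """Return the record types to offer someone acting in the given role.

    Unknown roles fall back to the full vocabulary rather than blocking the upload.
    """
    return ROLE_VOCABULARY.get(role, ALL_RECORD_TYPES) & ALL_RECORD_TYPES


engineer_choices = allowed_values("project engineer")
visitor_choices = allowed_values("visitor")
```

The engineer is offered two values instead of five. In a real graph the role would be inferred from the person's connections rather than passed in as a string.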
This path to intelligent metadata also leads us away from the idea of the document management system as an independent system.
If we see a document management system as a world in itself, we will never be able to collect enough metadata to understand the documents in the system. Better start with the idea that the documents that a person creates are just a manifestation of their work and are related and interdependent with other manifestations of their work, such as their correspondence, their chats, and their contributions to different business databases.
We can also differentiate between a knowledge graph, which is built up from what an organization formally knows, and a social graph, which is built up from the behavior of people in the organization's information systems. The cloud providers initially provided us with a social graph. Over time, this social graph can improve to become more like a knowledge graph, and we will see below, when we look at Project Cortex, that Microsoft is taking some steps in that direction. However, there is still a long way to go before the enterprise social graph provided by Microsoft has the precision of an ideal knowledge graph. Notice the word "ideal" in that sentence: I have never worked in an organization that has managed to put a knowledge graph (as opposed to a social graph) into operation.
The nature of the ideal knowledge graph varies from organization to organization. An engineering firm needs a different kind of graph from a foreign ministry, which needs a different kind of graph from a bank, and so on.
In an engineering firm, an ideal knowledge graph would connect:
These different data sets and vocabularies can be mapped onto one another in a graph independently of any document. Once the graph has been created, a document can be assigned a value for one of these characteristics, and the range of possible values for all its other characteristics should narrow accordingly.
In a foreign ministry, an ideal knowledge graph would connect:
These too can be mapped to one another independently of documents. Employees can be mapped to the countries in which they are based or which they follow, or to the thematic topic on which they work.
The concept of a graph (whether a knowledge graph, a social graph, or a mixture of both) shows that an organization's data, document, and messaging systems are all interdependent. The graph becomes more powerful from a machine learning and search perspective when it is fed with the events that take place in different systems. When a person emails a document to another person, it reinforces or recalibrates the graph's picture of who that person is working with and on which projects or topics.
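One way to picture this reinforcement is as a running weight on the edge between two people that grows each time an event links them. The sketch below assumes three event types with made-up weights; the real signals and their weighting inside Microsoft Graph are not described in the sources for this post.

```python
from collections import Counter

# Hypothetical weights: how strongly each kind of event counts as evidence of working together.
SIGNAL_WEIGHTS = {"emailed_document": 3, "co_edited": 2, "viewed": 1}


class AffinityGraph:
    """Keeps a running weight for each pair of people."""

    def __init__(self):
        self.weights = Counter()

    def record(self, signal: str, person_a: str, person_b: str) -> None:
        # Sort the pair so (A, B) and (B, A) update the same edge.
        edge = tuple(sorted((person_a, person_b)))
        self.weights[edge] += SIGNAL_WEIGHTS[signal]

    def closest_colleagues(self, person: str) -> list[str]:
        """Colleagues ordered from strongest to weakest connection."""
        scores = Counter()
        for (a, b), weight in self.weights.items():
            if person == a:
                scores[b] += weight
            elif person == b:
                scores[a] += weight
        return [name for name, _ in scores.most_common()]


affinity = AffinityGraph()
affinity.record("viewed", "Alice", "Carol")
affinity.record("emailed_document", "Alice", "Bob")
affinity.record("co_edited", "Bob", "Alice")
ranking = affinity.closest_colleagues("Alice")
```

Events from email and from co-editing land on the same edge, which is the point: the graph only becomes useful when several systems feed it.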
Information architects still have to pay attention to the configuration of particular systems, and enterprise document management systems come with more configuration options than any other information system I can think of. However, they should pay equal attention to the configuration of the enterprise social graph, which the document management system, along with the organization's other systems, will both contribute to and draw from.
The next section examines why both end users and Microsoft tended to move away from user-added metadata in SharePoint.
SharePoint and user-added metadata
In a recent IRMS podcast, Andrew Warland reported that an organization he worked with synchronized their SharePoint document libraries with Explorer, and that most users preferred to use the 'Explorer View' rather than access their SharePoint documents via the browser.
This preference for Explorer view over browser view is counter-intuitive. The browser view offers the full visual experience and the full functionality of SharePoint, while the Explorer view reduces SharePoint to one big shared drive. But it becomes understandable when we think about the relationship between functionality and simplicity. Those who buy and configure information systems typically want to maximize the functionality of the system they buy or implement. Those who use it tend to want to maximize simplicity. The two are in tension: the more powerful the functionality, the more complex the choices presented to end users. The simplest thing a document management system has to do is allow users to add and view documents. Explorer view supports these two tasks and nothing else.
At this point I would like to add an important caveat. Andrew did not say that all end users preferred Explorer view. Some areas of the organization had more complex document library setups that they valued, and they were willing to keep adding and using the metadata. However, if the hypothesis put forward earlier in this article is correct, it is not possible, when rolling out a standalone document management system, to configure targeted metadata fields with context-specific controlled vocabularies for every team in an organization.
Graham Snow pointed out in this tweet that a disadvantage of synchronizing document libraries with Explorer is that a user is not prompted to add metadata when adding a document. This raises two questions:
First, let's confirm that metadata is indeed important. To understand a particular version of a particular document, you need to understand three things:
This gives a clue as to why many end users are disinclined to tag documents with metadata. When a document is shared via email, the end user already holds the metadata that answers these three important questions, in the form of an email in their email account. The email account records who the document was shared with (the recipients of the email), when (the date of the email), and why (the message in the email). One question we might ask is why we have not routinely tried to add to each document's metadata the details we could capture from the email system when it is sent as an attachment. These include the date the document was sent, the identity of the sender, and the identities of the recipients.
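As a sketch of how little work that capture would take, the following uses Python's standard `email` package to read the who, when, and why from a message and attach them to each attachment's record. The field names in the returned dictionary (`sent_by`, `sent_to`, `sent_on`, `why`) and the example message are made up for illustration; no document management system is assumed.

```python
from email.message import EmailMessage
from email.utils import getaddresses, parsedate_to_datetime


def attachment_metadata(message: EmailMessage) -> list[dict]:
    """Return one metadata record per attachment, taken from the carrying email."""
    sender = getaddresses(message.get_all("From", []))
    recipients = getaddresses(message.get_all("To", []) + message.get_all("Cc", []))
    body = message.get_body(preferencelist=("plain",))
    shared = {
        "sent_by": [address for _, address in sender],
        "sent_to": [address for _, address in recipients],
        "sent_on": parsedate_to_datetime(str(message["Date"])).isoformat(),
        # The covering note is the closest thing the email holds to "why".
        "why": body.get_content().strip() if body is not None else "",
    }
    return [{"filename": part.get_filename(), **shared}
            for part in message.iter_attachments()]


# Build a small example message in memory (addresses and filename are made up).
message = EmailMessage()
message["From"] = "alice@example.com"
message["To"] = "bob@example.com, carol@example.com"
message["Date"] = "Tue, 14 Jan 2020 10:30:00 +0000"
message["Subject"] = "Draft survey for your comments"
message.set_content("Please review section 3 before Friday.")
message.add_attachment(b"%PDF-1.4 placeholder", maintype="application",
                       subtype="pdf", filename="site-survey-v2.pdf")

records = attachment_metadata(message)
```

Each record in `records` carries the sender, the recipients, the send date, and the covering note alongside the attachment's filename.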
The Microsoft Graph
Microsoft is trying to make Office 365 more than just a conglomerate of standalone applications. They are trying to integrate and link OneDrive, Outlook, Teams, SharePoint, and Exchange so as to provide shared experiences across these tools, which they prefer to call Office 365 workloads rather than separate applications. These efforts towards greater integration rest on two key developments in Office 365: an Office 365-wide API (called the Microsoft Graph API) and an enterprise social graph (called Microsoft Graph).
The Microsoft Graph API provides a common API for all workloads in Office 365. This allows developers (and Microsoft themselves) to build applications based on content and events that occur in one of the Office 365 workloads.
Microsoft Graph is an enterprise social graph fed by the "signals" of events that occur anywhere in Office 365 (documents uploaded to OneDrive or SharePoint; documents sent through Outlook or Teams; documents edited, commented on, liked, read, etc.). These signals are surfaced via the Microsoft Graph API.
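For readers who want to see what this looks like from a developer's side, the Microsoft Graph REST API has documented "insights" resources for a signed-in user (trending, used, and shared documents). The sketch below only builds the request URLs. The `insights_url` helper is my own naming, an actual call also needs an OAuth access token, and the paths should be checked against the current Microsoft Graph documentation before use.

```python
GRAPH_ROOT = "https://graph.microsoft.com/v1.0"

# Insight kinds documented for the Microsoft Graph insights API.
INSIGHT_KINDS = ("trending", "used", "shared")


def insights_url(kind: str) -> str:
    """Build the URL for one kind of insight for the signed-in user.

    Hypothetical helper: it only formats a string and performs no network call.
    """
    if kind not in INSIGHT_KINDS:
        raise ValueError(f"unknown insight kind: {kind!r}")
    return f"{GRAPH_ROOT}/me/insights/{kind}"


urls = [insights_url(kind) for kind in INSIGHT_KINDS]
```

The same root URL serves mail, files, chats, and people, which is what makes a single graph over all the workloads practical to build on.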
Microsoft Graph was set up to map the connections between individual employees, the documents with which they interact and the colleagues with whom they interact. For most of its existence, Microsoft Graph was a social graph rather than a knowledge graph.
The forthcoming Project Cortex (announced at the Microsoft Ignite conference in November 2019) takes some steps towards turning Microsoft Graph into a knowledge graph. It creates a new class of objects in the graph, known as "knowledge entities". Knowledge entities are the topics and entities that Cortex finds mentioned in the documents and messages that are uploaded or exchanged in Office 365. Cortex creates them in Microsoft Graph and links them to the documents in which they are mentioned and to the people who work with those documents.
Applications built on the Microsoft Graph
The three most important new services that Microsoft has built since it launched Office 365 are Delve, Microsoft Teams, and Project Cortex. All three act as windows onto the other Office 365 workloads. All three are built on the Microsoft Graph, and they provide pointers to how Microsoft wants Office 365 to develop and how it sees the future of document management.
MS Teams, Delve, Cortex and Microsoft Graph break down the barriers between the document management system (SharePoint), the file share (OneDrive for Business), the email system (Outlook and Exchange) and the chat system (Teams).
Teams is primarily a chat client. However, it is a chat client in which all documents sent via it are saved:
Delve uses Microsoft Graph to personalize, security-trim, filter, and rank search results obtained from the Office 365 search engine. Delve pushes these personalized results to individual users so that they can see the following on their individual Delve page:
Delve works within certain constraints. It does not search the content of email messages, only their attachments. It never recommends a document to someone who does not have access to that document.
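The second constraint, that relevance never overrides permissions, can be sketched as a filter applied before ranking. The `Document` class, the `recommend` function, and the scores below are hypothetical stand-ins; Delve's actual ranking is not public in the sources used for this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    title: str
    readers: frozenset  # people allowed to open the document


def recommend(user, documents, activity_score, limit=3):
    """Return titles the user may open, most relevant first.

    activity_score maps (user, title) to a number; here it stands in for
    whatever relevance the graph would supply.
    """
    # Security trimming first: relevance never overrides permissions.
    visible = [doc for doc in documents if user in doc.readers]
    ranked = sorted(visible,
                    key=lambda doc: activity_score.get((user, doc.title), 0),
                    reverse=True)
    return [doc.title for doc in ranked[:limit]]


documents = [
    Document("Budget 2020.xlsx", frozenset({"Alice", "Bob"})),
    Document("Redundancy plan.docx", frozenset({"Bob"})),
    Document("Team away day.pptx", frozenset({"Alice", "Bob", "Carol"})),
]
scores = {("Alice", "Team away day.pptx"): 2, ("Alice", "Budget 2020.xlsx"): 7,
          ("Alice", "Redundancy plan.docx"): 9}
alice_feed = recommend("Alice", documents, scores)
```

The redundancy plan has the highest score for Alice but never reaches her feed, because she is not among its readers. The corollary is that a recommendation engine like this is only as discreet as the access permissions it is given.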
In some cases, Delve has exposed problems in the information architecture. In an IRMS podcast discussion with Andrew Warland (which is being prepared for release), Andrew told me how an organization he had come into contact with had imported all its shared drives into SharePoint without changing access permissions in any way. Each team's shared drive was moved into its own document library. The problem arose when Delve started recommending documents. Sometimes Delve recommended documents from one part of the team to people in another part of the team, and sometimes the document creators were not pleased that the existence of these documents had been made known to other colleagues.
The team asked Andrew whether they could turn Delve off. His answer was that they could, but that turning Delve off (or removing the document library from Delve's scope) would not address the root of the problem. The underlying problem was that the entire team had access to the document library where the documents were stored. He suggested that the large document library be divided into smaller document libraries, so that access restrictions could be set that were better tailored to the work of different parts of the team.
Delve took small steps to unlock some of the knowledge locked into email systems that is normally only available to individual email account holders (and central compliance teams). Delve cannot search the contents of messages, but it can search email attachments and the metadata of who sent the attachment to whom.
Project Cortex will go one step further. It is a knowledge extraction tool. It tries to identify items of information within the documents uploaded to, and the messages sent through, Office 365. It looks for the "nouns" (think of the nodes in the graph) in those documents and messages. The kinds of things it looks for are the names of projects, organizations, issues, etc. It attempts to create topic cards and topic pages that contain key information about these entities. A link to the topic card appears whenever the project, organization, issue, etc. is mentioned in an Office 365 workload. Users will come across the link when they read the entity's name or type it in an email or document. The topic cards and pages also contain Cortex's recommendations as to which colleagues are experts on the topic and which documents are relevant to it. Like Delve, Cortex will use Microsoft Graph to generate these recommendations.
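To give a feel for the linking step only, here is a deliberately naive sketch that finds mentions of already-known entity names in documents and messages and indexes them by entity. It is not how Cortex works: Cortex is described as discovering the entities itself, whereas here the names are supplied up front, and the entity names and items are made up.

```python
import re
from collections import defaultdict

# Hypothetical entity names; Cortex discovers these itself, here they are given.
KNOWN_ENTITIES = ["Project Falcon", "Harbour Bridge Trust"]


def find_mentions(items):
    """Map each known entity to the titles of the items that mention it.

    items is a list of (title, text) pairs covering documents and messages.
    """
    mentions = defaultdict(list)
    for title, text in items:
        for entity in KNOWN_ENTITIES:
            # Whole-phrase, case-insensitive match.
            if re.search(rf"\b{re.escape(entity)}\b", text, flags=re.IGNORECASE):
                mentions[entity].append(title)
    return dict(mentions)


items = [
    ("Kick-off notes.docx", "Project Falcon starts in March with Harbour Bridge Trust."),
    ("Chat: Alice to Bob", "Any update on project falcon?"),
    ("Canteen menu.pdf", "Soup of the day."),
]
topic_index = find_mentions(items)
```

The resulting index is the raw material for a topic card: one entity, with the documents and conversations that mention it gathered from more than one workload.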
Project Cortex is closely tied to SharePoint. Its outputs manifest themselves in familiar SharePoint pages and libraries. Cortex takes advantage of the fact that SharePoint sites can be used as an intranet, and generates topic pages that work like SharePoint intranet pages. As with SharePoint intranet pages, you can add web parts to them, and they use document libraries to store and display documents. Project Cortex populates a topic page's document library with the documents it used to generate the information on the topic page. Colleagues who do not have access to those documents will not have access to the page.
The topic cards and pages can be edited (like wiki pages). Project Cortex links the topic pages for related topics into Knowledge Centers. These Knowledge Centers will complement (or compete with) the organization's intranet.
SharePoint and machine-added metadata
So far, the Knowledge Center and topic page aspects of Project Cortex have received the most publicity, and these are the aspects likely to make the most immediate impression on end users. However, I think and hope that the most useful aspects of Project Cortex will be two features that let you use machine learning to capture specific metadata fields for a specific group of content in specific SharePoint document libraries.
Project Cortex will provide a "Content Center" in which information professionals and/or subject matter experts can use machine teaching to build particular machine learning models. These models can be published to particular SharePoint document libraries. The model can then populate metadata fields for documents uploaded to the library.
From what Microsoft has said, the machine teaching capability it is providing appears to play to the strengths of information professionals, because it draws on their knowledge of the business logic behind which metadata is needed for which content. The disadvantage of the machine teaching model is that it won't scale corporate-wide. You will have to target the areas you want to develop machine learning models for, just as in the on-premise days you had to target the areas for which you would design tailored sites and libraries.
The developments that are driving change in document management
The following four developments are driving change in document management:
These four developments are interdependent. Machine learning is only as good as the data it is trained on. Within a standalone document management system there is simply not enough activity around documents for a machine learning or search tool to work out which documents are relevant to which people. Such a tool is much more powerful when it can draw on a graph of information that includes not just the content of the documents and their metadata, but also the activity around those documents in email and IM/chat systems.
In their on-premise days, Microsoft found it extremely difficult to build shared features between Exchange and SharePoint. Now that both applications are in the cloud, both are within Office 365, both share the same API, and both share the same enterprise social graph, it is much easier for Microsoft to build applications and features that work with both email and SharePoint.
The gaps that Project Cortex may not be able to fill
There are four main gaps in the Office 365 metadata/information architecture model:
These gaps provide the space within which records managers, information architects, and the supplier ecosystem in the records management and information architecture space can act.
Below are what I see as the medium to long term priorities for information professionals (and the information profession) to work on in relation to Office 365:
So here is my wish list from the supplier ecosystem around Office 365
Sources and further reading/watching/listening
At the time of writing, Project Cortex is in private preview. What information is available about it comes from presentations, podcasts, blog posts and webinars given by Microsoft.
On 14 January 2020 the monthly SharePoint Developer/Engineering update community call consisted of a 45-minute webinar from Naomi Moneypenny (Director of Content Services and Insights) on Project Cortex. A YouTube video of the call is available at https://www.youtube.com/watch?v=e0NAo6DjisU. The video includes discussion of:
The philosophy behind machine teaching is discussed in this fascinating podcast from Microsoft Research with Dr Patrice Simard (recorded May 2019) https://www.microsoft.com/en-us/research/blog/machine-teaching-with-dr-patrice-simard/
The following resources provide some background to graphs: