Managing Unstructured Data
Josh Bohls
Unstructured data does not fit neatly into databases or other structured systems, so managing it can be a challenge. Files such as photos, audio and video recordings, and PDF documents rely on labeling, metadata, and sophisticated search capabilities to make them useful and findable within the organization. This makes them a strong fit for machine learning (ML) and artificial intelligence (AI), which can handle complicated tasks like speech and text recognition.
Inkscreen secure content capture solutions help your organization capture, protect, and manage media and documents that employees create in the course of day-to-day business. Due to the unstructured nature of this data, our CAPTOR solutions are engineered to do much of the heavy lifting to enhance the content so that it is more valuable and useful. Here are some examples of how we help you manage unstructured data.
1) Speech-to-Text Transcripts. If enabled by policy and permitted by the user, CAPTOR creates transcripts of audio recordings. These are useful if you want to easily extract what was said in a recording, and they are also used to search for media. For example, if the system detected someone saying the word “spaghetti” in an audio recording, you would be able to enter “spaghetti” as a search keyword and find the recording.
2) Optical Character Recognition (OCR). CAPTOR uses OCR to identify text found in photos and documents. These identified text strings are then searchable. Say, for example, you are a healthcare worker responsible for taking a photo of each patient’s identification card. You would be able to search for the patient’s name or any other text string that appears on the ID and find the file.
3) Media Info Notes. Every media file (audio, video, or photo) can be annotated with notes that add context to the file. For photos, these notes can also be displayed visually in the caption along with the username, time/date stamp, and location (if enabled). For videos, they can be added to the “final frame” of the video recording. For example, a police officer may add notes to a photo to identify a suspect or the room in a house where the photo was taken. In all cases, these notes are searchable.
4) Location Stamps. If location tracking is enabled by policy and permitted by the end user, CAPTOR will record where media files are captured. This can be represented in City/State/Country format, or GPS latitude/longitude coordinates. The user would also be able to enhance the location by entering a more specific “custom location” value in the media info screen. Future versions of CAPTOR will include a map-based search tool which will organize media by the location in which it was captured.
5) File Nomenclature. CAPTOR enables the IT Admin to configure a customized file nomenclature system so that captured content follows a specific naming convention. For example, the file name may reference the user, or an organization or department (josh_2021-07-2912-23-08.jpg or marcom_2021-07-2912-23-08.m4a).
6) Text Label Annotations. Photos can be annotated within CAPTOR to provide context using a variety of annotation tools. Text labels added to photos are searchable.
7) Metadata. The most common and useful way to store labels and contextual information is by writing to the media's metadata. When content is shared outside of CAPTOR, the file metadata is updated to include much of the information from items 1-6 (as allowed by configuration and permitted by the user). If you attended the MobileIron Live event in 2016 you may have received a METADATA HEAD t-shirt from us!
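To make the "searchable" idea in items 1, 2, 3, and 6 concrete, here is a minimal sketch of a keyword index built over the text attached to each file. It is not CAPTOR's implementation: the MediaRecord class, its field names, and the sample file names are all hypothetical, and a production system would likely use a dedicated search engine rather than an in-memory dictionary.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class MediaRecord:
    """Hypothetical record for one captured file (not CAPTOR's actual schema)."""
    filename: str
    transcript: str = ""   # speech-to-text output (item 1)
    ocr_text: str = ""     # text recognized in photos and documents (item 2)
    notes: str = ""        # media info notes (item 3)
    labels: list[str] = field(default_factory=list)  # text label annotations (item 6)

    def searchable_text(self) -> str:
        # Combine every text source into one string for indexing.
        return " ".join([self.transcript, self.ocr_text, self.notes, *self.labels])


def tokenize(text: str) -> set[str]:
    # Lowercase and split on anything that is not a letter or digit.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(records):
    # Inverted index: each word maps to the set of file names containing it.
    index = defaultdict(set)
    for record in records:
        for word in tokenize(record.searchable_text()):
            index[word].add(record.filename)
    return index


def search(index, query):
    # Return the files that contain every word in the query.
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)


# Hypothetical sample data for illustration only.
records = [
    MediaRecord("lunch_order.m4a", transcript="Two plates of spaghetti, please"),
    MediaRecord("patient_id.jpg", ocr_text="Jane Doe DOB 01/02/1980",
                notes="front desk intake"),
]
index = build_index(records)
print(search(index, "spaghetti"))
print(search(index, "jane doe"))
```

With this sample data, searching for "spaghetti" finds the audio recording through its transcript, and searching for "jane doe" finds the ID photo through its OCR text, mirroring the examples in items 1 and 2.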
All of these features allow organizations using the CAPTOR secure content capture solution to capture, protect, and manage unstructured data in a balanced way that is streamlined for the app user and meets the data protection, compliance, and management policies of the organization.