The Grand Unified File Indexer (GUFI) is a state-of-the-art file and storage system indexing tool that gives both users and storage administrators access to the index, in such a way that each user can see metadata only for the files they have access to. Imagine an exabyte of data spread across many file system trees, with a trillion files in 10 billion directories, protected by POSIX permissions (UID/GID/rwxrwxrwx) with inheritance. A user wants to find all the PDF files that are larger than a gigabyte, are older than a month, have the string “green” in the path, have an extended attribute called “carcolor”, and contain the word “pretty” in the same paragraph as “car”. With GUFI, the user can find the matching files in seconds, and the results are limited to what that user is permitted to see.
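To make the metadata part of that query concrete, here is a minimal sketch in Python using an in-memory SQLite database. The table name `entries`, its columns, and the helper functions are hypothetical and chosen only for illustration; they are not GUFI's actual schema or tooling. The sketch covers only the size, age, path, and extended-attribute filters. It does not model the text-content condition or permission enforcement, and it matches the attribute name with a crude substring test.

```python
import sqlite3

GIB = 1024 ** 3                   # one gibibyte, used as the "gigabyte" threshold
MONTH_SECONDS = 30 * 24 * 3600    # "a month" approximated as 30 days


def find_matches(conn, now):
    """Return paths of PDFs that are large, old, under a 'green' path,
    and carry an extended attribute whose name contains 'carcolor'."""
    query = """
        SELECT path FROM entries
        WHERE lower(path) LIKE '%.pdf'
          AND size > ?
          AND mtime < ?
          AND path LIKE '%green%'
          AND xattr_names LIKE '%carcolor%'
        ORDER BY path
    """
    return [row[0] for row in conn.execute(query, (GIB, now - MONTH_SECONDS))]


def build_demo_index():
    """Create a tiny hypothetical index with one matching file and four near misses."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries (path TEXT, size INTEGER, mtime INTEGER, xattr_names TEXT)"
    )
    now = 1_700_000_000  # fixed timestamp so the demo is reproducible
    old = now - 2 * MONTH_SECONDS
    rows = [
        ("/proj/green/report.pdf", 2 * GIB, old, "user.carcolor"),         # matches
        ("/proj/green/small.pdf", 10, old, "user.carcolor"),               # too small
        ("/proj/blue/report.pdf", 2 * GIB, old, "user.carcolor"),          # no "green"
        ("/proj/green/new.pdf", 2 * GIB, now - 3600, "user.carcolor"),     # too recent
        ("/proj/green/noattr.pdf", 2 * GIB, old, ""),                      # no xattr
    ]
    conn.executemany("INSERT INTO entries VALUES (?, ?, ?, ?)", rows)
    return conn, now
```

Calling `find_matches(*build_demo_index())` on this toy data returns only `/proj/green/report.pdf`; each of the other four rows fails exactly one condition.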
Recent additions to GUFI include the ability to embed a vector table built from the text of files that contain text, so you can ask natural-language questions such as “I am interested in pretty green cars” and GUFI will return, in seconds, the most similar files by vector similarity, ordered by best match. GUFI offers so much functionality that it is worth looking at what it is and how it works, either by viewing a presentation (MSST-history/2023/LeePresentation.pdf) or by watching a video at https://dl.acm.org/doi/10.5555/3571885.3571960.
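As a rough illustration of "most vector-similar, ordered by best match", the sketch below ranks files by cosine similarity between a query vector and per-file vectors. The abstract does not say which embedding model or similarity measure GUFI uses, so cosine similarity, the function names, and the toy two-dimensional vectors are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, file_vectors, top_k=3):
    """Return up to top_k file paths, most similar to query_vec first.

    file_vectors is a hypothetical mapping of path -> embedding vector.
    """
    scored = [
        (cosine_similarity(query_vec, vec), path)
        for path, vec in file_vectors.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [path for _, path in scored[:top_k]]
```

In a real deployment the vectors would come from a text-embedding model applied to both the file contents and the question; here they are simply supplied by the caller.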
The presentation will concentrate on the many ways GUFI can help your organization manage its holdings and extract the value within them.
GUFI (Grand Unified File Indexer) – What does it have to offer you?
Gary Grider
Los Alamos National Laboratory
Abstract