Written on 07 December 2018
Structured Search of large data archives
There are many instances where large archives of documents need to be made accessible, for example:
- for compliance purposes in the financial sector
- for trawling through document discovery files for solicitors or barristers working on cases
- for rapid retrieval of construction details in Operations and Maintenance (O&M) files for facilities managers
Our experience comes from the last of these examples. Typically, a £5m construction development will generate over 1GB of data, ranging from As Built drawings to detailed product literature on how to clean the carpets in the executive board room. For large companies with multiple sites and large numbers of refurbishment projects, managing the retrieval of all that data centrally is a major headache.
To make the search fast and versatile, this data needs to be pre-indexed and organised into sections that allow detailed and specific responses. The index can be designed to cover either multiple projects for a single main contractor or multiple projects for a single client.
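As a rough illustration of what pre-indexing means, the sketch below builds a simple inverted index: a map from each word to the files that contain it. It is not the service's actual implementation. The file paths, the sample text and the function names `tokenize` and `build_index` are all assumptions made for the example, and it presumes the text has already been extracted from each file.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each word to the set of document paths that contain it."""
    index = defaultdict(set)
    for path, text in documents.items():
        for word in tokenize(text):
            index[word].add(path)
    return index

# Hypothetical archive: path -> text already extracted from the file.
documents = {
    "ProjectA/assets/asset_list.doc": "Fire detector 2251EM installed by ACME Fire Ltd",
    "ProjectA/literature/2251EM.pdf": "2251EM photo-electronic sensor product data sheet",
    "ProjectB/certs/fire_stopping.pdf": "Fire stopping certificate",
}

# The index is built once, ahead of time.
index = build_index(documents)

# A search is then a dictionary lookup rather than a scan of every file.
hits = sorted(index["2251em"])
print(hits)
```

Because the expensive work of reading every file happens once when the index is built, each later search only has to look up a word, which is why results can come back quickly even for a large archive.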
Search
Let’s assume that a client has just rung up to say that their fire detector, a ‘2251EM photo-electronic sensor’, is showing a fault. You need to look at the product literature, so you type in 2251EM and hit return. Instantly you get a couple of hits (it is instant because the files have been pre-indexed). One is the Word document from your asset list, which includes information on who installed the detector, and the other is a PDF of the product literature. Click on the latter and that document will load into your browser.
That solves the immediate problem, but maybe you would like to check on some other issues, so you go to Advanced Search.
Now you can narrow your search:
- The list in the first column is all the projects that have been included in this index. By default the search is for all projects, but you can narrow the search to one or two projects simply by clicking on them.
- The second list enables you to select which types of file you wish to include. For example, if it is product literature you are looking for, you only need PDF files.
- The third column enables you to select by folder type. When we set up the manuals, we make sure that all the certificates are contained within a sub-folder called ‘certs’, so if you are specifically looking for the certs for a particular project you can click on the other types of documents to exclude them.
Note that ‘red’ means exclude and ‘green’ means include in the search.
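The three Advanced Search columns can be thought of as filters applied to each indexed file. The sketch below is only an illustration under assumed names: the `Entry` class, the `narrow` function and the sample paths are hypothetical and are not taken from the service itself.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

@dataclass(frozen=True)
class Entry:
    project: str
    path: str  # path of the file inside the project's manual

    @property
    def extension(self):
        return PurePosixPath(self.path).suffix.lower().lstrip(".")

    @property
    def folder(self):
        return PurePosixPath(self.path).parent.name

def narrow(entries, projects=None, extensions=None, excluded_folders=frozenset()):
    """Apply the three filters; None means 'no restriction' (the default search)."""
    result = []
    for entry in entries:
        if projects is not None and entry.project not in projects:
            continue  # project not selected in the first column
        if extensions is not None and entry.extension not in extensions:
            continue  # file type not selected in the second column
        if entry.folder in excluded_folders:
            continue  # folder type marked 'red' in the third column
        result.append(entry)
    return result

# Hypothetical index entries.
entries = [
    Entry("ProjectA", "certs/electrical_cert.pdf"),
    Entry("ProjectA", "literature/2251EM.pdf"),
    Entry("ProjectA", "drawings/ground_floor.dwg"),
    Entry("ProjectB", "certs/gas_cert.pdf"),
]

# Certificates for one project: exclude the other folder types.
certs_a = narrow(entries, projects={"ProjectA"},
                 excluded_folders={"literature", "drawings"})
print([entry.path for entry in certs_a])
```

Treating "no selection" as "no restriction" mirrors the default behaviour described above, where the search covers all projects until you narrow it.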
Results
By default all the results will be sorted by project.
Please note: on the first pass only selected results are shown, namely the files that most closely match the search criteria. Clicking the …more link at the bottom of each project list returns more results for you to review. For example, if you search for ‘certs’ across all projects, the results might not show all those available for a particular project, so you will need to select a single project and run the search again.
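One way to picture this behaviour is as grouping the hits by project, ranking them, and showing only the first few of each group. The sketch below assumes each hit carries a relevance score; the scores, the `per_project` limit and the function name `group_by_project` are made up for illustration, and the real service may rank and truncate differently.

```python
from collections import defaultdict

def group_by_project(hits, per_project=2):
    """hits: (project, path, score) tuples.

    Returns {project: (paths_shown, number_hidden_behind_more_link)}.
    """
    grouped = defaultdict(list)
    for project, path, score in hits:
        grouped[project].append((score, path))

    page = {}
    for project in sorted(grouped):
        # Highest score first; only the best few are shown initially.
        ranked = sorted(grouped[project], key=lambda item: -item[0])
        shown = [path for _, path in ranked[:per_project]]
        hidden = max(0, len(ranked) - per_project)
        page[project] = (shown, hidden)
    return page

# Hypothetical hits for a search on 'certs'.
hits = [
    ("ProjectA", "certs/a.pdf", 0.9),
    ("ProjectA", "certs/c.pdf", 0.4),
    ("ProjectA", "certs/b.pdf", 0.7),
    ("ProjectB", "certs/d.pdf", 0.8),
]

page = group_by_project(hits, per_project=2)
for project, (shown, hidden) in page.items():
    print(project, shown, f"(+{hidden} more)" if hidden else "")
```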
If you are getting too many results, you can narrow the search criteria, for example ‘fire’ but NOT ‘stopping’. This service indexes every word in every digital document in all the manuals listed and uses full Boolean logic to enable complex searches. It is best to play with it to see what can be done.
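Against a word index, Boolean operators map naturally onto set operations. The sketch below shows the ‘fire’ but NOT ‘stopping’ example as a set difference. The tiny index and file names are invented for the example, and the exact query syntax accepted by the service is not shown here.

```python
# Hypothetical word index: word -> set of files containing that word.
index = {
    "fire": {"alarm_panel.pdf", "fire_stopping_cert.pdf", "2251EM.pdf"},
    "stopping": {"fire_stopping_cert.pdf"},
    "detector": {"2251EM.pdf", "alarm_panel.pdf"},
}

def docs(word):
    """Files containing the word (an empty set if the word is not indexed)."""
    return index.get(word.lower(), set())

fire_not_stopping = docs("fire") - docs("stopping")         # fire NOT stopping
fire_and_detector = docs("fire") & docs("detector")         # fire AND detector
stopping_or_detector = docs("stopping") | docs("detector")  # stopping OR detector

print(sorted(fire_not_stopping))
```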
Get Files
The files returned from the search are all downloadable by clicking on them.
- If they are PDF files (extension .pdf), they will load into your browser.
- If they are Word documents (extension .doc), they will download onto your PC and you then have to open them.
- If they are DWG As Built drawings, you will need to have a DWG viewer installed on your PC.
Note that some of these files are pretty large and might take some time to download depending on your internet connection.
- The main volumes of the O&M manual are built as Word documents and are designed to be used with the Navigation Pane to enable easy navigation. Unfortunately the Navigation Pane will not open automatically when a document is downloaded via the search results, so you should go to Menu > View and tick the Navigation Pane box.
- All the hyperlinks within the O&M manuals will continue to work, which means that if you are reading the main volumes and wish to follow a link, the linked file will be downloaded from the Search database. For security reasons these links are designed to remain active for only a short period of time. Some antivirus software may trigger a false positive when it sees these links.
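A common way to make a download link expire is to include an expiry time in the link and sign it, so that the server can reject links that are out of date or have been altered. The sketch below illustrates that general technique only; it is an assumption, not a description of how this service generates its links. The secret key, URL layout and function names `sign_link` and `is_valid` are all hypothetical.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical server-side secret

def sign_link(path, ttl_seconds=300, now=None):
    """Return a link that carries its own expiry time and a signature."""
    issued = now if now is not None else time.time()
    expires = int(issued) + ttl_seconds
    message = f"{path}|{expires}".encode()
    signature = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return f"{path}?expires={expires}&sig={signature}"

def is_valid(path, expires, signature, now=None):
    """Accept the link only if it has not expired and the signature matches."""
    current = now if now is not None else time.time()
    if current > expires:
        return False
    message = f"{path}|{expires}".encode()
    expected = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

# Usage with a fixed clock so the behaviour is easy to follow.
path = "/files/ProjectA/2251EM.pdf"
link = sign_link(path, ttl_seconds=300, now=1000)
sig = link.split("sig=")[1]

print(is_valid(path, 1300, sig, now=1200))  # still within the 300 seconds
print(is_valid(path, 1300, sig, now=1301))  # past the expiry time
```

Links of this shape carry long, random-looking query strings, which is one plausible reason a cautious scanner might flag them.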
This is a real-world example that can be applied to all sorts of circumstances. If you need to get a handle on all the data you are accumulating and make it readily available to specific groups of people within your team, this is a solution worth considering.