CASE STUDY

Surfing the big data wave

Efficient data compression and visualization software for multibeam echosounders

Water column data acquired by multibeam echosounders (MBES) places heavy demands on disk storage and data transfer, so it is typically logged only at specific times, a practice that risks missing interesting targets. Furthermore, the large data volumes from both bathymetry and water column data can place a heavy burden on operators during long surveys. MBES data is often compressed using standard solutions such as Zip or 7-Zip, but these can be computationally heavy for a relatively modest size reduction. To overcome this, we developed FAPEC, high-performance data compression software that now supports MBES data. We also present FARSHY, a fast visualization and analysis tool to streamline quick checks on the heavy water column files.

Space technology for marine echosounders

DAPCOM’s FAPEC data compression software was originally designed for satellites such as ESA’s Gaia, the billion-star surveyor, where onboard computing, storage and downlink capabilities are extremely limited. Later, FAPEC was enhanced with improved performance and additional algorithms to better adapt to a wide variety of file formats and data characteristics. In collaboration with Kongsberg Discovery and the Marine Geosciences Research Group of the University of Barcelona (UB), FAPEC was adapted to accommodate the .all, .wcd, .kmall and .kmwcd (KMall) formats from Kongsberg’s EM MBES systems, and more recently it has been integrated into Kongsberg’s Seafloor Information System (SIS) to provide automated file compression once the logging files have closed. FAPEC is being further extended to other formats and vendors.

FAPEC runs on Windows (including a graphical user interface, WinFAPEC), as well as on macOS and Linux, and it supports ARM processors. Its C, Python (fapyc package) and Java APIs allow for integration in third-party software. FAPEC rapidly examines the files to be compressed, determining the best algorithm and configuration for each of them. It supports tabulated text files (such as CSV or point clouds), multidimensional time series and multispectral images, to name a few. Rather than a universal data compressor, FAPEC is therefore an adaptive and versatile one, allowing a much more efficient use of resources.

Figure 1: Screenshot of WinFAPEC while compressing several files on a standard laptop.

On MBES datasets kindly provided by Kongsberg and Fugro (who have started using FAPEC on their vessels), FAPEC demonstrated superior performance: it achieved better compression than 7-Zip, while running 50 times faster and using 30 times less memory. Depending on the echosounder and scenario, FAPEC reduced the file sizes further than 7-Zip by up to 10% for water column data, and by up to 23% for combined bathymetry and water column data.

Beyond data compression

By default, FAPEC runs in lossless mode, meaning that the original files can be exactly recovered. However, for .wcd and KMall files, it also provides several lossy compression options, in which the data quality is slightly degraded to achieve better compression. For KMall bathymetry (soundings), it allows for an instrumentally lossless operation, which removes only the measurement noise. The seabed image samples can be quantized at a level indicated by the user, and can even be mostly removed if not needed. A similar approach is provided for older (.wcd) water column files.
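As a rough illustration of what sample quantization means, the following Python sketch rounds integer samples to the nearest multiple of a user-chosen step. It is not FAPEC's implementation: the function name `quantize`, the `step` parameter and the sample values are all hypothetical, chosen only to show how coarser values trade a bounded error for better compressibility.

```python
def quantize(samples, step):
    """Round each integer sample to the nearest multiple of `step`.

    Hypothetical illustration of uniform quantization; `step` plays the
    role of the quantization level indicated by the user.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    # Add half a step before the floor division so that values are
    # rounded to the nearest multiple instead of always rounded down.
    return [((s + step // 2) // step) * step for s in samples]


# Made-up sample values, for illustration only.
samples = [-64, -61, -37, -12, 0, 5, 18]
coarse = quantize(samples, 8)

# The error introduced by quantization never exceeds half a step.
max_error = max(abs(a - b) for a, b in zip(samples, coarse))

print(coarse)     # [-64, -64, -40, -8, 0, 8, 16]
print(max_error)  # 4
```

Fewer distinct values generally compress better, which is why a larger step gives smaller files at the cost of a larger maximum error.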

For KMall water column data, besides sample quantization, FAPEC also provides a smart lossy mode, which examines the sample values and removes those below a given percentile. This makes it possible to keep most of the features in the water column (including sub-bottom data) while vastly improving the compression ratio. In the specific example shown in the FARSHY screenshots, the combined bathymetry and water column KMall file is 933 MB, which is reduced to 410 MB in lossless mode, and just 154 MB with these lossy options. When adequately adjusted, water column files can become even smaller than bathymetry files while retaining most of the information.
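The idea of removing samples below a given percentile can be sketched in a few lines of Python. This is only a conceptual illustration under stated assumptions, not FAPEC's actual algorithm: the function names, the nearest-rank percentile method, the replacement value `floor_value` and the sample values are all hypothetical. The last lines simply compute the compression ratios implied by the file sizes quoted above.

```python
import math


def percentile_threshold(samples, percentile):
    """Return the sample value at the given percentile (nearest-rank method)."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    ordered = sorted(samples)
    # Nearest-rank definition: 1-based rank of the requested percentile.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


def suppress_weak_samples(samples, percentile, floor_value):
    """Replace every sample below the percentile threshold with `floor_value`."""
    threshold = percentile_threshold(samples, percentile)
    return [s if s >= threshold else floor_value for s in samples]


# Made-up amplitudes: mostly weak background with a few strong echoes.
ping = [-90, -85, -88, -30, -87, -25, -89, -86, -20, -84]
cleaned = suppress_weak_samples(ping, percentile=70, floor_value=-128)
print(cleaned)  # [-128, -128, -128, -30, -128, -25, -128, -128, -20, -84]

# Compression ratios implied by the file sizes quoted in the text (in MB).
original_mb, lossless_mb, lossy_mb = 933, 410, 154
lossless_ratio = original_mb / lossless_mb  # about 2.3
lossy_ratio = original_mb / lossy_mb        # about 6.1
```

Long runs of identical background values are highly compressible, while the strong echoes that make up the features of interest are left untouched.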

FAPEC achieves these results by knowing the data format and examining the values. It can provide basic data analytics on the fly, namely small CSV-like text files with a digest of the file contents. For example, it generates a water column features index which aims to indicate sudden changes in the scene such as those created by gas seeps, fish shoals or shipwrecks. In the same KMall example file, this digest is just 391 KB (105 KB compressed), and with a simple Python script we can create interesting plots similar to those shown here.
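A script of this kind might look like the sketch below, which reads a CSV-like digest and flags the pings where the features index changes abruptly. The layout of the real digest is not described here, so the column names (`ping`, `depth_m`, `features_index`), the values and the `min_jump` threshold are all hypothetical.

```python
import csv
import io

# Hypothetical digest excerpt: the real column names and layout may differ.
DIGEST = """ping,depth_m,features_index
1,120.4,0.11
2,120.6,0.12
3,120.5,0.10
4,120.9,0.47
5,121.0,0.52
6,120.8,0.13
"""


def find_sudden_changes(digest_text, min_jump):
    """Return the ping numbers whose features index differs from the
    previous ping by at least `min_jump`."""
    rows = list(csv.DictReader(io.StringIO(digest_text)))
    flagged = []
    # Compare each ping with the one immediately before it.
    for previous, current in zip(rows, rows[1:]):
        jump = abs(float(current["features_index"])
                   - float(previous["features_index"]))
        if jump >= min_jump:
            flagged.append(int(current["ping"]))
    return flagged


flagged_pings = find_sudden_changes(DIGEST, min_jump=0.3)
print(flagged_pings)  # [4, 6]
```

In this made-up excerpt, pings 4 and 6 mark where a feature enters and leaves the scene. The same rows could equally be passed to a plotting library to produce curves like those in Figure 2.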

Figure 2: Plots obtained with Python from on-the-fly basic data analytics provided by FAPEC, from soundings (left: beam width, depth and coordinates) and water column data (right: depth and features index).

Water column visualization made simple and fast

Scanning multiple water column files to look for potential features can be slow and cumbersome with the usual processing tools, although such tools remain essential when high accuracy and detailed analysis are required. DAPCOM’s FARSHY offers an appealing solution for rapid water column examination, as it allows fast loading and exceptionally fast browsing of numerous pings in a line file. Its simplicity is also attractive for students and newcomers. Like FAPEC, it inherits some aspects of space technology, namely the visualization of multiband satellite imagery.

Figure 3: Screenshot of FARSHY showing gas seeps in the water column (left) and the along-track view (top right). (Data courtesy: Fugro)

FARSHY is fully implemented in Java, meaning that it can be used on Windows, macOS and Linux. It currently supports Kongsberg’s KMall water column files (either raw or compressed with FAPEC), generating the usual fan-shaped view either for individual swaths or in a stacked mode. Additional features include an along-track view, sample value histograms and a spectrum-like visualization of all sample values for all swaths at given positions of the water column. GPS coordinates, bottom location and depth are also calculated based on the associated bathymetry. In collaboration with Kongsberg Discovery, we have implemented a feature that makes it possible to send target GPS coordinates directly to SIS, where they appear in the geographic display as user objects.
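The geometry behind a fan-shaped view can be illustrated with a short Python sketch that places a water column sample according to its beam angle and range. This is a simplified illustration, not FARSHY's code: it assumes straight acoustic rays (no refraction), no vessel roll, and beam angles measured from the vertical, and the function name is hypothetical.

```python
import math


def beam_sample_to_fan(angle_deg, range_m):
    """Convert a beam angle (degrees from vertical, positive to one side)
    and a slant range (metres) into (across_track_m, depth_m).

    Simplified geometry: straight rays, no refraction, no roll correction.
    """
    angle = math.radians(angle_deg)
    across_track = range_m * math.sin(angle)
    depth = range_m * math.cos(angle)
    return across_track, depth


# A sample straight below the transducer, and one on a 60-degree outer beam.
nadir = beam_sample_to_fan(0.0, 100.0)   # (0.0, 100.0)
outer = beam_sample_to_fan(60.0, 100.0)  # roughly (86.6, 50.0)
```

Samples at the same range therefore lie on an arc, which is what gives each swath its characteristic fan shape when all beams are drawn together.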

Click here for a video showing a quick demo of the FARSHY software.

Figure 4: Same water column file after lossy compression by FAPEC, reaching a compression ratio of six while showing an even clearer view of the gas seeps.

Another interesting feature of FARSHY is its mixed colours view, which can also be invoked from the command line, enabling batch processing of all KMall or KMwcd files within a directory. This functionality allows users to generate a single image per line, providing a quick overview of features present in the water column. Besides static images, FARSHY can also generate GIF movies from a selected range of swaths.

Click here for an example GIF movie from the water column file with the gas seeps.

Conclusion

We have presented new software tools to streamline the handling and analysis of large datasets from MBES systems, with a particular focus on water column files. FAPEC offers excellent compression ratios at outstanding speeds, making it suitable even for ARM-powered autonomous vehicles and remote vessels, where optimizing the communications channel is critical. Its unique combination of lossy compression (or noise removal) and basic data analytics capabilities paves the way for continuous and systematic MBES water column acquisition. FARSHY provides rapid, portable visualization and analysis capabilities for KMall water column files, including batch processing, SIS integration and fast swath browsing.

More information

dapcom.es

Remote Sens. 2022, 14(9), 2063

Intl. Jour. Remote Sens. 2017, 39(7), 2022–2042

Figure 5: Mixed colours view of FARSHY for the same water column, from the original lossless file (top) or the lossy-compressed file (bottom).