Big data in hydrography
People, processes and modern workflows
Hydrographers are swimming in a vast ocean of data. Whether from ENC databases, weather satellites, buoys monitoring ocean conditions or sonar pings from the latest bathymetric survey, data is constantly flowing. While this is a lot of data, does it constitute a ‘big data’ challenge? Big data usually refers to datasets that are too large and complex to be analysed using traditional methods such as SQL databases and queries. Whether or not data is considered ‘big’ is largely determined by its volume (how much is there?), its velocity (how fast is it coming in?), its variety (how many kinds of data and how is it structured?), its veracity (is the data trustworthy?) and, last but not least, its value (what can this data tell us and can these answers drive action?). These are referred to as the 5Vs of big data.
The 5Vs can be used to determine whether hydrographers are working on a big data problem. Regarding existing ENC databases and data collection required by the new S-100 standards, as well as the various buoy and survey data coming in, it is clear that the volume box is checked. Likewise, the velocity box. NOAA, a single entity, estimates that it collects 20 terabytes of data every single day. There is also no shortage of variety: data comes from structured sources such as databases as well as unstructured or semi-structured sources such as outputs from buoys, weather satellites and shipborne systems. Data is typically collected from government-owned or third-party verified sensors, so this checks the veracity box. Lastly, can analysing and understanding this data provide value? There is certainly value in being able to recognize weather or sedimentation patterns in terms of safety of navigation, but there is also financial value as most global commerce involves moving goods via waterways or open oceans. In fact, the World Bank estimates that 80% of the goods traded globally are shipped via the sea.
Non-traditional storage methods
As hydrographers are dealing with this big data challenge, they need to turn to non-traditional storage methods, leverage breakthroughs in AI and machine learning and work with software packages that make it easier to gain insights and derive value from the data. Besides the 5Vs, big data collection faces an added challenge in the maritime environment. Salt water wreaks havoc on electronics, storms can unmoor buoys and sensors often need to be placed in remote locations to collect the most meaningful data. To ensure data is being collected correctly, a routine maintenance strategy needs to be implemented that systematically reviews the sensors for optimal functionality. After all, even the best sensors require calibration and may record anomalies and invalid data outside the maintenance schedule. With these challenges in mind, how can hydrographic agencies take advantage of this ocean of data and massive network of sensors?
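The veracity problem described above often starts with simple automated screening. The following is a minimal quality-control sketch that separates plausible soundings from anomalies; the depth thresholds, field names and buoy identifiers are illustrative assumptions, not values from any IHO standard.

```python
# Minimal sensor QC sketch: flag depth soundings that fall outside a
# plausible envelope for the survey area. Thresholds are hypothetical.

def flag_soundings(readings, min_depth=0.0, max_depth=120.0):
    """Split raw soundings into accepted values and flagged anomalies."""
    accepted, flagged = [], []
    for r in readings:
        if min_depth <= r["depth_m"] <= max_depth:
            accepted.append(r)
        else:
            flagged.append(r)  # e.g. sensor drift or a dropout spike
    return accepted, flagged

raw = [
    {"buoy": "B-01", "depth_m": 34.2},
    {"buoy": "B-01", "depth_m": -3.0},   # negative depth: likely invalid
    {"buoy": "B-02", "depth_m": 999.9},  # outside the survey envelope
]
good, bad = flag_soundings(raw)
print(len(good), len(bad))  # → 1 2
```

Flagged records would then be routed to the maintenance workflow rather than silently discarded, preserving an audit trail of sensor health.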
To realize the most value from big data, hydrographic agencies must take the time to create a data governance strategy: a set of formal, documented procedures to control data quality and ensure that data is being collected, stored and organized correctly. Strategies can vary depending on many factors: size of organization, amount of data collected, number of sensors or storage (cloud or on-site), just to name a few. Organizations should also seek to answer questions such as: What data needs to be collected? Does the organization need to collect it or can it be retrieved/shared from another entity? What devices will be used to collect data? How long does this data need to be stored? What standards does the data need to accommodate? Answering questions such as these will help shape policy and make it easier to implement because people will better understand why the data is needed and which IHO standard it needs to meet.
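The governance questions above can be made concrete by recording each answer as a structured policy entry per dataset. This is a sketch only; the field values, dataset names and retention periods are hypothetical examples, not recommendations.

```python
from dataclasses import dataclass

# Sketch of recording data governance decisions per dataset, mirroring
# the questions in the text. All values shown are hypothetical.

@dataclass
class DatasetPolicy:
    name: str
    collected_in_house: bool   # or retrieved/shared from another entity
    sensor_type: str           # which devices collect it
    retention_years: int       # how long it must be stored
    target_standard: str       # which IHO product specification it feeds

policies = [
    DatasetPolicy("tide_gauges", True, "radar gauge", 10, "S-104"),
    DatasetPolicy("bathymetry", False, "MBES (shared survey)", 25, "S-102"),
]

# A simple governance check: every dataset must map to a standard.
unmapped = [p.name for p in policies if not p.target_standard]
print(unmapped)  # → []
```

Even a lightweight register like this makes the "why is this collected?" question answerable long after the original decision was made.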
Data governance strategy revisited
A data governance strategy is not a static list of protocols and products. It should be revisited each year to accommodate changes in technology or data needs. After all, what may have been impossible last year may be possible today. Hydrographic organizations have a bit of a head start, as the S-100 standards outline which datasets and attributes are needed for compliance. S-129 (under-keel clearance management) and S-104 (water level information for surface navigation) will need data from various sensors to ensure safety of navigation, which is inherently a big data problem that can be mitigated by creating a data governance strategy that controls the flow and volume of data.
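To see why products such as S-129 depend on live sensor data, consider the arithmetic at their core, reduced to its simplest form. Real under-keel clearance management also accounts for squat, heel, wave response and forecast uncertainty; the figures and the safety margin below are purely illustrative.

```python
# Simplified under-keel clearance (UKC) calculation of the kind an
# S-129-style service depends on. Numbers are illustrative only.

def under_keel_clearance(charted_depth_m, tide_height_m, draught_m):
    """Available water under the keel: depth plus tide, minus draught."""
    return charted_depth_m + tide_height_m - draught_m

ukc = under_keel_clearance(charted_depth_m=12.0,
                           tide_height_m=1.5,   # e.g. from a water level feed
                           draught_m=12.5)
safe = ukc >= 0.9  # hypothetical 0.9 m safety margin
print(round(ukc, 1), safe)  # → 1.0 True
```

Because the tide term changes continuously, the answer is only as good as the velocity and veracity of the incoming water level data, which is exactly where the governance strategy earns its keep.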
A major control component of the data governance strategy will be deciding where to store this incoming data. Historically, this data has been stored on servers behind the organization's firewall. This can still be a viable way to work, provided the IT department has the skills and time to maintain and spin up new servers as demand for access and storage grows. The hard truth is that many organizations do not have the skills or the time to continually manage and monitor big data servers. This is when cloud storage options become more attractive. AWS (Amazon Web Services) S3, Azure Blob Storage and Google Cloud Storage are some of the more common options. NoSQL databases such as MongoDB are also popular. This highlights the need for a data governance strategy, as these options are not one-size-fits-all and it will be a process to select what is best for the organization. Investing time and money into proofs of concept would be beneficial in the long run to determine which platform, storage options and database structure are best and fit-for-purpose.
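Part of what makes NoSQL document stores such as MongoDB attractive for sensor data is that each record is a self-describing document, so two reports need not share the same fields. The sketch below uses plain JSON to illustrate the idea without any database dependency; the field names and buoy identifiers are hypothetical.

```python
import json

# Two buoy reports with different sensor payloads. A document store
# accepts both without a fixed schema; field names are hypothetical.

report_a = {"buoy": "B-01", "ts": "2024-05-01T06:00Z",
            "wave_height_m": 1.2, "wind_kts": 14}
report_b = {"buoy": "B-02", "ts": "2024-05-01T06:00Z",
            "water_temp_c": 9.4}  # different sensors, different fields

# Serialize each report as it would be stored, then read one back.
docs = [json.dumps(r) for r in (report_a, report_b)]
restored = json.loads(docs[1])
print(restored["water_temp_c"])  # → 9.4
```

A relational schema would force both reports into one table of mostly empty columns; a proof of concept comparing the two approaches on real sensor feeds is exactly the kind of exercise the governance strategy should schedule.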
The best strategy and storage solutions are bound to stay on the shelf if there is not someone to adjust the strategy to meet changing business needs and someone to implement the daily details. Organizations are turning towards two very important roles: the data owner and the data steward. The data owner is responsible for the overall data governance strategy and defines the policies and procedures. They work at the executive level to secure buy-in and ensure that the strategy is meeting the business needs of the organization. The data steward is responsible for the implementation of said governance strategy and policies. These roles must work together to ensure successful data management. Large agencies may employ several of both roles. A full discussion of these roles and responsibilities is beyond the scope of this article, but organizations must be aware that they may need to develop or hire personnel to take on these important functions.
Data analysis
One of the tenets of big data is that it is coming in too fast and is too large to be analysed by traditional methods, so what options do hydrographic organizations have to begin analysing this data? First, there is the challenge of separating the valuable information from the noise. Organizations will generally need to employ a few solutions depending on data type and volume. Google BigQuery, Amazon Redshift and Snowflake are the big three for cloud-based computing and data warehouses. These support automation through API integration and are stable and secure platforms for big data analytics. They have also been purpose-built to work with their storage counterparts. They do, however, require knowledge of SQL or Python to use effectively. Solutions such as ArcGIS Velocity, available to those using the Esri stack, provide a GUI-based platform for those who do not currently have the SQL or Python skills and are a great place to start.
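The SQL skills these warehouses call for look much the same everywhere. The sketch below uses Python's built-in SQLite purely so the example is self-contained; the table and column names are hypothetical, but an aggregation of this shape is the bread and butter of BigQuery, Redshift and Snowflake alike.

```python
import sqlite3

# Warehouse-style aggregation sketched with SQLite so it runs anywhere.
# Table and column names are hypothetical.

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE soundings (survey TEXT, depth_m REAL)")
con.executemany("INSERT INTO soundings VALUES (?, ?)",
                [("A", 31.0), ("A", 33.0), ("B", 58.5)])

# Collapse many pings into one summary row per survey.
rows = con.execute(
    "SELECT survey, COUNT(*), AVG(depth_m) FROM soundings "
    "GROUP BY survey ORDER BY survey").fetchall()
print(rows)  # → [('A', 2, 32.0), ('B', 1, 58.5)]
```

At warehouse scale the same `GROUP BY` runs across billions of rows; the platform handles the parallelism, which is precisely what makes these services worth their cost.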
ArcGIS Velocity and similar solutions also allow organizations to incorporate their spatial data holdings to create spatial filters for the influx of data. This limits the amount of noise by essentially clipping it off for a specified area, such as an exclusive economic zone (EEZ). Apache Hadoop and Spark are frequently mentioned when looking to choose a framework for processing. Hadoop is used in parallel processing applications where massive amounts of data would otherwise max out resources on a single machine. Spark is an in-memory processing engine that can be leveraged as a stand-alone instance. As an alternative, software suites such as Esri's ArcGIS apply comparable processing when running machine learning or computer vision tasks through ArcGIS Pro. The benefit here is that organizations can use what they have already purchased. Real-time data analytics are necessary to separate valuable information from noise, and organizations should consider adding data scientists to their staff along with the data owners and stewards. Real-time and big data analysis is going to be essential if organizations are going to comply with S-100 standards such as S-129 and S-104.
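The spatial filtering idea reduces to a point-in-polygon test. The sketch below uses a plain ray-casting check on a made-up square boundary; a production system would use a geodetic library and a real EEZ geometry, so treat this as the concept only.

```python
# Spatial filter sketch: keep only readings inside a polygon such as an
# EEZ boundary. Plain ray-casting on a toy square; coordinates are
# made up and no geodetic corrections are applied.

def inside(point, polygon):
    """Ray-casting point-in-polygon test for (x, y) tuples."""
    x, y = point
    hits = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray
            if x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
                hits = not hits
    return hits

eez = [(0, 0), (10, 0), (10, 10), (0, 10)]  # toy boundary polygon
pings = [(5, 5), (12, 3), (9.5, 9.5)]
kept = [p for p in pings if inside(p, eez)]
print(kept)  # → [(5, 5), (9.5, 9.5)]
```

Applied at the ingest stage, a filter like this discards out-of-area noise before it ever reaches storage, which is exactly the clipping behaviour described above.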
Actionable intelligence
Even the best big data analysis means little if its results are not translated into actionable intelligence for executives or public stakeholders. Dashboards offer one of the most effective starting points. They turn complex datasets into clear, intuitive insights by employing easy-to-read charts, graphs and tables; Power BI and Tableau are well-known examples. Organizations leveraging the Esri stack can deploy ArcGIS Dashboards, which add a spatial component that helps users not only understand what is happening, but also where, providing vital context for operational decisions.
Dashboards have two qualities that make them particularly appealing. First, they are no-code to low-code solutions designed to get information to stakeholders as quickly as possible. Hydrographers simply bring their data and plug it into the dashboard environment. Second, dashboards are interactive. Users can filter results with mouse clicks, drags or map panning and many dashboards include built-in tools to segment data into manageable, visual subsets. This means hydrographers do not need to be data scientists to benefit from big data workflows.
Pushing workflows to their limits
Survey responses from the annual Hydro International Industry Survey reinforce the urgency behind these tools. Big data is pushing hydrographic workflows to their limits. With information flowing from more sources at higher frequencies, organizations are facing a data burden that is growing faster than their ability to manage it. High-resolution sensors and multi-platform acquisition continue to expand volumes far beyond what traditional workflows can absorb.
Cloud solutions provide part of the relief. For organizations without the resources to maintain complex in-house server infrastructure, cloud storage offers flexibility and scalability. Providers such as Amazon and Google supply robust analytics environments where teams can process large datasets using SQL, Python and automated pipelines. Organizations working within the Esri ecosystem or comparable platforms can turn to tools such as ArcGIS Velocity for powerful no-code processing options. Dashboards remain an effective way to disseminate information to decision makers and stakeholders, enriched with spatial context when needed.
For organizations that lack such tools or established workflows, the annual Hydro International Industry Survey signals a growing need for stronger data governance practices. Rising data volumes are colliding with fragmented systems, inconsistent workflows and unclear retention policies. Developing policies and procedures that align with organizational needs helps determine which storage, analysis and dissemination options are appropriate and sustainable. Big data is a growing challenge, but one that can be addressed through planning, deliberate choices and incremental organizational change.
Yet the survey also makes clear that cloud adoption alone will not resolve the issue. Respondents describe fragmented technical environments – combinations of legacy systems, proprietary software and manual handovers – that continue to slow processing even after infrastructure improvements. Growing data volumes strain storage and archiving capacity, while client expectations for faster delivery increase the pressure.
Conclusion
In hydrography, big data is defined not just by sheer size, but by the full 5V profile: vast volumes, rising velocities, diverse formats, trusted veracity and clear operational value. Maritime conditions, distributed sensors and S-100 requirements amplify the challenge, making governance every bit as important as technology. Agencies must pair strong data management roles with cloud storage, scalable analytics and tools that separate signal from noise. Whether through SQL-based platforms, Hadoop/Spark frameworks or no-code options such as ArcGIS Velocity, the objective remains the same: turning an overwhelming flow of information into dependable, actionable insight. Ultimately, big data in hydrography is a structural challenge that demands coordinated investment in people, processes and modern workflows. The organizations that address all three will be ready for the data demands of the coming decade.
