3 Points on Big Data

Big data is a big deal. And understandably so – it is changing virtually every aspect of our lives. And agriculture is no exception.

We get this. We do it. Every day.

GEOSYS makes close to 1 petabyte (PB) of satellite imagery data available live, in real time. That’s the equivalent of 1,000,000 pickup trucks filled with paper that you can search through in milliseconds. This is what happens when you’ve been in business for nearly 30 years and specialize in geographic data processing for agriculture. You accumulate a lot of bytes.

And it’s growing daily: each day we process around 300 GB of new data, and we anticipate that will double in the coming months.
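To put those ingest rates in perspective, here is a quick back-of-the-envelope sketch. It assumes decimal units (1 PB = 1,000,000 GB) and a constant daily rate; the function name is ours, purely for illustration.

```python
GB_PER_PB = 1_000_000  # assumption: decimal units, 1 PB = 1,000,000 GB


def days_to_add(petabytes: float, gb_per_day: float) -> float:
    """Days needed to ingest the given volume at a constant daily rate."""
    return petabytes * GB_PER_PB / gb_per_day


current_rate = 300  # GB per day, the figure quoted above
doubled_rate = 600  # GB per day, if the rate doubles as anticipated

for rate in (current_rate, doubled_rate):
    days = days_to_add(1, rate)
    print(f"{rate} GB/day: {days:,.0f} days (~{days / 365:.1f} years) per additional PB")
```

At a steady 300 GB per day, another petabyte takes roughly nine years to accumulate; doubling the rate cuts that in half.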

While a lot of companies are storing huge amounts of data, GEOSYS is unique in that we’ve focused on storing clean, organized data. This is how we’re able to make it accessible in milliseconds with the simple click of a button.

Whether your business is getting into big data* or you’re just an active user of big data, here are three things you should keep in mind:

1. Big data requires big storage
You can have access to all the data in the world, but you still need adequate storage for it. As TechTarget noted in a recent article, “the challenge is to store this data, which is notably different in both type and quantity from traditional storage data. The good news is storage is getting easier with cloud based systems.” Yet you still need very high IOPS (input/output operations per second) and you need to know the quality of the hardware, so selecting vendors is a thorough and thoughtful process.

2. You need serious processing power
You can’t take in 300 GB of new data daily and just dump it into storage. It has to be processed so it can be easily accessed. GEOSYS uses a proprietary method to automatically intercalibrate data coming from different sources and to create indexes for satellite image pixels. This allows customers to use real physical measurements in their own processes for comparisons, benchmarking or modeling in real time.

Our combination of state-of-the-art open-source and proprietary database and data processing engines is customized to meet the needs of the different ag industries, from field- to elevator-level analysis, as well as USDA crop districts, for evaluating local supply and its potential impact on basis price to help with risk management decisions.

One example of the technology we use is MongoDB, which stores field observations and sensor records. We also use Spark for scalable processing of weather data and satellite imagery data.

3. Data must be fast and actionable
Big data is no good if it takes hours (or worse, days) to access. We have 15 years of historical satellite data and 30 years of weather data. We use the historical data as a reference point for current data, and because our customers span the globe, people are accessing that data 24/7. Any existing or future user can access data from our cloud-based system, from continent level down to field level, within seconds.
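As a simple illustration of using historical data as a reference point for current data, the sketch below compares one current observation with the average of past seasons. It is not the GEOSYS method; the function name and the vegetation index values are hypothetical, made up only for this example.

```python
from statistics import mean, stdev


def anomaly(current: float, history: list[float]) -> dict[str, float]:
    """Compare a current value with a historical baseline.

    `history` needs at least two values so a standard deviation exists.
    """
    baseline = mean(history)
    spread = stdev(history)
    difference = current - baseline
    return {
        "baseline": baseline,
        "difference": difference,
        # Number of standard deviations away from the historical mean.
        "z_score": difference / spread if spread else 0.0,
    }


# Hypothetical vegetation index for one field on the same date in five past seasons.
history = [0.60, 0.64, 0.58, 0.62, 0.66]
result = anomaly(0.50, history)
print(result)
```

A strongly negative z-score flags a field that is well below its own historical norm for that date, which is the kind of comparison a long archive makes possible.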

The entire breadth of the GEOSYS historical data is accessible on the fly, meaning maps of stats for a field or region can be processed in a matter of milliseconds. The speed at which agronomists and agribusiness stakeholders can access and analyze the full history of imagery data is a true differentiator, and it is transformative in the ag-tech space.

The Internet of Things is adding another layer of data points of varying quality, which only increases the need for clean data (to avoid the same fate as our old yield map binders).

There is a lot of speculation about what the future of big data holds for agriculture, but managing that data well is going to be essential to avoid big headaches for users and keepers of the data alike.

*If you need to learn more about tools to help your company with big data, check out this story from TechRepublic.