JeDeveloper/qscout

The QScout suite is a collection of interacting QGIS Processing plugins for georeferencing and analyzing field scouting data. The plugins are run within QGIS on imported scouting data. QGIS is a Geographic Information Systems program similar to ArcGIS.

  • The Pin Dropper plugin takes data from a spreadsheet and places it as points on a crop field.
  • The Pin Locator plugin takes points on a crop field and assigns them row and column coordinates within the field, so you can understand where they are in relation to the map.
  • The Value Grabber plugin takes point data on a map and attaches pixel values from a raster file to each point.
  • The Grid Aggregator plugin takes point data on a map and groups it into grid cells so you can do math on it more easily.

In this documentation, parameter names are in italics, and code is in monospaced typewriter font.

The program can be cloned from this repository or downloaded from the QGIS plugin manager (eventually). Check out this tutorial for instructions on how to use the plugin manager.

Drop Pins / Locate Pins in Field

Abstract

Drop Pins (Processing: qscout:droppins) is a plugin for georeferencing field data with a particular focus on vineyards. The plugin can also be used to drop points on a field if no data is available.

Locate Pins in Field (Processing: qscout:locatepinsinfield) effectively does the opposite of Drop Pins. Given a vector layer of points, the plugin will produce a copy of the layer with row and plant numbers added.

These two algorithms largely use the same parameters so are grouped together.

Usage Guide

The minimum required to run the plugin is a Bounding Polygon, a Row Vector, and values for Row Spacing and Point Interval. The Bounding Polygon is the boundary of the area within which the program will drop points. It does not have to be a rectangle but can be any polygon. The Row Vector is a line drawn along a row. The program uses the Row Vector to understand the layout of the area and will assume all rows are parallel to it. The length of the row vector does not matter, only the direction. If the row vector has more than two points, the plugin will ignore all but the first and last point.

In order to assign data to dropped points, Drop Pins requires an Input Data file. Currently, the only format supported is .csv. Excel, Google Docs, OpenOffice, and any other spreadsheet software will allow you to save files in the .csv format. The order of the columns in the file does not matter - the program will automatically search for columns with headers with names like 'Row' and 'Column' and use those to georeference the data. All other columns will be included as fields in the Output Layer unless you specify which fields to use with Fields to Use parameter. If your data describes the locations of plants in relation to the panel number in the row, use the Panel Size parameter to tell the plugin how many plants are in a panel. A data file is not required for Locate Pins in Field.
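To make the expected file shape concrete, here is a minimal Input Data file written with Python's csv module. The 'Row' and 'Column' headers are the ones the plugin searches for; the 'Vigor' column is a made-up example of a data column that would become a field in the Output Layer.

```python
import csv

# A minimal example Input Data file. The plugin searches the header row for
# columns named like 'Row' and 'Column' to georeference each record; every
# other column ('Vigor' here is a hypothetical example) becomes a field in
# the Output Layer.
rows = [
    {"Row": 1, "Column": 1, "Vigor": 3.5},
    {"Row": 1, "Column": 2, "Vigor": 2.0},
    {"Row": 2, "Column": 1, "Vigor": 4.1},
]

with open("scout_data.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["Row", "Column", "Vigor"])
    writer.writeheader()
    writer.writerows(rows)
```

Any spreadsheet program that exports .csv will produce an equivalent file; the column order does not matter.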

Locate Pins in Field requires Points to Index, a vector layer of points to which the plugin will assign row and plant numbers.

The Start Corner parameter helps the program understand how row and plant numbers translate to points on a map. The corners of the field are determined from the Row Vector, which is assumed to point left to right. On a clock face, if the first point of the row vector is at the center of the clock, the last point of the row vector is at 3:00 (right), and top, bottom, and left are at 12:00, 6:00, and 9:00 respectively.
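The clock-face convention can be sketched in a few lines of plain Python (an illustration of the convention only, not the plugin's code): with the row vector pointing toward 3:00, "top" is the direction 90 degrees counterclockwise from it.

```python
import math

def field_directions(row_vector):
    """Given a row vector (dx, dy), return the unit directions that the
    Start Corner labels refer to. 'Right' is the direction the row vector
    points (3:00); 'Top' is 90 degrees counterclockwise from it (12:00)."""
    dx, dy = row_vector
    length = math.hypot(dx, dy)
    right = (dx / length, dy / length)
    top = (-right[1], right[0])          # rotate 90 degrees counterclockwise
    return {
        "Right": right,
        "Top": top,
        "Left": (-right[0], -right[1]),
        "Bottom": (-top[0], -top[1]),
    }

# For a row vector pointing due east, 'Top' comes out due north.
dirs = field_directions((10.0, 0.0))
```

Note that only the direction of the row vector matters here; its length cancels out.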

The Raster Layer, Match Threshold and Rate Offset Match Function allow the program to drop points in a 'smarter' way. If Rate Offset Match Function is set to a value other than Regular, the program will attempt to find plants using the provided Raster Layer.

Parameters Reference

Basic Parameters

  • Targeting Raster (Processing: TARGETING_RASTER_INPUT): The input raster for the program. Not required if Rate Offset Match Function is set to 'Regular'. IMPORTANT: the input raster must have the same CRS as the Bounding Polygon.
  • Bounding Polygon (Processing: BOUND_POLYGON_INPUT): A layer containing a polygon that the program will drop pins within.
  • Row Vector (Processing: ROW_VECTOR_INPUT): A direction vector, which the program takes in the form of a line, representing a row in the field. The first point in the line is the start point for the field, so this is also implicitly a position vector. Don't overthink this - just find a place where the raster is a clear pattern and draw a line along a row. If the CRS is different from BOUND_POLYGON_INPUT it will be automatically converted.
  • Input Data (Processing: DATA_SOURCE_INPUT): A csv file containing the data to georeference. If no file is provided, the program will drop a pin on everything it thinks is a plant. If a file is provided, the program will only drop pins on features described in the file.
    Only used by Drop Pins.
  • Drop Data-Less Points (Processing: DROP_DATALESS_POINTS_INPUT): Whether the program will drop points on plants that don't have any information provided in Input Data. If Input Data is not provided, this will be treated as True.
    Only used by Drop Pins.
  • Row Spacing (Processing: ROW_SPACING_INPUT): The distance between two rows, in the units of the CRS used by Bounding Polygon.
  • Point Interval (Processing: POINT_INTERVAL_INPUT): The interval between points on a row. Functions similarly to Row Spacing, but measured along the row rather than between rows.
  • Match Threshold (Processing: OVERLAY_MATCH_THRESHOLD_INPUT): A value from 0.000 to 1.000. The threshold at which to declare an overlay box a match and drop a pin. How this number is applied depends on which Rating Function has been selected. The default value is completely arbitrary and has absolutely no mathematical or scientific significance.
  • Start Corner (Processing: START_CORNER_INPUT): The corner of the field where the numbering starts. You would find row 1, plant 1 in this corner. For a better understanding of what "Top", "Bottom", "Left", and "Right" mean in this context, see the Usage Guide.
  • Points to Index (Processing: POINTS_INPUT): In Locate Pins in Field, the pins to assign row and plant number values to.
    Only used by Locate Pins in Field.
  • Dropped Pins (Processing: DROPPED_PINS_OUTPUT): The layer or file where the program will output the dropped points. Leave blank to generate a new layer.
    Only for Drop Pins.
  • Indexed Points (Processing: INDEXED_POINTS_OUTPUT): The layer or file where the program will output the points with field coordinates (row, plant). Only for Locate Pins in Field.
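Putting the basic parameters together, a Drop Pins run from the QGIS Python console might be set up as below. This is a sketch only: the layer references, numeric values, and the Start Corner enum index are placeholders, and the processing.run call is shown commented out so the snippet stands alone outside QGIS.

```python
# Parameter dictionary for qscout:droppins, using the Processing names
# documented above. All values here are illustrative placeholders.
params = {
    "BOUND_POLYGON_INPUT": "field_boundary",   # polygon layer
    "ROW_VECTOR_INPUT": "row_vector",          # line drawn along one row
    "ROW_SPACING_INPUT": 3.0,                  # CRS units of the bounding polygon
    "POINT_INTERVAL_INPUT": 1.5,
    "START_CORNER_INPUT": 0,                   # enum index for the start corner
    "DATA_SOURCE_INPUT": "scout_data.csv",     # optional; Drop Pins only
    "DROPPED_PINS_OUTPUT": "TEMPORARY_OUTPUT",
}

# Inside QGIS you would then run:
# import processing
# result = processing.run("qscout:droppins", params)
```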

Advanced Parameters

  • Rate Offset Match Function (Processing: RATE_OFFSET_MATCH_FUNCTION_INPUT): The function used to identify points as plants. See the Advanced Use Guide for more information.
  • Compare from Root (Processing: COMPARE_FROM_ROOT_INPUT): If set to True, the Rate Offset Match Function will use the root point (the one at the beginning of the Row Vector) for comparisons, rather than a neighboring point.
  • Fields to Use (Processing: DATA_SOURCE_FIELDS_TO_USE): A comma-separated list of the columns in the csv provided in Input Data to express as fields in the features in the Output Layer. If left blank, all columns will be converted to Output Layer fields.
    Only for Drop Pins.
  • Panel Size (Processing: PANEL_SIZE_INPUT): The size of the panels in the field. Used for analysis of Input Data. For more information, see the Advanced Use Guide.
    Only for Drop Pins.
  • Overlay Box Radius (Processing: OVERLAY_BOX_RADIUS_INPUT): the radius of the box that the program will use for its comparisons, in field units (i.e. the height and interval values specified in the above section). Defaults to 2, which means 2 units AROUND the spot where the program is considering dropping a pin.
  • Maximum Patch Size (Processing: PATCH_SIZE_INPUT): The largest hole in the detected pattern of plants that the program will fill in when patching holes. Set to 0 to disable hole patching.
  • Row Spacing Stdev (Processing: ROW_SPACING_STDEV_INPUT): The standard deviation of the row spacing values. The program will assume a gaussian distribution and look within three standard deviations.
  • Point Interval Stdev (Processing: POINT_INTERVAL_STDEV_INPUT): The standard deviation of the interval between points on a row. The program will assume a gaussian distribution and look within three standard deviations.
  • Search Iteration Size (Processing: SEARCH_ITERATION_SIZE_INPUT): The size of the sides of the search box used when searching an area to drop a pin, in points. The number of points checked per iteration will be this value squared, so increasing this value increases search time quadratically, in proportion to Number of Search Iterations. The side lengths in CRS units are determined by the Row Spacing Stdev and Point Interval Stdev. Has no significance when using the Rate Offset Match Function 'Regular'.
  • Number of Search Iterations (Processing: SEARCH_NUM_ITERATIONS_INPUT): The number of times the program will zoom in on an area and search in greater detail when attempting to drop a pin. Increasing this value may increase precision but will also increase execution time linearly, with the cost of each iteration proportional to the square of Search Iteration Size. Has no significance when using the Rate Offset Match Function 'Regular'.
  • Precision Bias Coefficient (Processing: PRECISION_BIAS_COEFFICIENT_INPUT): If nonzero, the result of Rate Offset Match Function will be divided by this value times the square of the deviation from the expected location when dropping pins. A higher value will cause the program to favor dropping pins near where it expects them to be. For more information, see the Advanced Use Guide. Has no significance when using the Rate Offset Match Function 'Regular'.
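The interaction between Search Iteration Size and Number of Search Iterations can be illustrated with a generic coarse-to-fine grid search. This is a toy reconstruction of the idea described above, not the plugin's code: each iteration evaluates a size-by-size grid of candidate offsets, then shrinks the search window around the best one.

```python
def coarse_to_fine_search(rate, half_width, size, iterations):
    """Toy coarse-to-fine search. 'rate' scores an (x, y) offset; 'half_width'
    is the initial search half-width (e.g. three standard deviations); each
    iteration checks size*size candidates, then zooms in around the best
    candidate found so far."""
    best = (0.0, 0.0)
    for _ in range(iterations):
        step = 2 * half_width / (size - 1)
        candidates = [
            (best[0] - half_width + i * step, best[1] - half_width + j * step)
            for i in range(size)
            for j in range(size)
        ]
        best = max(candidates, key=rate)
        half_width /= size  # zoom in for the next, finer pass
    return best

# Example with a made-up rating function whose score peaks at (0.3, -0.2):
found = coarse_to_fine_search(
    rate=lambda p: -((p[0] - 0.3) ** 2 + (p[1] + 0.2) ** 2),
    half_width=1.0, size=5, iterations=4,
)
```

The size*size cost per pass, repeated once per iteration, is where the quadratic and linear growth described above comes from.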

Advanced Use Guide

More advanced guide coming soon

Rating Functions

All of these rating functions were developed by me; I'd be very interested to see what an actual expert would come up with.

  • Regular: The default rating function. It ignores any provided Raster Layer and drops points at regular intervals. It is by far the fastest rating function because it doesn't actually do any rating.
  • Local Normalized Difference: Each pixel value for each raster band is "normalized" by dividing it by the range of values for that band within the sample. The average difference between normalized values in the two samples gives the match value.
  • Global Normalized Difference: Each pixel value for each raster band is "normalized" by dividing it by the range of values for that band within the entire raster. The average difference between normalized values in the two samples gives the match value.
  • Absolute Difference: The average difference between the two samples is divided by 255.
  • Relative Match Count: Counts the number of pixels where the normalized values in the two samples are within 0.1 of each other. The count of relative matches is divided by the total number of pixels in the sample.
  • Gradients: The program calculates how much the pixel values are changing at each pixel, for each band. The average difference between the two gradient matrices is divided by 255. This is by far the most computationally intensive of the rating functions.
  • Random: Drops points randomly within the Row Spacing Stdev and Point Interval Stdev. I wrote this function during development to test whether the other functions perform better than random chance, and have included it here on the off chance that someone finds an application for it.
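As a concrete illustration of the Local Normalized Difference idea, here is a single-band version in plain Python, reconstructed from the description above rather than taken from the plugin. Note that a smaller result means a closer match; whether the plugin inverts this into a 0-to-1 match score is not shown here.

```python
def local_normalized_difference(sample_a, sample_b):
    """Toy single-band Local Normalized Difference: divide each sample's
    pixel values by that sample's own value range, then average the
    absolute differences between the two normalized samples."""
    def normalize(sample):
        rng = (max(sample) - min(sample)) or 1  # guard a flat sample
        return [v / rng for v in sample]

    a, b = normalize(sample_a), normalize(sample_b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)
```

One consequence of per-sample normalization is that two samples differing only by a constant scale factor compare as identical.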

FAQ

Troubleshooting

Problem: "My (1,1) point isn't in the correct corner."
Solution: You may have drawn the row vector in the opposite direction from what you meant. The row vector points left to right towards what would be 3:00 on a clock face. To fix this problem, you can either redraw the row vector or imagine that the entire map is rotated by 180 degrees.

Problem: "Error message reading 'ValueError: some errors were detected!' followed by a series of messages reading something akin to 'Line #x (got y columns instead of z)'."
Solution: The first step is to check that your lines all have the same number of values. The .csv file can be opened in a text editor. Every row should have the same number of values, separated by commas. If that doesn't work, check whether your column headers contain any special characters, such as "$", "#", or "%", and remove those characters if you find them. If that doesn't solve the problem, email me.
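The column-count check can be automated with a few lines of Python (a generic sketch, not part of the plugin):

```python
import csv
from collections import Counter

def check_column_counts(path):
    """Count how many columns each line of a csv has. If more than one
    count appears, some lines are malformed."""
    with open(path, newline="") as f:
        return Counter(len(row) for row in csv.reader(f))

# Demonstration with a deliberately malformed file: the last line is
# missing a value, so two different column counts appear.
with open("demo.csv", "w", newline="") as f:
    f.write("Row,Column,Vigor\n1,1,3.5\n1,2\n")

counts = check_column_counts("demo.csv")
```

A healthy file yields a single count, e.g. Counter({3: 120}) for 120 lines of 3 columns each.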

Problem: "Empty data values from some columns in my input csv file are set to -1 as attributes."
Solution: In your spreadsheet software, set any value in the column causing the problem to a decimal, e.g. 10 becomes 10.0. You don't need to do this for every value in the column, just one. This causes the column to be read as floating-point rather than integer, so empty values are no longer coerced to -1.

Let me know if you have any more questions, which will then become 'frequently asked'.

Grid Aggregator

Abstract

This program takes a point layer and creates a polygon grid layer, then sets the values of the cells of the grid based on which points fall within their borders. The grid will be oriented along the axes of the CRS of the points layer (for most projected CRSs, these are the east/west and north/south axes).
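The core bucketing step can be sketched in plain Python (an illustration of the idea, not the plugin's code): each point's cell is found by floor-dividing its coordinates by the cell size, and the values landing in each cell are then aggregated — here with a simple mean.

```python
import math
from collections import defaultdict

def grid_aggregate(points, cell_w, cell_h):
    """Bucket (x, y, value) points into grid cells and return the mean
    value per cell, keyed by (column, row) cell indices."""
    cells = defaultdict(list)
    for x, y, value in points:
        cells[(math.floor(x / cell_w), math.floor(y / cell_h))].append(value)
    return {key: sum(vals) / len(vals) for key, vals in cells.items()}

# Points at x=0.5 and x=1.5 share cell (0, 0); x=2.5 falls in cell (1, 0).
grid = grid_aggregate(
    [(0.5, 0.5, 10.0), (1.5, 0.5, 20.0), (2.5, 0.5, 6.0)],
    cell_w=2.0, cell_h=2.0,
)
```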

Parameter Reference

Basic Parameters

  • Points (Processing: POINTS_INPUT): The points to aggregate in the grid.
  • Grid Cell Width (Processing: GRID_CELL_W_INPUT): The width of the grid cells.
  • Grid Cell Height (Processing: GRID_CELL_H_INPUT): The height of the grid cells.
  • Fields to Use (Processing: FIELDS_TO_USE_INPUT): The fields to aggregate in the grid. Each field will be separately aggregated.
  • Aggregation Function (Processing: AGGREGATION_FUNCTION_INPUT): The function to use to aggregate the values into the grid. Most of the options are likely self-explanatory, but a detailed description can be found in the Advanced Use Guide.
  • Aggregate Grid (Processing: AGGREGATE_GRID_OUTPUT): The output file or layer where the grid will be created.

Advanced Parameters

  • Custom Aggregation Function (Processing: CUSTOM_AGGREGATION_FUNCTION_INPUT): A file containing a custom aggregation function. Only for use by people experienced with Python. An example custom aggregation function script with more notes can be found in this repository, named example_aggregate_function.py.
  • Grid Extent (Processing: GRID_EXTENT_INPUT): Optional: the extent to draw the grid. An extent specified with an alternative CRS will be automatically converted. If left unspecified, grid extent will be automatically calculated from the extent of the Points parameter.

Advanced Use Guide

Aggregation Functions

The built-in aggregation functions are Mean Average, Median Average, Sum, Standard Deviation, and Weighted Average. Most of these are self-explanatory. Weighted Average returns the average of the point values, weighted by the distance between each point and the center of the grid cell.

In addition to the built-in aggregation functions, python-savvy users can pass the plugin a custom aggregation function. This is explored in the next section.

Custom Aggregation Functions

If the user would like to aggregate pin values using a mathematical method not included in the prepackaged aggregation functions, they can specify their own custom aggregation function by passing a python file in the Custom Aggregation Function parameter.

A custom aggregation function python file should contain a class named Aggregator which implements the following functions:

  • __init__(self, context) The processing plugin will pass itself to the constructor so that the aggregation function wrapper object can query attributes of the processing plugin.
  • manual_field_ag(self), a function which returns a constant boolean value: False if the aggregation function returns aggregated versions of the fields in the point layer passed to the plugin, and True if it instead returns values only for the fields named in self.return_vals().
  • return_vals(self), a function which takes no arguments and returns a list of length at least 1, containing 2-length tuples. Each tuple should contain the name of a field that this aggregation function will produce and the datatype, as a QVariant field type (QVariant.Int, QVariant.Double, or QVariant.String).
  • aggregate(self, cell), the function that actually aggregates point values. This function accepts an instance of GridGrabberCell (for details on GridGrabberCell, see below) and, if self.manual_field_ag() is False, should return a list with length len(self.return_vals()) * cell.attr_count(). If self.manual_field_ag() is True, the list should be of the same length as self.return_vals() and ordered in the same way.
Tip: if you run the program and there are no errors but the Attribute Table in QGIS is empty, you probably returned values of the wrong data type.
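A minimal Aggregator skeleton following the contract above might look like the sketch below. Two caveats: strings stand in for the QVariant field types so the example runs outside QGIS (in a real script you would use from qgis.PyQt.QtCore import QVariant), and the cell.values accessor is hypothetical, since GridGrabberCell's full interface is documented with the example script in the repository.

```python
class Aggregator:
    """Skeleton custom aggregation function for Grid Aggregator."""

    def __init__(self, context):
        # The processing plugin passes itself in, so the aggregator can
        # query attributes of the plugin if needed.
        self.context = context

    def manual_field_ag(self):
        # True: aggregate() returns one value per entry in return_vals(),
        # rather than aggregated versions of the point layer's own fields.
        return True

    def return_vals(self):
        # (field name, field type) pairs; in QGIS the type would be a
        # QVariant value such as QVariant.Int.
        return [("point_count", "QVariant.Int")]

    def aggregate(self, cell):
        # 'cell.values' is a hypothetical accessor used for this sketch;
        # see example_aggregate_function.py for the real interface.
        return [len(cell.values)]


# Quick demonstration with stand-in objects:
class FakeCell:
    values = [3.0, 4.5, 2.2]

agg = Aggregator(context=None)
result = agg.aggregate(FakeCell())
```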

Value Grabber

Parameter Reference

Basic Parameters

  • Points Input (Processing: POINTS_INPUT): A vector layer of points. The layer will be duplicated and the duplicate will be returned with added fields for the bands of the raster.
  • Raster File Input (Processing: RASTER_INPUT): The raster file to grab band values from. In order to facilitate the use of large datasets which might crash QGIS, the file does NOT have to be a QGIS layer in the open project.
  • Points with Grabbed Values (Processing: POINTS_WITH_VALUES_OUTPUT): The points layer that will be returned. Can be a new layer or a file.

Advanced Parameters

  • Grab Radius (Processing: GRAB_RADIUS_INPUT): Optional. If specified, the plugin will assign each point the average raster value within a radius.
  • Grab Area Distance Weight (Processing: GRAB_AREA_DISTANCE_WEIGHT_INPUT): Optional. If this is specified, the average taken within Grab Radius will be weighted by 1 / (distance from point^2 * Grab Area Distance Weight). The raster value directly below the point will be assigned a weight of 1.
  • Grab Function (Processing: GRAB_FUNCTION_INPUT): Optional. If specified, a custom python script that will be used to grab values from the raster. Recommended for advanced users only. See Advanced Use below.

Advanced Use

Custom Grab Functions

As in Grid Aggregator, the user can pass the algorithm a python script with a custom function to grab the values to assign to the features. Unlike Grid Aggregator, a custom script for Value Grabber contains only a single function without a class wrapper. The function should have the signature def grab(coords, distances, bands, pixels, center_geo, center_raster, point_feature, context), where:

  • coords is a pair of lists of raster coords around center_raster and within context.get_pixel_radius_around(center_geo). Can be unpacked with xs, ys = coords.
  • distances is a numpy array of the distance of each coord pair in coords from center_raster, in pixel units.
  • bands is the bands to return values for. It takes the form of a boolean array the length of which is the number of bands.
  • pixels is a one-dimensional numpy array of values of pixels at the points specified in coords.
  • center_geo is the point at and/or around which the function will grab, in geographic coordinates, as an x,y tuple.
  • center_raster is the point at and/or around which the function will grab, in the crs units of the raster, as an x,y tuple.
  • point_feature is the QgsFeature instance (point) for the location being grabbed. Useful if you want the grab function to use attributes of the feature.
  • context is the instance of QScoutValueGrabberAlgorithm which is executing this algorithm.
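Putting that signature into a runnable sketch: the function below computes an inverse-distance-weighted average of the pixel values, which is just an example scheme, not the plugin's default behavior. Plain Python sequences are used here so the sketch runs anywhere, though in practice distances and pixels arrive as numpy arrays, and the exact return contract is defined by the plugin rather than shown here.

```python
def grab(coords, distances, bands, pixels, center_geo, center_raster,
         point_feature, context):
    """Example custom grab function: an inverse-distance-weighted average
    of the pixel values, with the pixel directly under the point
    (distance 0) given weight 1."""
    weights = [1.0 / (1.0 + d * d) for d in distances]
    total = sum(w * p for w, p in zip(weights, pixels))
    return total / sum(weights)

# Demonstration with stand-in arguments (two pixels, one band):
demo = grab(
    coords=([0, 1], [0, 0]),
    distances=[0.0, 1.0],
    bands=[True],
    pixels=[10.0, 20.0],
    center_geo=(0.0, 0.0),
    center_raster=(0, 0),
    point_feature=None,
    context=None,
)
```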
